1 Introduction

In this paper, we propose a high order accurate discontinuous Galerkin (DG) method for solving the following problem

$$\begin{aligned} \partial _t u&=\nabla _x\cdot (f(u)\nabla _x (\Phi (x)+H'(u)) ),\quad x\in \Omega ,\; t >0, \end{aligned}$$
(1a)
$$\begin{aligned} u(x, 0)&= u_0(x), \end{aligned}$$
(1b)

subject to appropriate boundary conditions. Here \( u(x, t)\ge 0\) is the unknown, \(\Omega \) is a bounded domain in \(\mathbb {R}^d\), \(H: \mathbb {R}^+ \rightarrow \mathbb {R}\) and \(f: \mathbb {R}^+ \rightarrow \mathbb {R}^+\) are given functions, and \(\Phi (x)\) is a given potential function.

This equation has a gradient flow structure corresponding to the entropy functional

$$\begin{aligned} E=\int _{\Omega } (H(u) +u\Phi (x))dx. \end{aligned}$$

A simple calculation shows that the time derivative of this entropy along the equation (1a) with zero flux boundary condition is

$$\begin{aligned} \frac{d}{dt} E(t) =-\int _{\Omega } f(u)|\nabla _x(\Phi +H'(u))|^2dx \le 0, \end{aligned}$$
(2)

which reveals the entropy dissipation property of the underlying system. Certain entropy dissipation inequalities are recognized to characterize the fine details of the convergence to steady states, see e.g., [7, 9, 11, 24].
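For completeness, the calculation behind (2) can be sketched as follows. This is a formal derivation, assuming that u is smooth and that the flux \(f(u)\nabla _x(\Phi +H'(u))\) has vanishing normal component on \(\partial \Omega \); the outward unit normal n is notation introduced only for this sketch.

$$\begin{aligned} \frac{d}{dt}E(t)&=\int _{\Omega } (H'(u)+\Phi )\partial _t u\,dx =\int _{\Omega } (\Phi +H'(u))\nabla _x\cdot \left( f(u)\nabla _x(\Phi +H'(u))\right) dx \\&=-\int _{\Omega } f(u)|\nabla _x(\Phi +H'(u))|^2dx +\int _{\partial \Omega } (\Phi +H'(u))f(u)\nabla _x(\Phi +H'(u))\cdot n\,dS. \end{aligned}$$

The boundary integral vanishes under the zero flux condition, and the remaining term is nonpositive because \(f\ge 0\).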

Equations such as (1a) appear in a wide range of applications. In the case \(f(u)=u\), the equation becomes

$$\begin{aligned} \partial _t u =\nabla _x\cdot (u\nabla _x (\Phi (x)+H'(u))). \end{aligned}$$
(3)

If \(H'(u)=u^m (m>1)\) and \(\Phi =0\), it is the porous medium equation [11, 24], and for \(H'(u)=\nu u^{m-1}\) and \(\Phi ={x^4}/{4}-{x^2}/{2}\), it is the nonlinear diffusion equation confined by a double-well potential [6]. A particular example with nonlinear f(u) is

$$\begin{aligned} \partial _t u=\nabla _x\cdot (x u(1+ku)+\nabla _x u), \end{aligned}$$
(4)

which is known as a model for fermion (\(k=-1\)) and boson (\(k=1\)) gases [8, 10, 28]. A more general class of the form

$$\begin{aligned} \partial _t u=\nabla _x\cdot \left( x u(1+u^N)+\nabla _x u\right) ,\quad N>2, \end{aligned}$$
(5)

is known to develop finite time concentration beyond some critical mass [1].
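To see how (4) fits the general form (1a), one possible identification is \(f(u)=u(1+ku)\), \(\Phi (x)=|x|^2/2\) and \(H'(u)=\log \left( u/(1+ku)\right) \). This identification is a sketch added for illustration and is assumed to be used only where \(u>0\) and \(1+ku>0\). Since \(H''(u)=1/(u(1+ku))\), it gives

$$\begin{aligned} f(u)\nabla _x(\Phi +H'(u)) =u(1+ku)\left( x+\frac{\nabla _x u}{u(1+ku)}\right) =xu(1+ku)+\nabla _x u. \end{aligned}$$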

In order to capture the rich dynamics of solutions to (1), it is highly desirable to develop high order schemes which can preserve the entropy dissipation law (2) at the discrete level. In this work, we propose such a scheme for (1) using the discontinuous Galerkin discretization.

A related finite volume method was proposed in [5] for (1), and further generalized in [6] to cover nonlocal terms and general dimensions. For (1) with \(f(u)=u\) and an additional nonlocal interaction term, a mixed finite element method was studied in [4], based on the interpretation of such equations as gradient flows in optimal transportation metrics; it follows the so-called JKO formulation, a variational scheme proposed by Jordan et al. [13] for linear Fokker–Planck equations. Regarding the use of relative entropy functionals, we refer to [2] for the study of the large time behavior of a fully implicit semi-discretization applied to linear parabolic Fokker–Planck type equations of the form (1) with \(f(u)=u\), \(H=u \log u\). A free energy satisfying finite difference method was proposed in [18] for the Poisson–Nernst–Planck (PNP) equations, which correspond to (1) with \(f=u\), \(H=u\log u\), further coupled with a Poisson equation governing the potential \(\Phi \). However, these existing schemes are at most second order accurate.

An entropy satisfying DG method has been recently developed in [22] for the linear Fokker–Planck equation

$$\begin{aligned} \partial _t u =\nabla _x\cdot (\nabla _x u + u \nabla _x \Phi ), \end{aligned}$$
(6)

which corresponds to (3) with \(H=u \log u\). The obtained DG method generalizes and improves upon the finite volume method introduced in [21]. The idea in [22] is to apply the DG discretization to the non-logarithmic Landau formulation of (6),

$$\begin{aligned} \partial _t u =\nabla _x\cdot \left( M\nabla _x \left( \frac{u}{M}\right) \right) , \quad M=e^{-\Phi (x)}, \end{aligned}$$

so that the quadratic entropy dissipation law is satisfied. Again based on this formulation, a third order DG scheme was further developed in [23] to numerically preserve the maximum principle: if \(c_1 \le u_0(x)/M \le c_2\), then \(c_1 \le u(x, t)/M \le c_2\) for all \(t>0\). However, the non-logarithmic Landau formulation does not apply directly to the more general class of equations (1a).

In this work, we construct an arbitrary high order entropy satisfying DG scheme for solving (1). The main idea behind the scheme construction is to apply the DG discretization to the following reformulation

$$\begin{aligned} \partial _t u=\partial _x (f(u)\partial _x q), \quad q =\Phi (x)+H'(u), \end{aligned}$$
(7)

by using a special numerical flux for \(\partial _x q\). The resulting scheme is shown to feature several nice properties: (1) the entropy dissipation law (2) is satisfied at the discrete level; (2) the steady states are shown to be preserved; (3) for the third order scheme applied to the model with a trivial potential, a sufficient condition on the range of flux parameters is rigorously established so that cell averages remain positive at each time step, as long as each cell polynomial is positive at three test points. For the numerical positivity a reconstruction algorithm based on positive cell averages is introduced so that the positivity of cell polynomials is enforced, without destroying the accuracy, at least for smooth solutions. This reconstruction also serves as a limiter imposed upon the numerical solution to suppress spurious oscillations at the solution singularity near zero. For the general case the positivity of cell averages can be achieved by carefully tuning the parameters in the numerical flux, as illustrated in the numerical experiments.

The discontinuous Galerkin (DG) method we discuss in this paper is a class of finite element methods, using a completely discontinuous piecewise polynomial space for the numerical solution and the test functions. One main advantage of the DG method is the flexibility afforded by local approximation spaces combined with the suitable design of numerical fluxes across cell interfaces. More general information about DG methods for elliptic, parabolic, and hyperbolic PDEs can be found in the recent books and lecture notes [12, 14, 26, 27]. Following the methodology of the direct discontinuous Galerkin (DDG) method proposed in [19, 20], we adopt a similar numerical flux formula for \(\partial _x q\) in (7). The main feature of the DDG schemes proposed in [19, 20] lies in the numerical flux choices for the solution gradient, which involve higher order derivatives evaluated across cell interfaces.

The plan of the paper is as follows. In Sect. 2, we present our DG scheme in the one dimensional setting. In Sect. 3, we prove several important properties of the scheme, including the semi-discrete entropy dissipation law in Theorem 3.1, the fully-discrete entropy dissipation law in Theorem 3.3, the preservation of positive cell averages for the model with trivial potential in Theorem 3.4, and the preservation of steady states in Theorem 3.5. In Sect. 4, we elaborate on various details of the numerical implementation, including the reconstruction algorithm, the time discretization, and the spatial discretization. Numerical results are presented in Sect. 5, where we verify experimentally the high order spatial accuracy of our scheme and simulate the long-time behavior of numerical solutions. The proposed scheme is applied to several physical models including the porous medium equation, the nonlinear diffusion with a double-well potential, and the general Fokker–Planck equation. The numerical results confirm both the high order of accuracy and the numerical efficiency in capturing the large-time asymptotics. Concluding remarks are given in Sect. 6.

2 DG Discretization in Space

In this section, we present our DG scheme for (1). For clarity of presentation, we restrict ourselves to the problem in one spatial dimension. It is straightforward to generalize this construction to Cartesian meshes in the multidimensional case.

In one-dimensional setting, let \(\Omega = [a, b]\) be a bounded interval. We divide \(\Omega \) with a mesh

$$\begin{aligned} a=x_{1/2}<x_1<\cdots <x_{N-1/2}<x_N<x_{N+1/2}=b, \end{aligned}$$

with mesh sizes \(\Delta x_j=x_{j+1/2}-x_{j-1/2}\), which gives a family of N control cells \(I_j =(x_{j-1/2}, x_{j+1/2})\) with cell centers \(x_j =(x_{j-1/2} +x_{j+1/2})/2\). We denote by \(v^+\) and \(v^-\) the right and left limits of a function v, and define

$$\begin{aligned}{}[v]=v^+-v^-, \quad \{v\}=\frac{v^++v^-}{2}. \end{aligned}$$

We define a discontinuous finite element space of degree \(k\),

$$\begin{aligned} V_h= \left\{ v\in L^2(\Omega ), \quad v|_{I_j}\in P^k(I_j), j\in \mathbb {Z}_N \right\} , \end{aligned}$$

where \(P^k(I_j)\) denotes the set of all polynomials of degree at most k on \(I_j\), and \(\mathbb {Z}_r=\{1, \ldots , r\}\) for any positive integer r.
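The mesh notation and the jump and average operators above can be sketched in a few lines of Python. This is a minimal illustration on a uniform mesh only; the function names `uniform_mesh`, `jump` and `average` are hypothetical helpers introduced here, not part of the scheme's description.

```python
def uniform_mesh(a, b, n):
    """Return the n+1 interfaces x_{j+1/2} and the n cell centers x_j of [a, b]."""
    h = (b - a) / n
    interfaces = [a + j * h for j in range(n + 1)]
    centers = [0.5 * (interfaces[j] + interfaces[j + 1]) for j in range(n)]
    return interfaces, centers


def jump(v_minus, v_plus):
    """[v] = v^+ - v^-, where v^- is the left limit and v^+ the right limit."""
    return v_plus - v_minus


def average(v_minus, v_plus):
    """{v} = (v^+ + v^-) / 2."""
    return 0.5 * (v_plus + v_minus)


# Four cells on [0, 1]: interfaces x_{1/2}, ..., x_{9/2} and centers x_1, ..., x_4.
interfaces, centers = uniform_mesh(0.0, 1.0, 4)
```

Note that with this sign convention the jump is the right trace minus the left trace, which matters for the sign of the penalty term in the numerical flux.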

We rewrite Eq. (1) as follows

$$\begin{aligned}&\partial _t u=\partial _x (f(u)\partial _x q), \end{aligned}$$
(8a)
$$\begin{aligned}&q =\Phi (x)+H'(u). \end{aligned}$$
(8b)

The DG scheme is to find \((u_h, q_h)\in V_h \times V_h\) such that for all \(v, r \in V_h\) and \(j\in \mathbb {Z}_N\),

$$\begin{aligned}&\int _{I_j} \partial _t u_{h} vdx =-\int _{I_j} f(u_h)\partial _xq_{h} \partial _x vdx +\{f(u_h)\}\widehat{\partial _xq_{h}}v|_{\partial I_j} +\{f(u_h)\}\partial _xv (q_h-\{q_h\})|_{\partial I_j}, \end{aligned}$$
(9a)
$$\begin{aligned}&\int _{I_j}q_hrdx =\int _{I_j}(\Phi (x) +H'(u_h))rdx. \end{aligned}$$
(9b)

Here

$$\begin{aligned} v|_{\partial I_j}=v(x_{j+1/2}^-) - v(x_{j-1/2}^+), \end{aligned}$$

and \(\widehat{\partial _x q_{h}}\) is the numerical flux, following [20], taken as

$$\begin{aligned} \widehat{\partial _xq_{h}}=\beta _0 \frac{[q_h]}{h} +\{\partial _xq_{h}\}+\beta _1h[\partial _x^2q_{h}], \end{aligned}$$
(10)

where \(h=\Delta x\) for uniform meshes and \(h=(\Delta x_j +\Delta x_{j+1})/2\) at \(x_{j+1/2}\) for non-uniform meshes. Here \(\beta _i, i=0, 1\) are parameters satisfying a condition of the form

$$\begin{aligned} \beta _0>\Gamma (\beta _1), \end{aligned}$$

where \(\Gamma (\beta _1)\) is chosen to ensure a certain stability property of the scheme; see (14) below.
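The flux formula (10) can be sketched as a small Python function acting on the one-sided traces of \(q_h\) and its first two derivatives at an interface. The function and argument names are hypothetical and chosen only for this illustration; how the traces are obtained from the polynomial basis is left open.

```python
def interface_h(dx_left, dx_right):
    """Local mesh size h = (dx_j + dx_{j+1}) / 2 at the interface x_{j+1/2}."""
    return 0.5 * (dx_left + dx_right)


def ddg_flux(q_minus, q_plus, dq_minus, dq_plus, d2q_minus, d2q_plus,
             h, beta0, beta1):
    """Numerical flux (10): beta0 [q]/h + {q_x} + beta1 h [q_xx]."""
    jump_q = q_plus - q_minus              # [q]
    avg_dq = 0.5 * (dq_plus + dq_minus)    # {q_x}
    jump_d2q = d2q_plus - d2q_minus        # [q_xx]
    return beta0 * jump_q / h + avg_dq + beta1 * h * jump_d2q


# Consistency check: for a smooth function (here q = x^2 at x = 0.5) both
# jumps vanish and the flux reduces to the exact derivative q_x = 1.
flux_smooth = ddg_flux(0.25, 0.25, 1.0, 1.0, 2.0, 2.0,
                       h=0.1, beta0=2.0, beta1=0.1)
```

For smooth data the flux is independent of \(\beta _0\) and \(\beta _1\); the parameters only act on the discontinuities of the numerical solution.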

Note that if zero-flux boundary conditions of the form \(\partial _x (\Phi (x)+H'(u))=0\) are specified, we simply set q-related terms on the domain boundary to be zero. If a Dirichlet boundary condition for u is given at \(\partial \Omega \), we define the boundary numerical flux (10) in the following way:

$$\begin{aligned} \{f(u_h)\}&= \frac{f(u(a, t)) +f(u_h^+)}{2} \text { if }{x=a}; \quad \frac{f(u_h^-)+f(u(b, t))}{2} \text { if }{x=b}, \end{aligned}$$
(11a)
$$\begin{aligned} {[}q_h]&= \left\{ \begin{array}{ll} q_h^+ - \left( \Phi (a) +H'(u(a, t)) \right) &{}\quad \text { for } x=a, \\ \left( \Phi (b) +H'(u(b, t)) \right) - q_h^- &{}\quad \text { for } x=b, \end{array} \right. \end{aligned}$$
(11b)
$$\begin{aligned} \{\partial _xq_{h}\}&=\partial _xq_{h}^+ \text { if }{x=a}; \quad \partial _xq_{h}^- \text { if }{x=b}, \end{aligned}$$
(11c)
$$\begin{aligned} {[}\partial _x^2q_{h}]&=0. \end{aligned}$$
(11d)

Here the boundary conditions are built into the scheme in such a way that the boundary data are used when available, otherwise the value of the numerical solution in corresponding end cells will be used.

3 Properties of the DG Scheme

In this section, we investigate several desirable properties of the semi-discrete DG scheme (9) and of its time discretization.

3.1 Entropy Dissipation

We first state the entropy satisfying property of the DG scheme (9), using the following notation:

$$\begin{aligned} \Vert q_h\Vert _E^2:= \left[ \sum _{j=1}^N\int _{I_j} f(u_h) |\partial _xq_h|^2dx +\sum _{j=1}^{N-1}\left. \{f(u_h)\} \left( \frac{\beta _0}{h}[q_h]^2 \right) \right| _{x_{j+\frac{1}{2}}} \right] . \end{aligned}$$
(12)

Theorem 3.1

Consider the DG scheme (9) and (10), subject to zero-flux boundary condition. If \(f(u_h)\ge 0\), then the semi-discrete entropy

$$\begin{aligned} E(t)=\sum _{j=1}^N\int _{I_j} (\Phi u_h+H(u_h))dx \end{aligned}$$

satisfies

$$\begin{aligned} \frac{d}{dt} E(t)\le -\gamma \Vert q_h\Vert ^2_E \end{aligned}$$
(13)

for \(\gamma =1-\sqrt{\frac{\Gamma }{\beta _0}} \in (0, 1)\), provided

$$\begin{aligned} \beta _0>\Gamma (\beta _1):= \max _{1\le j\le N-1} \frac{\{f(u_h)\} \left( \{\partial _xq_{h}\} +\frac{\beta _1}{2} h\left[ \partial _x^2q_{h}\right] \right) ^2\Big |_{x_{j+1/2}} }{\frac{1}{2h}\left( \int _{I_j}+\int _{I_{j+1}}\right) f(u_h)|\partial _xq_{h}|^2dx}. \end{aligned}$$
(14)

Proof

Summing (9a) and (9b) over all indices j, we obtain a global formulation:

$$\begin{aligned} \int _\Omega \partial _tu_{h}v dx= & {} - \sum _{j=1}^N \int _{I_j} f(u_h) \partial _xq_{h} \partial _xvdx - \sum _{j=1}^{N-1} \{f(u_h)\} \left( \widehat{\partial _xq_{h}} [v]+ \{ \partial _xv\}[q_h]\right) _{j+1/2}, \qquad \end{aligned}$$
(15)
$$\begin{aligned} \int _{\Omega } q_h r dx= & {} \int _\Omega (\Phi +H'(u_h))rdx. \end{aligned}$$
(16)

Taking \(r=\partial _t u_{h}\) in (16), we obtain

$$\begin{aligned} \int _\Omega \partial _tu_{h} q_hdx =\int _\Omega ( \Phi (x)+H'(u_h))\partial _t u_{h}dx=\frac{d}{dt}\int _\Omega (\Phi u_h+H(u_h))dx =\frac{d}{dt}E(t). \end{aligned}$$

Taking \(v=q_h\) in (15) and using the above identity for the left hand side, we obtain

$$\begin{aligned} \frac{d}{dt}E(t)&= -\sum _{j=1}^N\int _{I_j} f(u_h)|\partial _xq_{h}|^2dx -\sum _{j=1}^{N-1} \{f(u_h)\} \left( \widehat{\partial _xq_{h}}[q_h] +\{\partial _xq_{h}\}[q_h]\right) _{j+1/2}\\&= -\sum _{j=1}^N\int _{I_j} f(u_h)|\partial _xq_{h}|^2dx -\sum _{j=1}^{N-1} \{f(u_h)\} \left( \beta _0 [q_h]^2/h +[q_h]( 2\{\partial _xq_{h}\} +\beta _1h[\partial _x^2q_{h}]) \right) _{j+1/2}. \end{aligned}$$

Using Young’s inequality we obtain

$$\begin{aligned} - \left( 2\{\partial _xq_{h}\} + \beta _1h\left[ \partial ^2_xq_{h}\right] \right) [q_h]\le \beta _0(1-\gamma )[q_h]^2/h+ \frac{h}{4\beta _0(1-\gamma )} \left( 2\{\partial _xq_{h}\} +\beta _1h\left[ \partial _x^2q_{h}\right] \right) ^2 \end{aligned}$$

for some \(0<\gamma <1\). Hence

$$\begin{aligned} \frac{d}{dt}E(t)&\le -\gamma \left[ \sum _{j=1}^N\int _{I_j} f(u_h)|\partial _xq_{h}|^2dx +\sum _{j=1}^{N-1}\left( \frac{\{f(u_h)\}\beta _0}{h}[q_h]^2 \right) _{j+1/2} \right] \\ \nonumber&\quad - \left[ (1-\gamma )\sum _{j=1}^N\int _{I_j} f(u_h)|\partial _xq_{h}|^2dx -\sum _{j=1}^{N-1} \frac{h\{f(u_h)\}}{4\beta _0(1-\gamma )} \left( 2\{\partial _xq_{h}\} +\beta _1h\left[ \partial _x^2q_{h}\right] \right) ^2 \right] \\ \nonumber&\le -\gamma \left[ \sum _{j=1}^N\int _{I_j} f(u_h)|\partial _xq_{h}|^2dx +\sum _{j=1}^{N-1}\left( \frac{\{f(u_h)\}\beta _0}{h}[q_h]^2 \right) _{j+1/2} \right] \\ \nonumber&\quad -\frac{1-\gamma }{2}\int _{I_1\cup I_N} f(u_h)|\partial _xq_{h}|^2dx, \end{aligned}$$
(17)

since \(\beta _0\) satisfies (14), and hence

$$\begin{aligned} \beta _0(1-\gamma )^2 = \Gamma \ge \frac{\sum _{j=1}^{N-1} h\{f(u_h)\} \left( \{\partial _xq_{h}\} +\frac{\beta _1}{2}h\left[ \partial _x^2q_{h}\right] \right) _{j+1/2}^2}{ \left( \sum _{j=2}^{N-1}\int _{I_j} + \frac{1}{2}\int _{I_1\cup I_N}\right) f(u_h)|\partial _xq_{h}|^2dx}. \end{aligned}$$

This finishes the proof of (13).\(\square \)

Remark 3.1

We remark that a larger, yet simpler, \(\Gamma (\beta _1)\) can be found for sufficiently small h since the variation of ratio \(\frac{\{f\}}{f}\) is also small. Assume that this ratio is bounded by a factor 2, i.e., \( 2 \ge \frac{f}{\{f\}}\ge \frac{1}{2}\), then

$$\begin{aligned} \Gamma (\beta _1)&\le 2 \max _{1\le j\le N-1} \frac{\left( \{\partial _xq_{h}\} +\frac{\beta _1}{2} h\left[ \partial _x^2q_{h}\right] \right) ^2\Big |_{x_{j+1/2}} }{\frac{1}{2h}\left( \int _{I_j}+\int _{I_{j+1}}\right) |\partial _xq_{h}|^2dx} \\&\le 2 \max _{1\le j\le N-1}\frac{\left( \frac{\partial _xq_h^--\beta _1h\partial ^2_xq_h^-}{2}\right) _{x_{j+1/2}}^2+ \left( \frac{\partial _xq_h^++\beta _1h\partial ^2_xq_h^+}{2}\right) ^2_{x_{j+1/2}}}{\frac{1}{2h}\left( \int _{I_j}|\partial _xq_{h}|^2dx +\int _{I_{j+1}}|\partial _xq_{h}|^2dx\right) } \end{aligned}$$

It is clear that this inequality is implied by

$$\begin{aligned} \Gamma (\beta _1) \le 2 \max _{1\le j\le N-1}\left\{ \frac{\left( \partial _xq_h^--\beta _1h\partial _x^2q_h^-\right) ^2}{\frac{1}{2h}\int _{I_j}|\partial _xq_h|^2},\frac{\left( \partial _xq_h^++\beta _1h\partial _x^2q_h^+\right) ^2}{\frac{1}{2h}\int _{I_{j+1}}|\partial _xq_h|^2}\right\} . \end{aligned}$$
(18)

By setting \(v(\xi )=\partial _xq_h\left( x_j+\frac{h}{2}\xi \right) \) for \(q_h(x)|_{I_j}\), and \(v(\xi )=\partial _xq_h\left( x_{j+1}-\frac{h}{2}\xi \right) \) for \(q_h|_{I_{j+1}}\), we have

$$\begin{aligned} \Gamma (\beta _1) \le 2 \sup _{v\in P^{k-1}} \frac{(v(1) -2\beta _1 \partial _\xi v(1))^2}{ \frac{1}{2}\int _{-1}^1 |v|^2d\xi } =2k^2 \left( 1-\beta _1 (k^2-1)+\frac{\beta _1^2}{3}(k^2-1)^2 \right) , \end{aligned}$$

where we have used the exact formula in [15, Lemma 3.1]. Hence it suffices to choose \(\beta _0\) such that

$$\begin{aligned} \beta _0>2k^2 \left( 1-\beta _1 (k^2-1)+\frac{\beta _1^2}{3}(k^2-1)^2 \right) . \end{aligned}$$
(19)
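The sufficient condition (19) is easy to evaluate for a given polynomial degree. The following sketch simply codes the right hand side of (19); the function name `beta0_threshold` is a hypothetical helper, and the bound is only meaningful under the assumption on \(f/\{f\}\) made in this remark.

```python
def beta0_threshold(k, beta1):
    """Right hand side of (19): 2 k^2 (1 - beta1 (k^2-1) + beta1^2 (k^2-1)^2 / 3)."""
    s = k * k - 1
    return 2 * k * k * (1 - beta1 * s + beta1 ** 2 * s ** 2 / 3)


# Thresholds for piecewise quadratics (k = 2) with two choices of beta1.
threshold_no_beta1 = beta0_threshold(2, 0.0)
threshold_quarter = beta0_threshold(2, 0.25)
```

Any \(\beta _0\) strictly larger than the returned value satisfies (19) for that pair \((k, \beta _1)\).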

Remark 3.2

The positivity of numerical solutions is realized through a reconstruction algorithm at each time step, based on positive cell averages, as detailed in Sect. 4.1. It is shown in Theorem 3.4 that the use of non-zero \(\beta _1\) is crucial in the sense that the positivity of cell averages can be ensured. Indeed, this is proved for the third order DG scheme in solving (1) with zero potential. For the model with non-trivial potential, our numerical experiments again confirm the special role of \(\beta _1\) in the preservation of positivity of numerical cell averages.

3.2 The Fully-Discrete DG Scheme

In order to preserve the entropy dissipation law for \(u_h^n\) at each time step, a time step restriction is needed when using an explicit time discretization. We now discuss this issue for the first order (forward) Euler time discretization of (9): find \(u_h^{n+1}(x)\in V_h\) such that for any \(r(x), v(x) \in V_h\),

$$\begin{aligned}&\int _{I_j}q_h^n r\,dx =\int _{I_j}\left( \Phi (x) +H'\left( u_h^n\right) \right) r\,dx, \end{aligned}$$
(20a)
$$\begin{aligned}&\int _{I_j}D_t u_h^n v \,dx =-\int _{I_j} f\left( u_h^n\right) \partial _x q_{h}^n \partial _x v\,dx +\{f(u_h^n)\}\left. \left[ \widehat{\partial _x q_h^n }v + \partial _x v \left( q_h^n-\{q_h^n\}\right) \right] \right| _{\partial I_j}.\quad \end{aligned}$$
(20b)

Here and in what follows, for any function \(w^n(x)\) we use the notation

$$\begin{aligned} D_t w^n =\frac{w^{n+1}-w^n}{\Delta t}, \end{aligned}$$

and \(\mu = \frac{\Delta t}{h^2}\) as the mesh ratio.

Lemma 3.2

The following inverse inequalities hold for any \(v\in V_h\):

$$\begin{aligned}&\sum _{j=1}^N \int _{I_j} v_x^2 dx \le \frac{k(k+1)^2(k+2)}{h^2} \sum _{j=1}^N \int _{I_j}v^2 dx, \end{aligned}$$
(21a)
$$\begin{aligned}&\sum _{j=1}^{N-1} [v]^2_{j+1/2} \le \frac{4(k+1)^2}{h} \sum _{j=1}^N \int _{I_j}v^2 dx, \end{aligned}$$
(21b)
$$\begin{aligned}&\sum _{j=1}^{N-1} \{v_x\}^2_{j+1/2} \le \frac{k^3(k+1)^2(k+2)}{h^3} \sum _{j=1}^N \int _{I_j}v^2 dx. \end{aligned}$$
(21c)

Proof

These follow from the repeated use of the two inverse inequalities:

$$\begin{aligned}&\max \{|w(a)|, |w(b)|\} \le (m+1)|I|^{-1/2}\Vert w\Vert _{L^2(I)}, \end{aligned}$$
(22a)
$$\begin{aligned}&\Vert \partial _x w\Vert _{L^2(I)}\le (m+1) \sqrt{m(m+2)}|I|^{-1}\Vert w\Vert _{L^2(I)}, \end{aligned}$$
(22b)

provided \(w \in P^m(I)\) with \(I=(a, b)\) and \(|I|=b-a\). The first bound is well known, see e.g. [29]. The second inequality may be found in [17, Lemma 3.1]. \(\square \)
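The two inverse inequalities (22a) and (22b) can be spot-checked for individual polynomials with exact rational arithmetic. The sketch below works on the reference interval \(I=(-1, 1)\), so \(|I|=2\), and compares the squared quantities; all function names are hypothetical helpers, and a check on a few samples is of course not a proof.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_diff(p):
    """Differentiate a polynomial; a constant maps to the zero polynomial."""
    return [n * c for n, c in enumerate(p)][1:] or [Fraction(0)]


def poly_eval(p, x):
    return sum(c * x ** n for n, c in enumerate(p))


def l2_norm_sq(p):
    """Exact integral of p(x)^2 over (-1, 1); odd powers integrate to zero."""
    sq = poly_mul(p, p)
    return sum(Fraction(2) * c / (n + 1) for n, c in enumerate(sq) if n % 2 == 0)


def check_inverse_inequalities(p):
    """Return (holds_22a, holds_22b) for p in P^m on I = (-1, 1)."""
    m, length = len(p) - 1, 2
    norm_sq = l2_norm_sq(p)
    trace_sq = max(poly_eval(p, -1) ** 2, poly_eval(p, 1) ** 2)
    deriv_sq = l2_norm_sq(poly_diff(p))
    holds_a = trace_sq <= Fraction((m + 1) ** 2, length) * norm_sq
    holds_b = deriv_sq <= Fraction((m + 1) ** 2 * m * (m + 2), length ** 2) * norm_sq
    return holds_a, holds_b


# Sample polynomials: x, x^2 - 1/3, and 1 + x + x^2 + x^3.
results = [check_inverse_inequalities(p)
           for p in ([0, 1], [Fraction(-1, 3), 0, 1], [1, 1, 1, 1])]
```

For \(w(x)=x\) the bound (22b) holds with equality on this interval, which is a useful sanity check of the constant.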

Theorem 3.3

Let the fully discrete entropy be defined as

$$\begin{aligned} E^n =\sum _{j=1}^N\int _{I_j} \left( \Phi (x) u_h^n(x)+H(u_h^n(x))\right) dx. \end{aligned}$$

The DG scheme (20), subject to zero-flux boundary condition, satisfies

$$\begin{aligned} D_t E^n\le -\frac{\gamma }{2} \Vert q_h^n\Vert _E^2 \end{aligned}$$
(23)

for some \(\gamma \in (0, 1)\), provided \(u_h^n(x)\) remains positive, \(\beta _0>\Gamma (\beta _1)\), and

$$\begin{aligned} \mu \le \frac{\gamma }{C(k, \beta _0, \beta _1)\Vert \max \{0, H''(u_h^n(\cdot ))\}\Vert _\infty \Vert f(u_h^n(\cdot ))\Vert _\infty }, \end{aligned}$$
(24)

where \(C(k, \beta _0, \beta _1)\) is given in (29) below.

Proof

Summing (20) over all indices j, we obtain

$$\begin{aligned} \sum _{j=1}^N\int _{I_j} q_h^n r \,dx&= \sum _{j=1}^N\int _{I_j}\left( \Phi (x) +H'\left( u_h^n\right) \right) r\,dx, \end{aligned}$$
(25)
$$\begin{aligned} \sum _{j=1}^N\int _{I_j} D_t u_h^n v\, dx&= - \sum _{j=1}^N \int _{I_j} f\left( u_h^n\right) \partial _xq_h^n \partial _x v\,dx - \sum _{j=1}^{N-1} \{f\left( u_h^n\right) \} \left. \left( \widehat{\partial _xq^n_h} [v]+ \{ \partial _x v\}\left[ q^n_h\right] \right) \right| _{x_{j+\frac{1}{2}}}. \end{aligned}$$
(26)

Taking \(r =D_t u_h^n \) in (25), we obtain

$$\begin{aligned} \int _\Omega D_t u_h^n q_h^n dx&=\int _\Omega \left( \Phi (x)+H'\left( u_h^n(x)\right) \right) D_t u_h^n\,dx \\&=D_t E^n -\frac{1}{\Delta t} \int _{\Omega } ( H\left( u^{n+1}_h\right) - H\left( u^n_h\right) - H'\left( u_h^n\right) \left( u^{n+1}_h -u_h^n\right) )dx \\&= D_t E^n -\frac{\Delta t}{2} \int _{\Omega }H''(\cdot )\left( D_t u_h^n\right) ^2dx. \end{aligned}$$

Here \((\cdot )\) denotes the intermediate value between \(u_h^n\) and \(u_h^{n+1}\). Taking \(v =q_h^n\), (26) becomes

$$\begin{aligned} \int _\Omega D_t u_h^n q_h^n dx&= -\sum _{j=1}^N\int _{I_j} f\left( u_h^n\right) |\partial _xq_h^n |^2\,dx -\sum _{j=1}^{N-1} \{f\left( u^n_h\right) \} \left[ q_h^n\right] \left. \left( \widehat{\partial _xq_h^n}+\{\partial _xq_h^n\}\right) \right| _{x_{j+\frac{1}{2}}}\\&\le -\gamma \Vert q_h^n\Vert _E^2, \end{aligned}$$

for \(\beta _0\) satisfying (14) at each interface \(x_{j+\frac{1}{2}}\), \(j=1,\ldots , N-1.\) Hence

$$\begin{aligned} D_t E^n\le -\gamma \Vert q_h^n\Vert _E^2 + \frac{\Delta t}{2} \int _{\Omega } H''(\cdot ) \left( D_t u_h^n\right) ^2\,dx. \end{aligned}$$

The claimed estimate follows if

$$\begin{aligned} \Delta t \le \frac{\gamma \Vert q_h^n\Vert _E^2}{\int _{\Omega } \max \{0, H''(\cdot )\} \left( D_t u_h^n\right) ^2\,dx}. \end{aligned}$$
(27)

For convex H, this indeed imposes a time step restriction.

It remains to show that the bound in (24) is smaller than the right side of (27). In (26), we take \(v=D_tu^n_h\) and use the Young inequality \(ab \le \frac{1}{4\epsilon }a^2+\epsilon b^2\) to obtain

$$\begin{aligned} \sum _{j=1}^N\int _{I_j} v^2\, dx&= - \sum _{j=1}^N \int _{I_j} f\left( u_h^n\right) \partial _xq_h^n \partial _x v\,dx - \sum _{j=1}^{N-1} \{f\left( u_h^n\right) \} \left. \left( \widehat{\partial _xq^n_h} [v]+ \{ \partial _x v\}\left[ q^n_h\right] \right) \right| _{x_{j+\frac{1}{2}}}\\&\le \frac{1}{4\epsilon _1h^2} \sum _{j=1}^N \int _{I_j} f^2\left( u_h^n\right) |\partial _xq_h^n|^2 \,dx +\epsilon _1 h^2 \sum _{j=1}^N \int _{I_j} |\partial _x v| ^2\,dx\\&\quad + \frac{1}{4\epsilon _2h} \sum _{j=1}^{N-1} \left. \{f\left( u_h^n\right) \}^2 |\widehat{\partial _xq^n_h}|^2 \right| _{x_{j+\frac{1}{2}}} +\epsilon _2h \sum _{j=1}^{N-1}\left. { [v]^2}\right| _{x_{j+\frac{1}{2}}} \\&\quad + \frac{1}{4\epsilon _3h^3} \sum _{j=1}^{N-1} \left. \{f\left( u_h^n\right) \}^2 [q^n_h]^2\right| _{x_{j+\frac{1}{2}}} +\epsilon _3h^3 \sum _{j=1}^{N-1} \left. \{\partial _xv\}^2\right| _{x_{j+\frac{1}{2}}} . \end{aligned}$$

The use of the inequalities (21) in Lemma 3.2 leads to

$$\begin{aligned}&\epsilon _1 h^2 \sum _{j=1}^N \int _{I_j} |\partial _x v| ^2\,dx + \epsilon _2 h \sum _{j=1}^{N-1} \left. { [v]^2}\right| _{x_{j+\frac{1}{2}}} + \epsilon _3h^3 \sum _{j=1}^{N-1}\left. \{\partial _xv\}^2 \right| _{x_{j+\frac{1}{2}}} \\&\quad \le (k+1)^2 ( k(k+2) \epsilon _1 + 4\epsilon _2+ k^3(k+2) \epsilon _3) \sum _{j=1}^N\int _{I_j} v^2\, dx \\&\quad = \frac{3}{4} \sum _{j=1}^N\int _{I_j} v^2\, dx, \end{aligned}$$

provided

$$\begin{aligned} (4\epsilon _1)^{-1} =k(k+1)^2(k+2), \; (4\epsilon _2)^{-1} =4 (k+1)^2\,\,\,\, \text {and}\,\,\,\, (4\epsilon _3)^{-1} =k^3(k+1)^2(k+2). \end{aligned}$$

This gives

$$\begin{aligned}&\frac{1}{4}\sum _{j=1}^N\int _{I_j} v^2\, dx \le \frac{k(k+1)^2(k+2)}{h^2}\sum _{j=1}^N \int _{I_j} f^2\left( u_h^n\right) |\partial _xq_h^n|^2 \,dx \\ \nonumber&\qquad +\frac{k^3(k+1)^2(k+2)}{h^3} \sum _{j=1}^{N-1} \left. \{f\left( u_h^n\right) \}^2 \left[ q^n_h\right] ^2\right| _{x_{j+\frac{1}{2}}}+\frac{4(k+1)^2}{h}\sum _{j=1}^{N-1} \left. \{f\left( u_h^n\right) \}^2 |\widehat{\partial _xq^n_h}|^2 \right| _{x_{j+\frac{1}{2}}}. \end{aligned}$$
(28)

It is clear that the first two terms are bounded by \(\Vert f(u_h^n(\cdot ))\Vert _\infty \Vert q_h^n\Vert _E^2\). We now show that the last term is also bounded by \(\Vert f(u_h^n(\cdot ))\Vert _\infty \Vert q_h^n\Vert _E^2\), up to constant multiplication factors.

$$\begin{aligned} \sum _{j=1}^{N-1} \left. \{f\left( u_h^n\right) \} |\widehat{\partial _xq^n_h}|^2\right| _{x_{j+\frac{1}{2}}}&= \sum _{j=1}^{N-1} \left. \{f\left( u_h^n\right) \} \left| \{\partial _xq_h^n\}+\beta _0\frac{\left[ q^n_h\right] }{h}+\beta _1h\left[ \partial _x^2q^n_h\right] \right| ^2 \right| _{x_{j+\frac{1}{2}}}\\&\le 2 \sum _{j=1}^{N-1} \left. \{f\left( u_h^n\right) \} \left( \beta _0^2\frac{\left[ q^n_h\right] ^2}{h^2}+\left( \{\partial _xq^n_h\}+\beta _1h\left[ \partial _x^2q^n_h\right] \right) ^2 \right) \right| _{x_{j+\frac{1}{2}}}. \end{aligned}$$

From (14) it follows that

$$\begin{aligned} \left. \{f\left( u_h^n\right) \}\left( \{\partial _xq^n_h\}+\beta _1h\left[ \partial _x^2q^n_h\right] \right) ^2\right| _{x_{j+\frac{1}{2}}} \le \frac{ \Gamma (2\beta _1) }{2h} \left( \int _{I_j}+\int _{I_{j+1}}\right) f(u_h^n)|\partial _xq_{h}^n|^2dx. \end{aligned}$$

Hence

$$\begin{aligned} \sum _{j=1}^{N-1} \left. \{f\left( u_h^n\right) \}\left( \{\partial _xq^n_h\}+\beta _1h\left[ \partial _x^2q^n_h\right] \right) ^2\right| _{x_{j+\frac{1}{2}}} \le \frac{ \Gamma (2\beta _1)}{h} \sum _{j=1}^N\int _{I_j}f(u_h^n)|\partial _xq_{h}^n|^2dx. \end{aligned}$$

These together yield

$$\begin{aligned} \sum _{j=1}^{N-1} \left. \{f\left( u_h^n\right) \} |\widehat{\partial _xq^n_h}|^2\right| _{x_{j+\frac{1}{2}}} \le \frac{2}{h} \max \{\beta _0, \Gamma (2\beta _1)\} \Vert q^n_h\Vert _E^2. \end{aligned}$$

Upon insertion into (28) we obtain

$$\begin{aligned} \sum _{j=1}^N\int _{I_j} v^2\, dx \le \frac{C(k, \beta _0, \beta _1) ||f(u_h^n(\cdot ))||_\infty }{h^2}\Vert q^n_h\Vert _E^2, \end{aligned}$$

where

$$\begin{aligned} C(k, \beta _0, \beta _1):= 4(k+1)^2 \left( k(k+2)\max \{1, k^2/\beta _0\}+ 8\max \{\beta _0, \Gamma (2\beta _1)\} \right) . \end{aligned}$$
(29)

Hence (27) is implied by (24).

This ends the proof.\(\square \)
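The time step restriction (24) with the constant (29) can be turned into a simple step-size estimate. The sketch below is one possible reading, not a prescribed implementation: it assumes that \(\Gamma (2\beta _1)\) is replaced by the closed-form bound of Remark 3.1 evaluated at \(2\beta _1\) (which is only justified under the assumptions of that remark and makes the step more conservative), and that \(\gamma \) and the two sup norms are supplied by the caller. All function and argument names are hypothetical.

```python
import math


def gamma_closed_form(k, beta1):
    """Closed-form upper bound for Gamma(beta1) from Remark 3.1 (an assumption here)."""
    s = k * k - 1
    return 2 * k * k * (1 - beta1 * s + beta1 ** 2 * s ** 2 / 3)


def stability_constant(k, beta0, gamma_2beta1):
    """C(k, beta0, beta1) of (29), with Gamma(2 beta1) passed in explicitly."""
    return 4 * (k + 1) ** 2 * (k * (k + 2) * max(1.0, k * k / beta0)
                               + 8 * max(beta0, gamma_2beta1))


def max_time_step(h, k, beta0, beta1, gamma, sup_hpp, sup_f):
    """Largest dt allowed by (24), using dt = mu * h^2.

    sup_hpp stands for || max{0, H''(u_h^n)} ||_inf and sup_f for || f(u_h^n) ||_inf.
    """
    if sup_hpp == 0 or sup_f == 0:
        return math.inf  # (24) imposes no restriction in this degenerate case
    c = stability_constant(k, beta0, gamma_closed_form(k, 2 * beta1))
    return gamma * h * h / (c * sup_hpp * sup_f)


# Example: piecewise linears (k = 1), beta0 = 4, beta1 = 0, on a mesh with h = 0.1.
dt_max = max_time_step(h=0.1, k=1, beta0=4.0, beta1=0.0,
                       gamma=0.25, sup_hpp=1.0, sup_f=1.0)
```

The resulting step scales like \(h^2\), as expected for an explicit discretization of a diffusion-type equation.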

3.3 Preservation of Positive Cell Averages

It is known to be difficult, if not impossible, to preserve point-wise solution bounds for high order numerical approximations. A popular strategy after the work [30] is to combine an accuracy preserving reconstruction with the bound preserving property of cell averages. For the DG scheme applied to (1) with \(\Phi =0\), following [23], we are able to identify a range of \(\beta _1\) so that positive cell averages are ensured for at least the third order scheme. We have not been able to prove this property for the general case.

By taking the test function \(v = 1\) on \(I_j\) in (20b), we obtain the evolutionary equation for the cell average,

$$\begin{aligned} \bar{u}^{n+1}_j= \bar{u}^n_j + \mu h\left. \{f\left( u_h^n\right) \} \widehat{\partial _x q_h^n}\right| _{\partial I_j}. \end{aligned}$$
(30)

For the case that H is convex and \(\Phi (x)=0\), we reformulate (8) as

$$\begin{aligned} \partial _t u=\partial _x (fH''\partial _x q), \quad q=u. \end{aligned}$$

At the discrete level, we simply set \(q_h=u_h\) and replace f by \(fH''\) in (20b). Assuming that \( \bar{u}^n_j \in [c_1, c_2]\) for all j, we can derive sufficient conditions such that \( \bar{u}^{n+1}_j \in [c_1, c_2] \) under a certain CFL condition on \(\mu \).

For piecewise quadratic polynomials, we have the following result.

Theorem 3.4

(\(k=2\)) The scheme (30) with \(q_h=u_h\), and

$$\begin{aligned} \frac{1}{8}< \beta _1 < \frac{1}{4} \quad \text {and} \quad \beta _0 \ge 1 \end{aligned}$$
(31)

is bound preserving, namely, \(\bar{u}_{j}^{n+1}\in [c_1, c_2]\) if \(u_h^n(x) \in [c_1, c_2]\) on the sets \(S_j\), where

$$\begin{aligned} S_j = x_j +\frac{h}{2}\left\{ -1, 0, 1\right\} , \end{aligned}$$

under the CFL condition

$$\begin{aligned} \mu \le \mu _0= \frac{1}{12\max _{1\le j\le N} |f(u_{j-1/2}^n)|} \min \left\{ \frac{1}{\beta _0 +8\beta _1-2 }, \frac{1}{ 1-4\beta _1} \right\} . \end{aligned}$$
(32)

Proof

Let

$$\begin{aligned} p(\xi )=u_h\left( x_j+\frac{h}{2}\xi \right) \text { for } \xi \in [-1,1], \quad \text {i.e.}, \quad p =u_h|_{I_j}, \end{aligned}$$

we have

$$\begin{aligned} \bar{u}_j = \frac{1}{6} p(-1)+ \frac{2}{3}p(0)+ \frac{1}{6} p(1). \end{aligned}$$
(33)

In what follows we denote \(p_-=u_h|_{I_{j-1}}\) and \(p_+=u_h|_{I_{j+1}}\).

We represent the diffusion flux in terms of solution values over the set \(S_j\); see [23].

$$\begin{aligned} h \left. \widehat{\partial _xu_h}\right| _{x_{j+\frac{1}{2}}} = \alpha _3 p_{+}(-1)+\alpha _2 p_{+}(0)+\alpha _1p_{+}(1)- \left( \alpha _1 p(-1)+\alpha _2 p(0)+\alpha _3p(1)\right) , \end{aligned}$$
(34)

where

$$\begin{aligned} \alpha _1= \frac{8\beta _1-1}{2}, \quad \alpha _2 =2(1-4\beta _1), \quad \alpha _3=\beta _0 +\frac{8\beta _1-3}{2}. \end{aligned}$$
(35)

It is easy to verify that (31) ensures \(\alpha _i \ge 0\) for \(i=1, 2, 3\).

Upon substitution into (30) we obtain

$$\begin{aligned} \bar{u}_j^{n+1}&=\, \bar{u}_j + 2 \mu \left( \left. h \{f(u_h)\} \widehat{\partial _xu_h}\right| _{x_{j+\frac{1}{2}}} - \left. h \{f(u_h)\} \widehat{\partial _x u_h}\right| _{x_{j-\frac{1}{2}}}\right) \nonumber \\&=\,\left[ \frac{1}{6} - 2\mu \left( \alpha _3 f_{j-\frac{1}{2}} +\alpha _1 f_{j+\frac{1}{2}} \right) \right] p(-1) \nonumber \\&\quad +\left[ \frac{2}{3} - 2\mu \left( \alpha _2 f_{j-\frac{1}{2}} +\alpha _2 f_{j+\frac{1}{2}}\right) \right] p(0)\nonumber \\&\quad +\left[ \frac{1}{6} - 2\mu \left( \alpha _1 f_{j-\frac{1}{2}} +\alpha _3 f_{j+\frac{1}{2}} \right) \right] p(1) \nonumber \\&\quad +2\mu f_{j+\frac{1}{2}}\left[ \alpha _3p_{+}(-1)+\alpha _2p_{+}(0)+\alpha _1p_{+}(1) \right] \nonumber \\&\quad +2\mu f_{j-\frac{1}{2}}\left[ \alpha _1 p_-(-1)+\alpha _2p_-(0)+\alpha _3p_-(1) \right] . \end{aligned}$$
(36)

Here we have used the notation

$$\begin{aligned} f_{j+\frac{1}{2}}:= \left. \{f(u_h)\}\right| _{x_{j+\frac{1}{2}}} = \left. \frac{f(u_h^-)+f(u_h^+)}{2}\right| _{x_{j+\frac{1}{2}}}. \end{aligned}$$

Note that the sum of all coefficients of above polynomial values is one. Hence \(\bar{u}^{n+1}_j\in [c_1, c_2]\) as long as \(u_h^n\in [c_1, c_2]\) on \(S_j\) and all coefficients are nonnegative. The nonnegativity imposes a CFL condition \(\mu \le \mu _0\) with \(\mu _0\) being

$$\begin{aligned} \frac{1}{12} \min _{1\le j\le N}\left\{ \frac{1}{\alpha _3 f_{j-\frac{1}{2}} + \alpha _1 f_{j+\frac{1}{2}}}, \frac{4}{\alpha _2 f_{j-\frac{1}{2}} + \alpha _2 f_{j+\frac{1}{2}}}, \frac{1}{\alpha _1 f_{j-\frac{1}{2}} + \alpha _3 f_{j+\frac{1}{2}}} \right\} . \end{aligned}$$

Here we assume that \(f_{N+1/2}=0\) so that \(j=N\) can be included in the above expression. It suffices to take the smaller value

$$\begin{aligned} \mu _0= \frac{1}{12 \max _{1\le j\le N}|f(u^n_{j-1/2})|} \min \left\{ \frac{1}{\alpha _3 + \alpha _1}, \frac{2}{\alpha _2} \right\} . \end{aligned}$$

That is (32), as claimed.\(\square \)
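The representation (34) with the weights (35), and the CFL bound (32), can be checked numerically. The following sketch stores each quadratic by its values at \(\xi =-1, 0, 1\), evaluates \(h\,\widehat{\partial _xu_h}\) once directly from (10) and once through the weights, and codes the bound (32). The function names are hypothetical, and `f_max` stands for \(\max _j |f(u^n_{j-1/2})|\).

```python
def flux_weights(beta0, beta1):
    """Weights (alpha_1, alpha_2, alpha_3) from (35)."""
    a1 = (8 * beta1 - 1) / 2
    a2 = 2 * (1 - 4 * beta1)
    a3 = beta0 + (8 * beta1 - 3) / 2
    return a1, a2, a3


def scaled_flux_from_weights(p, p_plus, beta0, beta1):
    """Right hand side of (34); p and p_plus hold values at xi = -1, 0, 1."""
    a1, a2, a3 = flux_weights(beta0, beta1)
    return (a3 * p_plus[0] + a2 * p_plus[1] + a1 * p_plus[2]
            - (a1 * p[0] + a2 * p[1] + a3 * p[2]))


def scaled_flux_direct(p, p_plus, beta0, beta1):
    """h times the flux (10) at x_{j+1/2}, computed from traces of the quadratics."""
    def d1(vals, xi):
        # Derivative in xi of the quadratic interpolant at xi = -1 or xi = 1.
        return 0.5 * (vals[2] - vals[0]) + xi * (vals[2] - 2 * vals[1] + vals[0])

    def d2(vals):
        # Second derivative in xi of the quadratic interpolant.
        return vals[2] - 2 * vals[1] + vals[0]

    jump_u = p_plus[0] - p[2]                   # [u]
    h_avg_ux = d1(p, 1) + d1(p_plus, -1)        # h {u_x}, since u_x = (2/h) d/dxi
    h2_jump_uxx = 4 * (d2(p_plus) - d2(p))      # h^2 [u_xx]
    return beta0 * jump_u + h_avg_ux + beta1 * h2_jump_uxx


def cfl_bound(beta0, beta1, f_max):
    """mu_0 from (32); requires 1/8 < beta1 < 1/4 and beta0 >= 1 as in (31)."""
    return min(1 / (beta0 + 8 * beta1 - 2), 1 / (1 - 4 * beta1)) / (12 * f_max)


p, p_plus = [1.0, 2.0, 4.0], [3.5, 3.0, 5.0]
lhs = scaled_flux_direct(p, p_plus, beta0=1.5, beta1=0.2)
rhs = scaled_flux_from_weights(p, p_plus, beta0=1.5, beta1=0.2)
```

In this sketch the three weights always sum to \(\beta _0\), and for \(\beta _0=1\), \(\beta _1=3/16\) they reduce to \((1/4, 1/2, 1/4)\).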

Remark 3.3

The CFL condition (32) is sufficient rather than necessary for preserving the solution bounds. Therefore, in practice, it is strictly enforced only when the bound preserving property is violated.

Remark 3.4

For the general case, we expect that there is still a proper set of parameters \((\beta _0,\beta _1)\) with which the scheme can preserve positivity of cell averages. Our numerical simulations in Example 2 confirm this expectation.

3.4 Preservation of Steady States

If we start with initial data \(u_h^0\) already at a steady state, i.e., \(\Phi (x) +H'(u_h^0(x))=C\), it follows from (20a) that \(q_h^0=C\). Furthermore, (20b) implies that \(u_h^1=u_h^0\in V_h\). By induction we have

$$\begin{aligned} \Phi (x) +H'(u_h^n(x))=C\quad \forall n\in \mathbb {N}. \end{aligned}$$

This says that the DG scheme (20a) preserves the steady states. Moreover, we can show that in some cases the numerical solution tends asymptotically toward a steady state, independent of initial data. More precisely, we have the following result.

Theorem 3.5

Let the assumptions in Theorem 3.3 be met, and \((u_h^n, q_h^n) \) be the numerical solution to the fully discrete DG scheme (20), then the limits of \((u_h^n, q_h^n)\) as \(n\rightarrow \infty \) satisfy

$$\begin{aligned} q^*_h = C, \quad \Phi (x) +H'(u^*_h) \in C+ V_h^\bot , \end{aligned}$$

where C is a constant. For quadratic H(u), C can be determined explicitly by

$$\begin{aligned} C=\frac{1}{|\Omega |}{\int _{\Omega } (\Phi (x) + H'(u_0)(x))dx}. \end{aligned}$$

In addition, if \(\Phi (x)\in P^m (m \le k)\), then we must have \(\Phi (x) +H'(u^*_h(x)) \equiv C\).

Proof

Since \(E^n\) is non-increasing and bounded from below, we have

$$\begin{aligned} \lim _{n \rightarrow \infty } E^n=\inf \{E^n\}. \end{aligned}$$

Observe from (23) that

$$\begin{aligned} E^{n+1} -E^{n} \le -\frac{\gamma \Delta t}{2} \Vert q_h^n\Vert _E^2 \le 0. \end{aligned}$$

Since \(E^n\) converges, \(E^{n+1}-E^n\rightarrow 0\); passing to the limit \(n\rightarrow \infty \) we therefore have \(\lim _{n\rightarrow \infty } \Vert q_h^n\Vert _E^2=0\). This implies that each term in this energy norm must have zero as its limit, that is

$$\begin{aligned} \lim _{n\rightarrow \infty } \sum _{j=1}^N \int _{I_j} f(u_h^n) |\partial _x q_h^n|^2dx=0, \quad \lim _{n\rightarrow \infty } \sum _{j=1}^{N-1} \frac{\beta _0}{h}\{f(u_h^n)\}[q_h^n]^2 \Big |_{j+\frac{1}{2}}=0. \end{aligned}$$
(37)

The first relation in (37) tells us that the limit of \(q_h^n\), denoted by \(q_h^*\), must be constant in each computational cell. The second relation in (37) implies that \(q_h^*\) must be a constant in the whole domain. These, when inserted into (20a), give the desired result. For quadratic H(u), we use the mass conservation \(\int _{\Omega } H'(u^*_h(x))dx=\int _{\Omega } H'(u_0(x))dx\) to determine the constant C. The proof is complete.\(\square \)

Remark 3.5

The above result shows that for quadratic H(u) and potential \(\Phi (x)\) being polynomials of degree up to k, the steady states are approached by numerical solutions. For other cases, such asymptotic convergence holds only in the projection sense.

4 Numerical Implementation

In this section, we provide further details on implementing the entropy satisfying discontinuous Galerkin (ESDG) method.

4.1 Reconstruction

For a high order polynomial approximation, numerical solutions can have negative values. We enforce the solution positivity through some accuracy-preserving reconstruction. Motivated by the definite result on the bound preserving property of cell averages for special cases in Theorem 3.4, we consider the case with positive cell averages.

Let \(w_h \in P^k(I_j)\) be an approximation to a smooth function \(w(x) \ge 0\), with cell averages \(\bar{w}_j>\delta \) for \(\delta \) being some small positive parameter or zero. We then reconstruct another polynomial in \(P^k(I_j)\) so that

$$\begin{aligned} \tilde{w}_h^{\delta }(x)= \bar{w}_j+\frac{\bar{w}_j-\delta }{\bar{w}_j-\min _{I_j} w_h(x)} (w_h(x)-\bar{w}_j),\quad \text { if } \min _{I_j} w_h(x)<\delta . \end{aligned}$$
(38)

This reconstruction maintains the same cell averages and satisfies

$$\begin{aligned} \min _{I_j} w^\delta (x)\ge \delta . \end{aligned}$$
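The reconstruction (38) can be sketched in a few lines of Python for a single cell. This is a minimal sketch under stated assumptions: the polynomial is stored by its Legendre coefficients on the reference element \([-1,1]\) (so the cell average is the zeroth coefficient), and the minimum over \(I_j\) is approximated by the minimum over a set of sample points, which is an implementation choice not specified in the text. The function name `limit_cell` and the parameter `n_sample` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def limit_cell(coef, delta=0.0, n_sample=201):
    """Apply the scaling (38) to one cell given Legendre coefficients."""
    coef = np.asarray(coef, dtype=float)
    w_bar = coef[0]                       # cell average (L_i has zero mean for i >= 1)
    if w_bar <= delta:
        raise ValueError("reconstruction requires cell average > delta")
    xi = np.linspace(-1.0, 1.0, n_sample)
    w_min = leg.legval(xi, coef).min()    # sampled approximation of min over the cell
    if w_min >= delta:
        return coef.copy()                # nothing to do
    theta = (w_bar - delta) / (w_bar - w_min)
    out = coef.copy()
    out[1:] *= theta                      # scale the deviation from the average
    return out

# Example: w(xi) = 0.3*xi + 0.3*xi**2 has average 0.1 and minimum -0.075.
limited = limit_cell([0.1, 0.3, 0.2], delta=0.0)
```

Because only the coefficients of \(L_1,\ldots ,L_k\) are scaled, the cell average is unchanged by construction.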

It is known that enforcing a maximum principle numerically might damp oscillations in numerical solutions, see, e.g., [16, 30]. The numerical example in Fig. 1 confirms such a damping effect near zero from using the positivity preserving limiter (38).

Fig. 1 Capturing singularity in the exact solution at \(t=0.5\)

Lemma 4.1

If \(\bar{w}_j>\delta \), then the reconstruction satisfies the estimate

$$\begin{aligned} |w^{\delta }(x)-w_h(x)|\le C(k) \left( ||w_h(x)-w(x)||_\infty + \delta \right) ,\quad \forall x\in I_j, \end{aligned}$$

where C(k) is a constant depending on k. This says that the reconstructed \(w^{\delta }(x,t)\) in (38) does not destroy the accuracy when \(\delta <h^{k+1}\).

Proof

We have

$$\begin{aligned} |w^\delta (x)-w_h(x)|&= \left| \frac{\delta - \min _{I_j} w_h(x)}{\bar{w}_j-\min _{I_j} w_h(x)} (\bar{w}_j-w_h(x)) \right| \\&\le \frac{\max _{I_j} |\bar{w}_j -w_h(x)|}{\max _{I_j} (\bar{w}_j-w_h(x))} \left( ||w_h(x)-w(x)||_\infty + \delta \right) . \end{aligned}$$

It follows from [23, 30] that

$$\begin{aligned} \frac{\max _{I_j} |\bar{w}_j-w_h(x)|}{\max _{I_j} (\bar{w}_j-w_h(x))} \le C(k), \end{aligned}$$

where k is the degree of the polynomial \(w_h(x)\).\(\square \)

4.2 Time Discretization

For the time discretization of (9), we use an explicit high order Runge–Kutta method. The explicit time discretization is simple to implement, with the entropy dissipation law still preserved under some restriction on the time step.

Let \(\{t^n\}, n=0, 1,\ldots \) be a uniform partition of the time interval. Denote \(u_h^n \sim u(t_n, x)\), \(q_h^n \sim q(t_n, x)\), where \(t_n=n\Delta t\) and \(\Delta t\) is the uniform temporal step size. The algorithm can be summarized in the following steps.

  1. Project \(u_0(x)\) onto \(V_h\) to obtain \(u_h(0)\) and solve (9b) to obtain \(q_h(0)\).

  2. Solve (9a) to obtain \(u_h^{n+1}\) with a Runge–Kutta (RK) ODE solver. Perform reconstruction (38) if needed.

  3. Solve (9b) to obtain \(q_h^{n+1}\) from the obtained \(u_h^{n+1}\).

  4. Repeat steps 2 and 3 until final time T.

In our numerical simulation we choose \(\Delta t=C(k)h^2\), where C(k) is smaller for larger k. For the case with zero potential and \(k=2\), C(k) is given in Theorem 3.4. The choice of the time step \(\Delta t \sim h^2\) suggests that we adopt an \(m^{th}\) order Runge–Kutta solver with \(m\ge (k+1)/2\), so that in the accuracy test the temporal error is smaller than the spatial error. For polynomials of degree \(k=1,2,3\), we use the second order explicit Runge–Kutta method (also called Heun’s method) to solve the ODE system \(\dot{a}=\mathfrak {L}(\mathbf a )\):

$$\begin{aligned} \mathbf{a}^{(1)}&= \mathbf{a}^n + \Delta t \mathfrak {L}(\mathbf{a}^n),\\ \mathbf{a}^{n+1}&= \frac{1}{2} \mathbf{a}^n + \frac{1}{2} \mathbf{a}^{(1)}+ \frac{1}{2} \Delta t\mathfrak {L}( \mathbf{a}^{(1)}). \end{aligned}$$

The bound preserving property for cell averages in Theorem 3.4, which relies on a convex combination of polynomial values at the previous time step, works well with the above Runge–Kutta solver since the latter is simply a convex combination of forward Euler steps.
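The two-stage scheme above can be written so that the convex-combination structure is visible. The sketch below is illustrative only: `rhs` stands for the operator \(\mathfrak {L}\), and the optional `limiter` callback (a hypothetical hook for a reconstruction such as (38)) is an assumption, not part of the scheme as stated.

```python
import numpy as np

def heun_step(a, dt, rhs, limiter=None):
    """One step of the second order Runge-Kutta scheme for da/dt = rhs(a)."""
    post = limiter if limiter is not None else (lambda v: v)
    a1 = post(a + dt * rhs(a))        # first forward Euler stage, a^(1)
    a2 = a1 + dt * rhs(a1)            # forward Euler step starting from a^(1)
    return post(0.5 * a + 0.5 * a2)   # average of a^n and the second Euler update

# Example on the linear test problem da/dt = -a.
a_new = heun_step(np.array([1.0]), 0.1, lambda v: -v)
```

For the test problem one step reproduces the second order Taylor polynomial \(1-\Delta t+\Delta t^2/2\) of the exact decay factor.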

4.3 Spatial Discretization

In this section, we present some further details on the spatial discretization. The basis functions on the 1-D standard reference element \(\xi \in [-1, 1]\) are taken as the Legendre polynomials \(\{L_i(\xi )\}_{i=0}^k\) of degree up to k. The numerical solutions in each cell \(x\in I_j\) can then be expressed as

$$\begin{aligned} u_h(x, t) =\sum _{i=0}^ku_j^i(t) L_i(\xi )=: L^\top (\xi )u_j(t), \quad q_h(x, t) =\sum _{i=0}^kq_j^i(t)L_i(\xi )=: L^\top (\xi )q_j(t), \end{aligned}$$

using a uniform mesh size h and the map \(x=x_j+\frac{h}{2}\xi \), with notation \(L^\top =(L_0, L_1, \cdots , L_k)\) and \(u_j=(u_j^0, \ldots , u_j^k)^\top \).

For given \(\Phi (x)\), a simple calculation of (9a) with \(v=L(\xi )\) gives

$$\begin{aligned} M \dot{u}_j=\frac{2}{h} R_1+\frac{1}{2h}( R_2+R_3), \quad 2\le j \le N-1, \end{aligned}$$
(39)

where

$$\begin{aligned} M&= \frac{h}{2}\int _{-1}^1 L(\xi )L^\top (\xi )d\xi ,\\ R_1&=- \sum _{i=1}^{Q} \omega _i f\left( L^\top (s_i) u_j(t) \right) L_\xi ^\top (s_i)q_j L_\xi (s_i),\\ R_2&=\left( f\left( L^\top (1)u_j\right) +f\left( L^\top (-1)u_{j+1}\right) \right) \left( - D^\top q_j+ E^\top q_{j+1}\right) L(1) \\&\qquad - \left( f\left( L^\top (1)u_{j-1}\right) +f\left( L^\top (-1)u_j\right) \right) \left( - D^\top q_{j-1}+ E^\top q_{j}\right) L(-1)=R_2^+-R_2^-, \\ R_3&=\left( f\left( L^\top (1)u_j\right) + f\left( L^\top (-1)u_{j+1}\right) \right) \left( L^\top (1) q_j- L^\top (-1)q_{j+1}\right) L_\xi (1)\\&\qquad + \left( f\left( L^\top (1)u_{j-1}\right) + f\left( L^\top (-1)u_{j}\right) \right) \left( L^\top (1) q_{j-1}- L^\top (-1)q_{j}\right) L_\xi (-1)\\&\qquad =: R_3^+ + R_3^-. \end{aligned}$$

Here

$$\begin{aligned} D=\beta _0L(1)-L_\xi (1)+4\beta _1L_{\xi \xi }(1), \quad E =\beta _0L(-1)+L_\xi (-1)+4\beta _1L_{\xi \xi }(-1). \end{aligned}$$
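The vectors D and E only involve values and derivatives of the Legendre basis at \(\xi =\pm 1\), so they can be tabulated once. The following sketch evaluates them with NumPy's Legendre utilities; the helper names `basis_deriv` and `flux_vectors` are hypothetical, and the parameter values in the example are chosen only for illustration.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def basis_deriv(k, xi, order):
    """Vector of d^order L_i / d xi^order evaluated at xi, for i = 0..k."""
    vals = np.empty(k + 1)
    for i in range(k + 1):
        c = np.zeros(k + 1)
        c[i] = 1.0                                   # coefficients of L_i
        vals[i] = leg.legval(xi, leg.legder(c, order))
    return vals

def flux_vectors(k, beta0, beta1):
    """D and E as defined above, in the Legendre basis of degree k."""
    D = (beta0 * basis_deriv(k, 1.0, 0) - basis_deriv(k, 1.0, 1)
         + 4.0 * beta1 * basis_deriv(k, 1.0, 2))
    E = (beta0 * basis_deriv(k, -1.0, 0) + basis_deriv(k, -1.0, 1)
         + 4.0 * beta1 * basis_deriv(k, -1.0, 2))
    return D, E

# Example: k = 2 with (beta0, beta1) = (2, 1/6).
D, E = flux_vectors(2, 2.0, 1.0 / 6.0)
```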

In the evaluation of \(R_1\), we choose Q Gaussian quadrature points \(s_i \in [-1, 1]\) with \(1\le i\le Q\). Here and in what follows, we choose Q quadrature points with \(Q\ge \frac{k+2}{2}\) so that the quadrature rule with accuracy of order \(\mathcal {O}(h^{2Q})\) does not destroy the scheme accuracy. At two end cells, if the zero flux conditions are specified, we use \(R_2=R_2^+, R_3=R_3^+\) for \(j=1\) and \(R_2=-R_2^-, R_3= R_3^-\) for \(j=N\).

If Dirichlet boundary conditions, u(a) and u(b), are specified, we modify \(R_2\) and \(R_3\) according to (11). That is, for \(j=1\),

$$\begin{aligned} R_2&=R_2^+ - (f(u(a)) +f(L^\top (-1)u_{1}))\left[ \beta _0(L^\top (-1)q_1-\Phi (a)-H'(u(a))) \right. \\&\quad \left. + 2L_\xi ^\top (-1)q_1 \right] L(-1), \\ R_3&=R_3^+ + (f(u(a)) +f(L^\top (-1)u_{1})) \left[ \Phi (a)+H'(u(a))-L^\top (-1)q_1\right] L_\xi (-1), \end{aligned}$$

and for \(j=N\),

$$\begin{aligned} R_2&=(f(L^\top (1)u_N) +f(u(b))) \left[ -\beta _0(L^\top (1)q_N-\Phi (b)-H'(u(b))) \right. \\&\quad \left. + 2L_\xi ^\top (1)q_N \right] L(1)-R_2^-, \\ R_3&= (f(L^\top (1)u_N) +f(u(b))) \left[ L^\top (1)q_N-\Phi (b)-H'(u(b))\right] L_\xi (1)+ R_3^-. \end{aligned}$$

Using the Q-point Gauss quadrature rule on the interval \((-1, 1)\), solving (9b) amounts to solving

$$\begin{aligned} M q_j=\frac{h}{2} \sum _{i=1}^{Q} \omega _i (\Phi (x(s_i))+H'(L^\top (s_i)u_j))L(s_i). \end{aligned}$$
(40)

The collection of (39) and (40) with \(1\le j\le N\) forms a nonlinear ODE system, for which we use a Runge–Kutta method.
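As an illustration of (40), the cell-wise projection can be sketched as follows. Since the Legendre basis is orthogonal, the mass matrix is diagonal with entries \(\frac{h}{2}\cdot \frac{2}{2i+1}\), which the sketch uses directly. The function name `project_q` and its arguments are hypothetical; `Phi` and `dH` are user-supplied callables for \(\Phi \) and \(H'\).

```python
import numpy as np
from numpy.polynomial import legendre as leg

def project_q(u_coef, x_center, h, Phi, dH, Q):
    """Legendre coefficients of q_h on one cell from those of u_h, via (40)."""
    u_coef = np.asarray(u_coef, dtype=float)
    k = len(u_coef) - 1
    s, w = leg.leggauss(Q)                 # Gauss points and weights on [-1, 1]
    V = leg.legvander(s, k)                # V[i, m] = L_m(s_i)
    x = x_center + 0.5 * h * s             # map x = x_j + (h/2) xi
    g = Phi(x) + dH(V @ u_coef)            # Phi(x(s_i)) + H'(u_h(s_i))
    rhs = 0.5 * h * (V.T @ (w * g))
    m_diag = 0.5 * h * 2.0 / (2.0 * np.arange(k + 1) + 1.0)
    return rhs / m_diag

# Example: quadratic potential and H'(u) = u with k = 2, so q_h is exact.
q = project_q([1.0, 0.5, 0.0], 0.3, 0.2, lambda x: x**2, lambda u: u, Q=3)
```

In this example \(\Phi +H'(u_h)\) is itself a quadratic polynomial, so the projection reproduces it exactly when the quadrature is exact for the products involved (here \(Q=3\)).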

5 Numerical Tests

In this section, we present a selected set of numerical examples in order to numerically validate our ESDG scheme. Via several physical models from different applications, we examine the order of accuracy by numerical convergence tests, while we quantify \(l_1\) errors defined by

$$\begin{aligned} \Vert u_h-u_{ref}\Vert _{l_1}= \sum _{j=1}^N \int _{I_j}|u_h(x) - u_{ref}(x)|dx, \end{aligned}$$

with the integral on \(I_j\) evaluated by a 4-point Gaussian quadrature method and \(u_{ref}\) being a reference solution obtained by using a refined mesh size. It is also demonstrated that the scheme captures well the long-time behavior of underlying solutions, as well as the mass concentration phenomenon in certain applications.
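The error measure above can be computed as in the following sketch, which applies a 4-point Gauss rule on each cell. The function name `l1_error` and the representation of \(u_h\) and \(u_{ref}\) as vectorized callables are assumptions made for illustration.

```python
import numpy as np

def l1_error(u_h, u_ref, edges, n_gauss=4):
    """Sum over cells of the Gauss-quadrature integral of |u_h - u_ref|."""
    s, w = np.polynomial.legendre.leggauss(n_gauss)
    total = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        x = 0.5 * (a + b) + 0.5 * (b - a) * s      # quadrature points in [a, b]
        total += 0.5 * (b - a) * np.dot(w, np.abs(u_h(x) - u_ref(x)))
    return total

# Example: |x| integrated over [-1, 1] on four uniform cells.
err = l1_error(lambda x: x, lambda x: 0.0 * x, np.linspace(-1.0, 1.0, 5))
```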

5.1 Porous Medium Equation

We consider the porous medium equation of the form

$$\begin{aligned} \partial _t u =\partial _x^2 (u^m), \quad m>1. \end{aligned}$$
(41)

With this model we will illustrate 1) the scheme’s capability in capturing the solution singularity; 2) the positivity preservation proved in Theorem 3.4.

Example 1

Capturing singularity. Barenblatt and Pattle independently found an explicit solution of (41) when the Dirac delta function is used as initial condition [3, 25]. A special explicit solution which we will use is

$$\begin{aligned} B_m(x, t)=\max \left\{ 0, t^{-\alpha }\left( 0.2-\frac{\alpha (m-1)}{2m}\frac{|x|^2}{t^{2\alpha }} \right) ^{\frac{1}{m-1}}\right\} ,\quad \alpha =\frac{1}{m+1}. \end{aligned}$$
(42)

We compute the solution of (41) with initial data \(u_0(x)=B_2(x, 0.1)\), with zero flux boundary conditions \(\partial _x u(\pm 2, t)=0\).
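For reference, the profile (42) can be evaluated as in the sketch below. The only liberty taken is that the bracket is clipped at zero before the power is applied, which agrees with (42) wherever the bracket is nonnegative and avoids fractional powers of negative numbers; the function name `barenblatt` is hypothetical.

```python
import numpy as np

def barenblatt(x, t, m):
    """Barenblatt-Pattle profile B_m(x, t) from (42) in one space dimension."""
    alpha = 1.0 / (m + 1.0)
    base = 0.2 - alpha * (m - 1.0) / (2.0 * m) * np.abs(x) ** 2 / t ** (2.0 * alpha)
    return t ** (-alpha) * np.maximum(base, 0.0) ** (1.0 / (m - 1.0))

# Initial data u_0(x) = B_2(x, 0.1) sampled on [-2, 2].
x = np.linspace(-2.0, 2.0, 4001)
u0 = barenblatt(x, 0.1, 2)
```

A quick sanity check is that the total mass of the sampled profile is the same at \(t=0.1\) and \(t=0.5\), as long as the support stays inside \([-2, 2]\).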

Figure 1 shows the exact solution and \(P^2\) numerical solutions without and with reconstruction (38) with \(\delta \) set to be 0. This reconstruction is not applied to the cells where the \(u_h\) are entirely zero. The scheme with reconstruction gives sharp resolution of expanding fronts, keeping the solution strictly within the initial bounds. The scheme without reconstruction brings visible undershoots near the foot of the numerical solution.

Figure 2 shows a numerical comparison for polynomials with different degrees, \(k=1,2,3\). Cell averages are shown in Fig. 2 (left) and cell polynomials in Fig. 2 (right) (zoomed near the singularity). We can clearly see that a higher order method gives a more accurate approximation.

Fig. 2 Comparison of solutions for \(k=1,2,3\)

Example 2

Positivity preservation. In this example we test the effect of different values of the parameter \(\beta _1\) on positivity preservation. Equation (41) with \(m=2\), when written in the form

$$\begin{aligned} \partial _tu=\partial _x(f(u)\partial _xq),\quad f(u)=2u,\quad q=u, \end{aligned}$$

satisfies the requirements in Theorem 3.4. We consider positive initial data with small amplitude,

$$\begin{aligned} u_0(x)=\epsilon (1+30e^{-25x^2}),\quad x\in [-1,1], \end{aligned}$$

and zero flux boundary conditions \(\partial _x u(\pm 1, t)=0\). With \(\epsilon =10^{-5}\), \(\delta =10^{-10}\), \(h=0.2\), \(k=2\) and \(\Delta t=0.25h^2\) in the simulation, our results indicate that cell average \(\bar{u}\) remains above \(\delta \) at \(t=1000\) when using \((\beta _0, \beta _1)=(2, 1/6)\); while \(\bar{u}\) already becomes negative at \(t=41.388\) when taking \((\beta _0, \beta _1)=(2, 0)\). This is consistent with the conclusion in Theorem 3.4 that \( \beta _1\in (1/8, 1/4)\) is sufficient for positivity preservation of cell averages, and for any other \(\beta _1\)’s such a property is not guaranteed. We note here that the range of \(\beta _1\) in Theorem 3.4 is only sufficient. Our simulation also indicates that cell average \(\bar{u}\) still remains above \(\delta \) at \(t=1000\) when using \((\beta _0, \beta _1)=(2, 1/2)\), which does not satisfy the requirement in Theorem 3.4.

We further test the special effect of parameter \(\beta _1\) on the positivity preservation for the case with nontrivial potential, \(\Phi =30\epsilon x^2/2\), i.e., we have

$$\begin{aligned} \partial _t u = \partial _x(f(u)\partial _xq),\quad f(u)=2u,\quad q=u+30\epsilon x^2/2. \end{aligned}$$

Though Theorem 3.4 is no longer applicable due to the nonzero potential, we still see similar effects of \(\beta _1\) through numerical experiments. With the same initial condition and parameters as above, our simulation results in Table 1 show that there is a range for \(\beta _1\) in which \(\bar{u}\) remains above \(\delta \) at \(t=1000\); while \(\bar{u}\) becomes negative at \(t<1000\) when \(\beta _1\le 1/6\) or \(\beta _1\ge 2\). This observation indicates that 1) \(\beta _1\) plays a special role for the positivity preservation; 2) the admissibility of \(\beta _1\) depends on the underlying problem.

Table 1 Time when \(\bar{u}\) becomes negative

5.2 Porous Medium Equation with Linear Convection

We consider the following porous medium equation with linear convection

$$\begin{aligned} \partial _ t u =\partial _x^2 (u^m) + \partial _x u, \quad m>1. \end{aligned}$$

This equation corresponds to (1a) with \(f(u)=u\), \(\Phi =x\) and \(H=\frac{u^m}{m-1}\), and has a wide range of applications. With this model equation we shall test the numerical convergence and the scheme accuracy. We note that the case \(m=2\) was tested in [5] with a second order finite volume scheme.
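To make the correspondence explicit, the following short computation substitutes the stated choices of f, \(\Phi \) and H into (1a) in one space dimension; it is a verification sketch and uses nothing beyond those choices.

$$\begin{aligned} H'(u)&=\frac{m}{m-1}u^{m-1},\qquad \partial _x H'(u)=m u^{m-2}\partial _x u,\\ u\,\partial _x(\Phi +H'(u))&=u+m u^{m-1}\partial _x u = u+\partial _x(u^m),\\ \partial _t u&=\partial _x\left( u+\partial _x(u^m)\right) =\partial _x^2(u^m)+\partial _x u. \end{aligned}$$

In particular, the zero-flux condition reads \(u+\partial _x(u^m)=0\) at the boundary, which for \(u>0\) gives \(\partial _x u=-\frac{1}{2}\) when \(m=2\) and \(u\partial _x u=-\frac{1}{3}\) when \(m=3\).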

Example 3

(m=2). We consider

$$\begin{aligned} \partial _ t u =\partial _x^2(u^2)+\partial _xu, \end{aligned}$$

with initial data

$$\begin{aligned} u_0(x) = 0.5 + 0.5 \sin (\pi x), \quad x \in [-1, 1], \end{aligned}$$

subject to zero-flux boundary condition, that is \(\partial _xu(\pm 1, t)=-\frac{1}{2}\). In Table 2 we observe that the orders of convergence are of \(\mathcal {O}(h^{k+1})\) for polynomials of degree k (\(k=1,2,3\)).

Table 2 Error table for porous medium equation with \(m=2\) at \(t=1\)

Example 4

(m=3). We further test the case \(m=3\), i.e.,

$$\begin{aligned} \partial _ t u =\partial _x^2(u^3)+\partial _xu, \end{aligned}$$

with initial data

$$\begin{aligned} u_0(x) = 1+ 0.5 \sin (\pi x), \quad x \in [-1, 1], \end{aligned}$$

subject to zero-flux boundary conditions \((uu_x)(\pm 1, t)=-1/3\). The numerical convergence test is performed with the same flux parameters for each k as in the previous example; both errors and orders of convergence are given in Table 3. These results further confirm the \((k+1)\)-th order of accuracy when using \(P^{k} (k=1, 2, 3)\) elements.

Numerical tests in Examples 3 and 4 also indicate that cell averages can be made positive at each time step when choosing proper parameters \((\beta _0, \beta _1)\), together with reconstruction (38) performed at each time step.

Table 3 Error table for porous medium equation with \(m=3\) at \(t=1\)

5.3 Nonlinear Diffusion with a Double-Well Potential

Consider a nonlinear diffusion equation with an external double-well potential of the form

$$\begin{aligned} \partial _t u=\partial _x (u\partial _x (\nu u^{m-1}+\Phi )), \quad \Phi =\frac{x^4}{4}-\frac{x^2}{2}. \end{aligned}$$

This model equation is taken from [6], and it corresponds to system (1) with \(H'(u)=\nu u^{m-1}\). With this model we shall test both numerical accuracy and the asymptotic behavior of numerical solutions.

Example 5

Free energy decay. In this example, we take \(\nu =1\), \(m=2\) and initial data

$$\begin{aligned} u_0(x)=\frac{0.1}{\sqrt{0.4\pi }}e^{-\frac{x^2}{0.4}}, \quad x\in [-2, 2], \end{aligned}$$

subject to zero-flux boundary conditions \(\partial _x u(\pm 2, t)=\mp 6\). Both errors and orders of convergence are given in Table 4, which again demonstrates \(\mathcal {O}(h^{k+1})\) order of accuracy for \(P^k\) polynomials.

We also examine the decay of the entropy

$$\begin{aligned} E=\int _{-2}^2 \left( \Phi (x) u +H (u)\right) dx =\int ^2_{-2} \left[ \left( \frac{x^4}{4}-\frac{x^2}{2}\right) u+ \frac{u^2}{2}\right] dx. \end{aligned}$$

Figure 3 (left) shows the semilog plot of the free energy decay until final time \(T=40\), and Fig. 3 (right) displays the snapshots of u at different times, showing the time-asymptotic convergence of the numerical solutions towards the steady states.

Table 4 Error table for nonlinear diffusion with a double-well potential at \(t=1\)

Fig. 3 Entropy decay and solution snapshots of nonlinear diffusion with a double well potential

5.4 The Nonlinear Fokker–Planck Equation

We consider the following model for boson gases,

$$\begin{aligned} \partial _t u=\partial _x (x u(1+u^3)+\partial _x u), \quad t>0, \end{aligned}$$
(43)

which is a nonlinear Fokker–Planck equation corresponding to (1a) with

$$\begin{aligned} \Phi =\frac{x^2}{2},\quad f(u)=u(1+u^3), \quad H'(u)=\log \frac{u}{\root 3 \of {1+u^3}}. \end{aligned}$$
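The consistency of these choices with (43) can be checked directly. The computation below is a verification sketch using only the stated f, \(\Phi \) and \(H'\); the key identity is \(f(u)H''(u)=1\).

$$\begin{aligned} H'(u)&=\log u-\frac{1}{3}\log (1+u^3),\\ H''(u)&=\frac{1}{u}-\frac{u^2}{1+u^3}=\frac{1}{u(1+u^3)}=\frac{1}{f(u)},\\ f(u)\,\partial _x(\Phi +H'(u))&=x\,u(1+u^3)+f(u)H''(u)\,\partial _x u=x\,u(1+u^3)+\partial _x u. \end{aligned}$$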

This model equation exhibits the critical mass phenomenon (see [1]): solutions with initial data of large mass blow up in finite time, whereas solutions with initial data of small mass do not. The authors in [5] numerically verified such critical mass phenomenon using a second order finite volume scheme. With our high order DG scheme, we test the critical mass phenomenon for (43) with initial data

$$\begin{aligned} u_0(x)=\frac{M}{2\sqrt{2\pi }}\left( \exp \left( -\frac{(x-2)^2}{2}\right) + \exp \left( -\frac{(x+2)^2}{2}\right) \right) , \end{aligned}$$

which has total mass M. This is to illustrate the good performance of the ESDG scheme in capturing complex physical phenomena.
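The mass normalization of this initial profile can be confirmed numerically, as in the sketch below. The truncation interval \([-12, 12]\), the grid size and the function names are assumptions chosen for illustration; they are not the computational setup used in the experiments.

```python
import numpy as np

def u0(x, M):
    """Two-bump Gaussian initial data with total mass M on the real line."""
    g = np.exp(-(x - 2.0) ** 2 / 2.0) + np.exp(-(x + 2.0) ** 2 / 2.0)
    return M / (2.0 * np.sqrt(2.0 * np.pi)) * g

def total_mass(M, half_width=12.0, n=4801):
    """Riemann-sum approximation of the integral of u0 over [-half_width, half_width]."""
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    return np.sum(u0(x, M)) * dx

mass_sub = total_mass(1.0)     # sub-critical case M = 1
mass_super = total_mass(10.0)  # super-critical case M = 10
```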

Example 6

Sub-critical mass \(M=1\) and super-critical mass \(M=10\). We test the sub-critical mass \(M=1\) with results in Fig. 4 (left) and super-critical mass \(M=10\) with results in Fig. 4 (right) by \(P^2\) polynomial approximations. These results are consistent with the theoretical conclusion made in [1] and the numerical observation in [5], yet our scheme can produce numerical solutions with higher order of accuracy. Note that the reconstruction (38) has to be implemented due to the involvement of \(\log \)-function in \(H'(u)\).

Fig. 4 Dynamics of the general Fokker–Planck equation

6 Concluding Remarks

In this article, we have developed an entropy satisfying DG method for solving nonlinear Fokker–Planck equations with a gradient flow structure. The idea is to rewrite the equation in the form of a convection equation with flux \(-f(u)\partial _xq\), where q is obtained by a piecewise \(L^2\) projection of \(\Phi (x) +H'(u)\). Then we apply the numerical flux of the DDG method introduced in [20] to \(\partial _x q\). The presented scheme is shown to satisfy a discrete version of the entropy dissipation law, therefore preserving steady states and providing numerical solutions with satisfactory long-time behavior. The positivity of numerical solutions is enforced through a reconstruction algorithm, based on positive cell averages. Cell averages can be made positive at each time step by carefully tuning the numerical flux parameters \((\beta _0, \beta _1)\). For the model with trivial potential, a parameter range sufficient for positivity preservation is rigorously established. Numerical examples include the porous medium equation, the nonlinear diffusion equation with a double-well potential, and the general Fokker–Planck equation. Numerical results have demonstrated high-order accuracy of the scheme. Moreover, the long-time solution behavior is also examined to show the robustness of the proposed scheme.