1 Introduction

In the present paper, we consider the distributed order time-fractional diffusion equation with the corresponding initial and boundary conditions:

$$\begin{aligned}&\mathscr {D}^\omega _t u-\kappa \varDelta u=f(x,t) \quad \forall ~ (x,t)\in Q:=\varOmega \times (0,T], \end{aligned}$$
(1.1a)
$$\begin{aligned}&u|_{\partial \varOmega }=0 \quad \text {for } t\in (0,T], \end{aligned}$$
(1.1b)
$$\begin{aligned}&u(x,0)= {u_0(x)} \quad \text {for } x\in \varOmega , \end{aligned}$$
(1.1c)

where \(\varOmega \subset \mathbb {R}^d ~(d=1,2,3), \kappa \) is a positive constant, and \( f\in C(\overline{Q})\) with \(\overline{Q}:=\varOmega \times [0,T]\). In (1.1a), \(\mathscr {D}^\omega _t u\) denotes the distributed order fractional derivative, which is defined by

$$\begin{aligned} \mathscr {D}^\omega _t u(x,t)=\int _0^\beta \omega (\alpha )D_t^\alpha u(x,t)\,d\alpha , \quad 0<\beta \le 1, \end{aligned}$$
(1.2)

where \(\omega (\alpha )\ge 0\), \(\int _0^\beta \omega (\alpha )\mathrm{d}\alpha =c_0>0\), and \(D_t^\alpha u\) \((0<\alpha <1)\) is the Caputo fractional derivative of order \(\alpha \), defined by

$$\begin{aligned} D_t^\alpha u(x,t)=\frac{1}{\varGamma (1-\alpha )}\int _0^t(t-s)^{-\alpha }\frac{\partial u(x,s)}{\partial s}\,ds, \quad t>0. \end{aligned}$$

The analytic solutions of the distributed order time-fractional diffusion equation have been studied by many researchers [13, 24, 29, 31]. However, exact solutions can be written down for only a few problems, and most of these solutions consist of special functions (the Mittag-Leffler function, the Wright function, etc.) that are not easy to compute. It is therefore necessary to develop efficient numerical methods for the distributed order time-fractional diffusion equation. Alikhanov [2] presented a priori estimates for the multi-term variable-distributed order diffusion equation by the method of energy inequalities and investigated a difference scheme to solve it. Ye et al. [43] proposed a compact difference scheme for the problem (1.1) and proved its stability and optimal convergence. The numerical analysis of a finite difference method for the time distributed-order and Riesz space fractional diffusion equation was presented in [44]. Chen et al. [9] developed a fully discrete spectral method for the distributed order time-fractional reaction-diffusion equation that achieves spectral accuracy. Bu et al. [8] investigated three efficient fully discrete finite element schemes to solve problem (1.1). Li et al. [23] developed two alternating direction implicit Galerkin-Legendre spectral methods for distributed-order differential equations in two-dimensional space. Samiee et al. [34, 35] proposed a unified and fast Petrov-Galerkin spectral method for distributed-order partial differential equations, where Jacobi poly-fractonomials and Legendre polynomials were employed as temporal and spatial basis/test functions, respectively. Furthermore, some recent developments are given in [1, 38, 42].

It is worth noting that the analysis of the above schemes is based on the assumption that the solution is sufficiently smooth in time. However, this assumption is unrealistic: the solution of a time-fractional partial differential equation typically exhibits a weak singularity near the initial time. McLean [30] investigated the regularity of solutions of time-fractional diffusion equations and identified this singular behavior. Stynes et al. [39] analysed the L1 scheme on a graded mesh to resolve the weak singularity of the time-fractional reaction-diffusion equation. Liao et al. [27] developed a discrete fractional Grönwall inequality on nonuniform meshes, which can be used to analyse time-fractional nonlinear problems with weakly singular solutions. Ren and Chen [32] investigated a finite difference/spectral method to approximate a distributed order time-fractional diffusion equation with an initial singularity on a two-dimensional spatial domain, but their error bound blows up as \(\beta \rightarrow 1^-\). Bu et al. [7] proposed a space-time finite element method for the distributed order time-fractional reaction-diffusion equation with a weakly singular solution. Moreover, there are other relevant works on weakly singular solutions of time-fractional partial differential equations, e.g. the finite difference method [22, 26, 36], the finite element method [3, 20, 21, 41], the discontinuous Galerkin method [4, 5, 14, 15, 17, 33], and the collocation method [19, 25].

Let p be a non-negative integer. Assume that \(u_0 \in D({\varDelta }^{{p}+2})\) and \(\partial _t^l f(\cdot ,t) \in D({\varDelta }^{{p}})\) for \(l=1,2\), where the fractional power \(\varDelta ^p\) is defined in [16, p.3]. Following [16, Theorem 2.1], one obtains that the solution of the initial-boundary value problem (1.1) satisfies

$$\begin{aligned} \big \Vert u(\cdot , t)\big \Vert _{p} \le C, ~~\big \Vert \partial _t^l u(\cdot , t)\big \Vert _p \le C(1+t^{\sigma -l}),~~\big \Vert \mathscr {D}^\omega _t u(\cdot , t)\big \Vert _{p} \le C \end{aligned}$$
(1.3)

with \(l=0,1,2\), and \(0< \sigma <1\).

In this paper, we construct a finite difference/finite element method to solve the initial-boundary value problem (1.1), whose solution exhibits a weak singularity of the form (1.3). In order to obtain a sharp \(H^1\)-norm convergence result, the fully discrete L1 finite element method, first stated in integral (weak) form, is rewritten in differential form. By establishing a \(\beta \)-robust discrete Grönwall inequality, \(\beta \)-robust \(H^1\)-norm stability and convergence results are obtained. Furthermore, a superconvergence result in space is derived.

The rest of the paper is organized as follows. In Sect. 2, several operators are introduced, which will be used in the subsequent error estimates. In Sect. 3, we construct a fully discrete scheme based on the L1 scheme in time and the finite element method in space. Sharp \(H^1\)-norm stability and convergence results are presented in Sect. 4, but these bounds blow up as \(\beta \rightarrow 1^-\). To overcome this, a new analysis is presented in Sect. 5, where \(\beta \)-robust stability and convergence results are obtained by using a new \(\beta \)-robust Grönwall inequality. The superconvergence result in space is given in Sect. 6. Finally, in Sect. 7, numerical experiments are presented to illustrate the sharpness of our theoretical analysis.

Notation. C denotes a generic constant that is independent of the mesh parameters N and h and can take different values in different places. We write \(\Vert \cdot \Vert _\infty \) and \(\Vert \cdot \Vert \) for the norms in \(L^\infty (\varOmega )\) and \(L^2(\varOmega )\) respectively. For each \(m\in \mathbb {N}\), the notation \(H^m(\varOmega )\) is used for the standard Sobolev space with its associated norm \(\Vert \cdot \Vert _m\) and seminorm \(|\cdot |_m\). The \(L^2(\varOmega )\) inner product is denoted by \((\cdot ,\cdot )\).

2 Preliminaries

Recall that \(\varOmega \subset \mathbb {R}^d\). To construct a standard finite element space, we write \(Q_k\) for the space of polynomials of degree k in d variables. Denote the reference element by \(\widehat{K}:=[0,1]^d\). Let \(\mathscr {T}_h\) be a quasiuniform partition of \(\varOmega \) (see Figure 1) into elements \(K_m\) for \(m=1,\ldots , M\), where each \(K_m = q_m(\widehat{K})\) for some \(q_m\in Q_k\). The standard mapped \(Q_k\) functions are used on each element (see, e.g., [12, Section 3.7]); that is, we set

$$\begin{aligned} V_h&:= \left\{ v_h\in H^1(\varOmega ): v_h\big | _{K_m}=\xi \circ q_m^{-1}~\text {with}~\xi \in Q_k(\widehat{K})~\text {and}~ q_m: \widehat{K}\rightarrow K_m\right\} ,\\ V_{0h}&:=\left\{ v_h\in V_h: v_h| _{\partial \varOmega }=0\right\} , \end{aligned}$$

where \(h:=\max _{1\le m\le M} \{\text {diam}(K_m)\}\) is the mesh diameter.

Fig. 1 Quasiuniform partition of \(\varOmega \)

Next we will introduce three operators. First, define the \(L^2\) projector \(P_h: L^2(\varOmega ) \rightarrow V_{0h}\) by

$$\begin{aligned} (P_hw, v_h) = (w,v_h)~\forall ~v_h\in V_{0h}. \end{aligned}$$
(2.1)

One can show [6] that

$$\begin{aligned} \Vert \nabla P_h v\Vert \le C_p\Vert \nabla v\Vert ~\forall ~ v\in H^1(\varOmega ), \end{aligned}$$
(2.2)

where \(C_p\) is a positive constant independent of the mesh size h.

The Ritz projector \(R_h: H_0^1(\varOmega )\rightarrow V_{0h}\) is defined by \(\left( \nabla R_hw,\nabla v_h\right) =\left( \nabla w,\nabla v_h\right) \) for all \(v_h\in V_{0h}\). From [40, Lemma 1.1] we get

$$\begin{aligned} \Vert w-R_hw\Vert +h\Vert w-R_hw\Vert _1\le Ch^{k+1}|w|_{k+1}~~\forall ~w\in H^{k+1}(\varOmega ) \cap H_0^1(\varOmega ). \end{aligned}$$
(2.3)

Next, we define the discrete Laplacian \(\varDelta _h: V_{0h}\rightarrow V_{0h}\) [40, (1.33)] by

$$\begin{aligned} (\varDelta _h v, w)=-(\nabla v,\nabla w) ~~\forall ~v,w \in V_{0h}. \end{aligned}$$
(2.4)

Imitating [40, (1.34)], we get that these three operators are related by

$$\begin{aligned} \varDelta _h R_h v = P_h\varDelta v ~ \forall ~v\in H^2(\varOmega ). \end{aligned}$$
(2.5)

3 The L1 FEM Method

In this section, we approximate the problem (1.1) using the finite element method in space and the well-known L1 difference scheme in time, on a mesh that is quasiuniform in space and graded in time.

3.1 Temporal Discretisation

Let q be a positive integer. Divide the interval \([0,\beta ]\) into q subintervals of width \(h_\alpha =\beta /q\). Denote \(\varvec{\alpha }:=(\alpha _1,\ldots , \alpha _q)\) and \(\mathbb {D}_t^{\varvec{\alpha }}u:=h_\alpha \sum _{s=1}^q\omega (\alpha _s)D_t^{\alpha _s}u\) with \(\alpha _s=(s-1/2)h_\alpha \) for \(s=1,\ldots , q\). Note that \(\alpha _s\) is the midpoint of the cell \([(s-1)h_\alpha ,s h_\alpha ]\). Thus the distributed order fractional derivative can be approximated by the multi-term fractional derivative

$$\begin{aligned} \mathscr {D}^\omega _t u=\mathbb {D}_t^{\varvec{\alpha }}u+R(t), \end{aligned}$$
(3.1)

where \(R(t):=\mathscr {D}^\omega _t u-\mathbb {D}_t^{\varvec{\alpha }}u\) denotes the approximation error. Under the conditions \(\omega (\alpha )\in C^2[0,\beta ]\) and \(D_t^\alpha u(\cdot ,\cdot )\in C^2[0,\beta ]\), one has \(\Vert R(t)\Vert \le Ch^2_\alpha \) by the composite midpoint rule for numerical integration [11, (5.1.19)].
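The second-order behaviour of the midpoint rule in the variable \(\alpha \) can be observed on a simple example. The following sketch is only an illustration under stated assumptions: it takes \(u(t)=t\), for which \(D_t^\alpha u=t^{1-\alpha }/\varGamma (2-\alpha )\), and the hypothetical weight \(\omega (\alpha )=\varGamma (2-\alpha )\), so that the integrand in (1.2) reduces to \(t^{1-\alpha }\) and the integral is known in closed form. The function names and the values of \(\beta \), t and q are not taken from the paper.

```python
import math

def midpoint_nodes(beta, q):
    """Return h_alpha and the cell centres alpha_s = (s - 1/2) * h_alpha."""
    h_alpha = beta / q
    return h_alpha, [(s - 0.5) * h_alpha for s in range(1, q + 1)]

def caputo_of_t(alpha, t):
    """Caputo derivative of order alpha of u(t) = t: t^(1-alpha) / Gamma(2-alpha)."""
    return t ** (1.0 - alpha) / math.gamma(2.0 - alpha)

def multi_term(omega, beta, q, t):
    """Midpoint sum h_alpha * sum_s omega(alpha_s) * D_t^{alpha_s} u(t) for u(t) = t."""
    h_alpha, nodes = midpoint_nodes(beta, q)
    return h_alpha * sum(omega(a) * caputo_of_t(a, t) for a in nodes)

def omega(a):
    """Hypothetical weight chosen so that the integrand becomes t^(1 - alpha)."""
    return math.gamma(2.0 - a)

beta, t = 0.8, 0.5
# Closed form of the integral of t^(1 - alpha) over alpha in [0, beta].
exact = t * (t ** (-beta) - 1.0) / (-math.log(t))

errors = [abs(multi_term(omega, beta, q, t) - exact) for q in (8, 16, 32)]
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print(errors, ratios)  # ratios near 4 are consistent with O(h_alpha^2)
```

Halving \(h_\alpha \) should reduce the error by a factor close to four, which is what the bound \(\Vert R(t)\Vert \le Ch_\alpha ^2\) predicts for smooth integrands.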

Let N be a positive integer. Set the temporal mesh \(t_n=T(n/N)^r\) for \(n=0,1,\dots ,N\), where the constant r satisfies \(r\ge 1\). Denote the time step \(\tau _n:=t_n-t_{n-1}\) for \(n=1,\dots ,N\).

The Caputo fractional derivative \(D_t^\alpha u(\cdot ,t_n)\) with \(0<\alpha <1\) is approximated by the well-known L1 approximation of [39, (3.1)]

$$\begin{aligned} D_N^\alpha u^n:=d_{n,1}^{(\alpha )} u^n-d_{n,n}^{(\alpha )} u^0-\sum _{i=1}^{n-1}\left( d_{n,i}^{(\alpha )}-d_{n,i+1}^{(\alpha )}\right) u^{n-i}, \end{aligned}$$
(3.2)

where

$$\begin{aligned} d_{n,i}^{(\alpha )}:=\frac{\tau _{n-i+1}^{-1}}{\varGamma (1-\alpha )}\int _{t_{n-i}}^{t_{n-i+1}}(t_n-\eta )^{-\alpha }d\eta ~~ \text {for}~~ i=1,\ldots ,n. \end{aligned}$$
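Since the integral in the definition of \(d_{n,i}^{(\alpha )}\) can be evaluated in closed form, one has \(d_{n,i}^{(\alpha )}=\big [(t_n-t_{n-i})^{1-\alpha }-(t_n-t_{n-i+1})^{1-\alpha }\big ]/\big (\tau _{n-i+1}\varGamma (2-\alpha )\big )\). A minimal sketch of the graded mesh and the L1 approximation (3.2) follows; the function names and parameter values are illustrative assumptions. The check uses the fact that the L1 approximation interpolates u piecewise linearly, so it reproduces \(D_t^\alpha t=t^{1-\alpha }/\varGamma (2-\alpha )\) up to rounding.

```python
import math

def graded_mesh(T, N, r):
    """Temporal mesh t_n = T * (n / N)^r for n = 0, ..., N."""
    return [T * (n / N) ** r for n in range(N + 1)]

def l1_coeff(t, n, i, alpha):
    """d_{n,i}^{(alpha)} via the closed-form integral of (t_n - eta)^(-alpha)."""
    tau = t[n - i + 1] - t[n - i]
    num = (t[n] - t[n - i]) ** (1.0 - alpha) - (t[n] - t[n - i + 1]) ** (1.0 - alpha)
    return num / (tau * math.gamma(2.0 - alpha))

def l1_derivative(u, t, n, alpha):
    """D_N^alpha u^n as in (3.2), for grid values u[0], ..., u[n]."""
    d = [None] + [l1_coeff(t, n, i, alpha) for i in range(1, n + 1)]  # 1-based
    total = d[1] * u[n] - d[n] * u[0]
    for i in range(1, n):
        total -= (d[i] - d[i + 1]) * u[n - i]
    return total

# Illustrative parameters (not from the paper).
T, N, r, alpha = 1.0, 20, 2.0, 0.4
t = graded_mesh(T, N, r)
u = list(t)  # grid values of u(t) = t

max_err = max(
    abs(l1_derivative(u, t, n, alpha) - t[n] ** (1.0 - alpha) / math.gamma(2.0 - alpha))
    for n in range(1, N + 1)
)
print(max_err)  # rounding-level error, since L1 is exact for linear u
```

Replacing the grid values by those of \(t^\sigma \) with \(\sigma \in (0,1)\) gives a way to experiment with the effect of the grading exponent r on the truncation error.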

Denote

$$\begin{aligned} d_{n,i}:=h_ \alpha \sum _{s=1}^q\omega (\alpha _s) d_{n,i}^{(\alpha _s)}. \end{aligned}$$
(3.3)

Thus the distributed order fractional derivative (1.2) can be approximated by

$$\begin{aligned} \mathbb {D}_N^{\varvec{\alpha }}u^n=d_{n,1}u^n-d_{n,n}u^0-\sum _{i=1}^{n-1}(d_{n,i}-d_{n,i+1})u^{n-i}. \end{aligned}$$
(3.4)

It is easy to see that

$$\begin{aligned} {0<}d_{n,i+1}<d_{n,i}~~\text { for }~ i\ge 1. \end{aligned}$$
(3.5)
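Property (3.5) holds because \(d_{n,i}^{(\alpha )}\) is the mean value of the decreasing kernel \((t_n-\eta )^{-\alpha }/\varGamma (1-\alpha )\) over a cell that moves away from \(t_n\) as i increases. The sketch below assembles the coefficients (3.3) and checks (3.5) numerically; the choice \(\omega \equiv 1\), the parameter values and the function names are assumptions made only for illustration.

```python
import math

def graded_mesh(T, N, r):
    """Temporal mesh t_n = T * (n / N)^r."""
    return [T * (n / N) ** r for n in range(N + 1)]

def l1_coeff(t, n, i, alpha):
    """Single-order L1 coefficient d_{n,i}^{(alpha)} in closed form."""
    tau = t[n - i + 1] - t[n - i]
    num = (t[n] - t[n - i]) ** (1.0 - alpha) - (t[n] - t[n - i + 1]) ** (1.0 - alpha)
    return num / (tau * math.gamma(2.0 - alpha))

def distributed_coeff(t, n, i, nodes, weights):
    """d_{n,i} from (3.3); weights[s] stands for h_alpha * omega(alpha_s)."""
    return sum(w * l1_coeff(t, n, i, a) for a, w in zip(nodes, weights))

# Illustrative data: beta = 1, q = 8, omega == 1 (hypothetical weight).
beta, q, N = 1.0, 8, 12
h_alpha = beta / q
nodes = [(s - 0.5) * h_alpha for s in range(1, q + 1)]
weights = [h_alpha * 1.0 for _ in nodes]
t = graded_mesh(1.0, N, 2.0)

# d[n][i - 1] holds d_{n,i} for i = 1, ..., n.
d = {n: [distributed_coeff(t, n, i, nodes, weights) for i in range(1, n + 1)]
     for n in range(1, N + 1)}

# Check 0 < d_{n,i+1} < d_{n,i} for all admissible n and i.
monotone = all(0.0 < d[n][i] < d[n][i - 1] for n in d for i in range(1, n))
print(monotone)
```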

Next, we present three lemmas that will be used in our later analysis.

Lemma 3.1

[32, Lemma 2.2] For any grid function \(\{v^n\}_{n=0}^N\), one has

$$\begin{aligned} |v^n|\le |v^0|+\varGamma (1-\alpha _q)\max _{j=1,\ldots ,n}\left\{ \left( h_\alpha \sum _{s=1}^q \omega (\alpha _s)t_j^{-\alpha _s}\right) ^{-1}\mathbb {D}_N^{\varvec{\alpha }}|v^j|\right\} ~~\text {for}~ n=1,\ldots ,N. \end{aligned}$$

Lemma 3.2

[32, Lemma 2.3] Suppose that \(h_\alpha \) is sufficiently small. Then

$$\begin{aligned} h_\alpha \sum _{s=1}^q \omega (\alpha _s)t_j^{-\alpha _s}\ge \frac{c_0}{2} \min \{t_j^{-\alpha _1}, t_j^{-\alpha _q}\}~~ \text {for} ~j=1,\ldots ,N. \end{aligned}$$

In the rest of the paper, we always take \(h_\alpha \) sufficiently small to ensure that Lemma 3.2 is valid.

Lemma 3.3

Let \(\sigma \in (0,1)\) and \(\alpha \in (0,1)\). Assume that \(\Vert u^{(l)}(\cdot , t)\Vert _1\le C(1+t^{\sigma -l})\) for \(l=0,1,2\), and \(t\in (0,T]\). Then

$$\begin{aligned} \left\| D_N^{\alpha }u(t_n)-D_t^{\alpha }u(t_n)\right\| _1\le C t_n^{-\alpha } N^{-\min \{r\sigma , 2-\alpha \}}~~\text {for}~ n=1,\ldots ,N, \end{aligned}$$

where C is a constant.

Proof

The result follows from [32, Lemma 2.4] and [32, Lemma 2.6]. \(\square \)

3.2 The Fully Discrete L1 FEM

First, we use the standard finite element method (FEM) to discretize (1.1a) in space. A weak formulation of (1.1) is: Find \(u(\cdot ,t)\in H^1_0(\varOmega )\) for each \(t\in (0,T]\), such that

$$\begin{aligned} \big (\mathscr {D}^\omega _t u ,v\big )+\kappa (\nabla u,\nabla v)=(f,v)~~\forall ~v\in H_0^1(\varOmega ) \end{aligned}$$
(3.6)

with \(u(x,0)=u_0\).

Our semi-discrete FEM is: Find \(u_h(\cdot , t) \in V_{0h}\) for each \(t\in (0,T]\), such that

$$\begin{aligned} \left( \mathscr {D}^\omega _t u_h, v_h\right) +\kappa (\nabla u_h,\nabla v_h)=(f,v_h)~~\forall ~v_h\in V_{0h} \end{aligned}$$
(3.7)

with \(u_h^0=R_hu_0\).

Applying the L1 scheme (3.4) to approximate (3.7) in temporal direction, the fully discrete L1 finite element method (L1 FEM) takes the form: find \(u_h^n \in V_{0h}\) for \(n=0,1,\dots ,N\) such that

$$\begin{aligned} \left( \mathbb {D}_N^{\varvec{\alpha }} u_h^n, v_h\right) +\kappa (\nabla u_h^n,\nabla v_h)=(f^n,v_h)~~\text {with }~u_h^0=R_hu_0, \end{aligned}$$
(3.8)

where \(f^n(\cdot ) := f(\cdot , t_n)\).

By (2.4) and (2.1), the L1 FEM (3.8) can be written as: find \(u_h^n \in V_{0h}\) for \(n=0,1,\dots ,N\) such that

$$\begin{aligned} \left( \mathbb {D}_N^{\varvec{\alpha }} u_h^n,v_h\right) -\kappa (\varDelta _h u_h^n,v_h)=(P_hf^n,v_h)~~\text {with}~u_h^0=R_hu_0 \end{aligned}$$

for \(n=1,\dots ,N\). Since \(\mathbb {D}_N^{\varvec{\alpha }} u_h^n, \varDelta _h u_h^n\) and \(P_hf^n\) all belong to \(V_{0h}\), this integral formulation of the L1 FEM can be written in the differential form: find \(u_h^n \in V_{0h}\) for \(n=0,1,\dots ,N\) such that

$$\begin{aligned} \mathbb {D}_N^{\varvec{\alpha }} u_h^n -\kappa \varDelta _h u_h^n=P_hf^n ~~\text {with}~u_h^0=R_hu_0 \end{aligned}$$
(3.9)

for \(n=1,\dots ,N\).
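At each time level, (3.9) requires one linear solve with the operator \(d_{n,1}I-\kappa \varDelta _h\), with the history terms of (3.4) moved to the right-hand side. The following sketch illustrates this time marching for a single eigenmode, that is, with \(-\varDelta _h\) replaced by a scalar \(\lambda >0\). This scalar reduction, the choice \(\omega \equiv 1\), the parameter values and the function names are all assumptions for illustration and not the finite element implementation of the paper. The manufactured solution \(u(t)=t\) is used with the multi-term operator \(\mathbb {D}_t^{\varvec{\alpha }}\), so the quadrature error in \(\alpha \) is excluded and the scheme should reproduce \(t_n\) up to rounding.

```python
import math

def graded_mesh(T, N, r):
    return [T * (n / N) ** r for n in range(N + 1)]

def l1_coeff(t, n, i, alpha):
    """Single-order L1 coefficient d_{n,i}^{(alpha)} in closed form."""
    tau = t[n - i + 1] - t[n - i]
    num = (t[n] - t[n - i]) ** (1.0 - alpha) - (t[n] - t[n - i + 1]) ** (1.0 - alpha)
    return num / (tau * math.gamma(2.0 - alpha))

def dist_coeffs(t, n, nodes, weights):
    """[d_{n,1}, ..., d_{n,n}] from (3.3); weights[s] = h_alpha * omega(alpha_s)."""
    return [sum(w * l1_coeff(t, n, i, a) for a, w in zip(nodes, weights))
            for i in range(1, n + 1)]

def solve_mode(t, nodes, weights, kappa, lam, u0, f):
    """March D_N u^n + kappa * lam * u^n = f(t_n), a scalar analogue of (3.9)."""
    u = [u0]
    for n in range(1, len(t)):
        d = dist_coeffs(t, n, nodes, weights)  # d[i - 1] = d_{n,i}
        history = d[n - 1] * u[0]
        history += sum((d[i - 1] - d[i]) * u[n - i] for i in range(1, n))
        u.append((f(t[n]) + history) / (d[0] + kappa * lam))
    return u

# Illustrative data (hypothetical): omega == 1, lam = pi^2.
beta, q = 0.9, 6
h_alpha = beta / q
nodes = [(s - 0.5) * h_alpha for s in range(1, q + 1)]
weights = [h_alpha for _ in nodes]
kappa, lam = 1.0, math.pi ** 2
t = graded_mesh(1.0, 16, 2.0)

def rhs(time):
    """Right-hand side for which u(t) = t solves the multi-term scalar problem."""
    frac = sum(w * time ** (1.0 - a) / math.gamma(2.0 - a) for a, w in zip(nodes, weights))
    return frac + kappa * lam * time

u = solve_mode(t, nodes, weights, kappa, lam, 0.0, rhs)
decay = solve_mode(t, nodes, weights, kappa, lam, 1.0, lambda s: 0.0)
print(max(abs(a - b) for a, b in zip(u, t)))
```

In the homogeneous run `decay`, every value after the initial one is a positive weighted combination of earlier values divided by \(d_{n,1}+\kappa \lambda \), so it stays strictly between 0 and the initial value.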

4 \(H^1\)-norm Stability of L1 FEM

In this section, we present sharp \(H^1\)-norm stability and convergence results for the computed solution given by (3.9).

We first state the following important property of the L1 scheme.

Lemma 4.1

Let the functions \(v^j = v(\cdot , t_j)\) be in \(L^2(\varOmega )\) for \(j=0,1,\dots , N\). Then the discrete L1 scheme satisfies

$$\begin{aligned} \left( \mathbb {D}_N^{\varvec{\alpha }} v^n,v^n\right) \ge {\left( \mathbb {D}_N^{\varvec{\alpha }}\Vert v^n\Vert \right) }~ \Vert v^n\Vert \ \text { for }n=1,2,\ldots ,N. \end{aligned}$$

Proof

Let \(n\in \{1,2,\ldots ,N\}\). The definition of \(\mathbb {D}_N^{\varvec{\alpha }} v^n\) gives

$$\begin{aligned} (\mathbb {D}_N^{\varvec{\alpha }} v^n,v^n)&= d_{n,1}(v^n, v^n) -d_{n,n}(v^0, v^n) -\sum _{i=1}^{n-1} (d_{n,i}-d_{n,i+1})(v^{n-i}, v^n) \\&\ge d_{n,1}\Vert v^n\Vert ^2 -d_{n,n}\Vert v^0\Vert \, \Vert v^n\Vert -\sum _{i=1}^{n-1} (d_{n,i}-d_{n,i+1})\Vert v^{n-i}\Vert \,\Vert v^n\Vert \\&= \left( \mathbb {D}_N^{\varvec{\alpha }} \Vert v^n\Vert \right) \Vert v^n\Vert , \end{aligned}$$

where we used the Cauchy-Schwarz inequality and \(0< d_{n,i+1}<d_{n,i}\) given in (3.5). \(\square \)
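Lemma 4.1 can be checked numerically in a finite-dimensional setting. The sketch below is an illustration under assumptions: the \(L^2(\varOmega )\) inner product is replaced by the Euclidean inner product on random vectors, \(\omega \equiv 1\), and all names and parameter values are hypothetical. By linearity, \((\mathbb {D}_N^{\varvec{\alpha }}v^n,v^n)\) equals the operator (3.4) applied to the scalar sequence \((v^j,v^n)\), which lets one routine serve both sides of the inequality.

```python
import math
import random

def graded_mesh(T, N, r):
    return [T * (n / N) ** r for n in range(N + 1)]

def l1_coeff(t, n, i, alpha):
    tau = t[n - i + 1] - t[n - i]
    num = (t[n] - t[n - i]) ** (1.0 - alpha) - (t[n] - t[n - i + 1]) ** (1.0 - alpha)
    return num / (tau * math.gamma(2.0 - alpha))

def dist_coeffs(t, n, nodes, weights):
    """[d_{n,1}, ..., d_{n,n}] from (3.3)."""
    return [sum(w * l1_coeff(t, n, i, a) for a, w in zip(nodes, weights))
            for i in range(1, n + 1)]

def l1_operator(d, seq, n):
    """Apply (3.4) at level n to scalars seq[0..n]; d[i - 1] = d_{n,i}."""
    out = d[0] * seq[n] - d[n - 1] * seq[0]
    for i in range(1, n):
        out -= (d[i - 1] - d[i]) * seq[n - i]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

# Illustrative data: random vectors stand in for the functions v^j.
rng = random.Random(0)
N, dim = 10, 5
beta, q = 0.9, 6
h_alpha = beta / q
nodes = [(s - 0.5) * h_alpha for s in range(1, q + 1)]
weights = [h_alpha for _ in nodes]  # omega == 1 (hypothetical)
t = graded_mesh(1.0, N, 2.0)
v = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(N + 1)]

gaps = []
for n in range(1, N + 1):
    d = dist_coeffs(t, n, nodes, weights)
    lhs = l1_operator(d, [dot(v[j], v[n]) for j in range(n + 1)], n)
    rhs = l1_operator(d, [norm(v[j]) for j in range(n + 1)], n) * norm(v[n])
    gaps.append(lhs - rhs)  # Lemma 4.1 asserts that this is non-negative
print(min(gaps))
```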

Now we give the stability of the L1 FEM (3.9) in the following lemma.

Lemma 4.2

The solution \(u_h^n\) of the discrete problem (3.9) satisfies

$$\begin{aligned} \Vert \nabla u_h^n\Vert \le \Vert \nabla u_h^0\Vert +C_p\varGamma (1-\beta )\frac{2T^\beta }{c_0}\max _{j=1,\ldots ,n}\Vert \nabla f^j\Vert . \end{aligned}$$
(4.1)

Proof

Fix \(n\in \{1,2,\ldots , N\}\). Multiplying (3.9) by \(-\varDelta _h u_h^n\) and integrating over \(\varOmega \), one has

$$\begin{aligned} \left( \mathbb {D}_N^{\varvec{\alpha }}u_h^n, -\varDelta _hu_h^n\right) +\kappa \Vert \varDelta _h u_h^n\Vert ^2=(P_hf^n,-\varDelta _h u_h^n). \end{aligned}$$

Discarding the non-negative term \(\kappa \Vert \varDelta _h u_h^n\Vert ^2\) (recall \(\kappa >0\)) and using the definition (2.4) of \(\varDelta _h\), we get

$$\begin{aligned} \left( \mathbb {D}_N^{\varvec{\alpha }}(\nabla u_h^n),\nabla u_h^n\right) \le (\nabla P_hf^n,\nabla u_h^n). \end{aligned}$$

Now Lemma 4.1, a Cauchy-Schwarz inequality, and (2.2) yield

$$\begin{aligned} \left( \mathbb {D}_N^{\varvec{\alpha }} \Vert \nabla u_h^n\Vert \right) \Vert \nabla u_h^n\Vert \le \Vert \nabla P_h f^n\Vert \,\Vert \nabla u_h^n\Vert \le C_p\Vert \nabla f^n\Vert \,\Vert \nabla u_h^n\Vert . \end{aligned}$$
(4.2)

Invoking (4.2) and Lemmas 3.1 and 3.2, one has

$$\begin{aligned} \Vert \nabla u_h^n\Vert&\le \Vert \nabla u_h^0\Vert +\varGamma (1-\alpha _q)\max _{j=1,\ldots ,n}\left\{ \big (h_\alpha \sum _{s=1}^q \omega (\alpha _s)t_j^{-\alpha _s}\big )^{-1}\mathbb {D}_N^{\varvec{\alpha }}\Vert \nabla u_h^j\Vert \right\} \\&\le \Vert \nabla u_h^0\Vert +C_p\varGamma (1-\alpha _q)\max _{j=1,\ldots ,n}\left\{ \big (h_\alpha \sum _{s=1}^q \omega (\alpha _s)t_j^{-\alpha _s}\big )^{-1}\Vert \nabla f^j\Vert \right\} \\&\le \Vert \nabla u_h^0\Vert +C_p\varGamma (1-\beta )\frac{2}{c_0}\max _{j=1,\ldots ,n}\left\{ \max \{t_j^{\alpha _1},t_j^{\alpha _q}\}~\Vert \nabla f^j\Vert \right\} \\&\le \Vert \nabla u_h^0\Vert +C_p\varGamma (1-\beta )\frac{2T^\beta }{c_0}\max _{j=1,\ldots ,n}\Vert \nabla f^j\Vert , \end{aligned}$$

where Lemma 3.2 is used for the penultimate inequality. \(\square \)

Let \(u^n\) and \(u^n_h\) be the solutions of (1.1) and (3.9) respectively at time \(t=t_n\) for \(n = 0,1,\dots , N\). Denote \(R^n:=\mathscr {D}^\omega _t u(t_n)-\mathbb {D}_t^{\varvec{\alpha }}u(t_n)\) for \(n = 0,1,\dots , N\). To facilitate the error analysis for the standard finite element method, we use the splitting

$$\begin{aligned} u^n-u_h^n=(R_hu^n-u^n_h)-(R_hu^n-u^n)=\zeta ^n-\rho ^n, \end{aligned}$$
(4.3)

where \(\zeta ^n:=R_hu^n-u^n_h\) and \(\rho ^n:=R_hu^n-u^n\). We concentrate on the analysis of \(\zeta ^n\), because \(\rho ^n\) can be bounded immediately using (2.3). From (1.1a), (3.9), and (2.5), one has

$$\begin{aligned} \mathbb {D}_N^{\varvec{\alpha }} \zeta ^n-\kappa \varDelta _h\zeta ^n&=\left[ R_h\left( \mathbb {D}_N^{\varvec{\alpha }} u^n\right) -\kappa \varDelta _hR_hu^n\right] -\left( \mathbb {D}_N^{\varvec{\alpha }} u_h^n -\kappa \varDelta _h u_h^n\right) \\&=(R_h-P_h) \mathbb {D}_N^{\varvec{\alpha }} u^n+P_h(\mathbb {D}_N^{\varvec{\alpha }} u^n-\kappa \varDelta u^n)-P_hf^n\\&{=P_h(R_h-I) \mathbb {D}_N^{\varvec{\alpha }} u^n+P_h(\mathscr {D}^\omega _t u^n-\kappa \varDelta u^n+\mathbb {D}_N^{\varvec{\alpha }} u^n-\mathscr {D}^\omega _t u^n)-P_hf^n}\\&{=P_h(R_h-I) \mathbb {D}_N^{\varvec{\alpha }} u^n+P_h(f^n+\mathbb {D}_N^{\varvec{\alpha }} u^n-\mathbb {D}_t^{\varvec{\alpha }}u^n-R^n)-P_hf^n}\\&=P_h(R_h-I)\mathbb {D}_N^{\varvec{\alpha }} u^n+P_h(f^n-\varphi ^n-R^n)-P_hf^n\\&=P_h\mathbb {D}_N^{\varvec{\alpha }} \rho ^n-P_h(\varphi ^n+R^n), \end{aligned}$$

where \(\varphi ^n(x) :={\mathbb {D}_t^{\varvec{\alpha }}u^n-\mathbb {D}_N^{\varvec{\alpha }} u^n=\sum _{s=1}^qh_{\alpha }\omega (\alpha _s)[D_t^{\alpha _s} u(x,t_n)-D_N^{\alpha _s} u(x,t_n)]}\). Clearly

$$\begin{aligned} \mathbb {D}_N^{\varvec{\alpha }}\rho ^n&= \mathbb {D}_N^{\varvec{\alpha }}\rho ^n- \mathbb {D}_t^{\varvec{\alpha }}\rho ^n+ \mathbb {D}_t^{\varvec{\alpha }}\rho ^n-\mathscr {D}^\omega _t\rho ^n+\mathscr {D}^\omega _t\rho ^n\nonumber \\&= \left( \mathbb {D}_t^{\varvec{\alpha }} u^n- \mathbb {D}_N^{\varvec{\alpha }} u^n\right) - R_h\left( \mathbb {D}_t^{\varvec{\alpha }} u^n-\mathbb {D}_N^{\varvec{\alpha }} u^n\right) +\left( \mathscr {D}^\omega _tu^n-\mathbb {D}_t^{\varvec{\alpha }} u^n\right) \nonumber \\&\quad - R_h\left( \mathscr {D}^\omega _tu^n-\mathbb {D}_t^{\varvec{\alpha }} u^n\right) + \mathscr {D}^\omega _t\rho ^n\nonumber \\&=\varphi ^n+R^n-R_h(\varphi ^n+R^n)+\mathscr {D}^\omega _t\rho ^n. \end{aligned}$$
(4.4)

Thus

$$\begin{aligned} \mathbb {D}_N^{\varvec{\alpha }} \zeta ^n-\kappa \varDelta _h\zeta ^n =P_h\mathscr {D}^\omega _t\rho ^n -R_h(\varphi ^n+R^n)\ \text { for }n=1,2,\ldots , N. \end{aligned}$$
(4.5)

We can now prove the optimal-rate convergence of our L1 FEM (3.9) in \(L^\infty (H^1)\).

Theorem 4.1

(Error estimate for the L1 FEM) Let \(u^n\) and \(u_h^n\) be the solutions of (1.1) and (3.9), respectively. Assume the hypotheses of (1.3) with \(k+1\le p\), \(\omega (\alpha )\in C^2[0,\beta ]\), and \(D_t^\alpha u(\cdot ,\cdot )\in C^2[0,\beta ]\). Then for \(n=1,2,\dots ,N\), there exists a constant C such that

$$\begin{aligned} \Vert \nabla u^n-\nabla u^n_h\Vert \le C \varGamma (1-\beta )\left( h^k+h_\alpha ^2+N^{-\min \{r\sigma , 2-\alpha _q\}}\right) . \end{aligned}$$
(4.6)

Proof

Fix \(n\in \{1,2,\ldots , N\}\). Multiplying (4.5) by \(-\varDelta _h \zeta ^n\) and integrating over \(\varOmega \), one has

$$\begin{aligned} -\left( \mathbb {D}_N^{\varvec{\alpha }} \zeta ^n, \varDelta _h \zeta ^n\right) +\kappa \Vert \varDelta _h \zeta ^n\Vert ^2=-\left( P_h\mathscr {D}^\omega _t\rho ^n,\varDelta _h \zeta ^n\right) +(R_h(\varphi ^n+R^n),\varDelta _h \zeta ^n). \end{aligned}$$
(4.7)

Recalling the definition (2.4) of \(\varDelta _h\) and the projection \(P_h\), we get

$$\begin{aligned} \left( \mathbb {D}_N^{\varvec{\alpha }} (\nabla \zeta ^n), \nabla \zeta ^n\right) +\kappa \Vert \varDelta _h \zeta ^n\Vert ^2 =\left( \nabla P_h\mathscr {D}^\omega _t\rho ^n,\nabla \zeta ^n\right) -\left( \nabla R_h(\varphi ^n+R^n), \nabla \zeta ^n\right) . \end{aligned}$$

By Lemma 4.1 and a Cauchy-Schwarz inequality, this gives

$$\begin{aligned} {\left( \mathbb {D}_N^{\varvec{\alpha }} \Vert \nabla \zeta ^n\Vert \right) }~\Vert \nabla \zeta ^n\Vert \le \Vert \nabla P_h\mathscr {D}^\omega _t\rho ^n\Vert ~\Vert \nabla \zeta ^n\Vert +\Vert \nabla R_h(\varphi ^n+R^n)\Vert ~\Vert \nabla \zeta ^n\Vert . \end{aligned}$$

Thus

$$\begin{aligned} \mathbb {D}_N^{\varvec{\alpha }} \Vert \nabla \zeta ^n\Vert \le C_p\Vert \nabla \mathscr {D}^\omega _t\rho ^n\Vert +\Vert \nabla (\varphi ^n+R^n)\Vert . \end{aligned}$$
(4.8)

Invoking Lemma 3.1 and (4.8), one has

$$\begin{aligned}&\Vert \nabla \zeta ^n\Vert \\&\quad \le \Vert \nabla \zeta ^0\Vert +\varGamma (1-\alpha _q)\max _{j=1,\ldots ,n}\left\{ \big (h_\alpha \sum _{s=1}^q \omega (\alpha _s)t_j^{-\alpha _s}\big )^{-1}\mathbb {D}_N^{\varvec{\alpha }}\Vert \nabla \zeta ^j\Vert \right\} \\&\quad \le \varGamma (1-\alpha _q)\max _{j=1,\ldots ,n}\left\{ \big (h_\alpha \sum _{s=1}^q \omega (\alpha _s)t_j^{-\alpha _s}\big )^{-1}\left[ C_p\Vert \nabla \mathscr {D}^\omega _t\rho ^j\Vert +\Vert \nabla (\varphi ^j+R^j)\Vert \right] \right\} \\&\quad \le C\varGamma (1-\alpha _q)\max _{j=1,\ldots ,n}\left\{ \big (h_\alpha \sum _{s=1}^q \omega (\alpha _s)t_j^{-\alpha _s}\big )^{-1}\left[ h^k+h_\alpha ^2+h_\alpha \sum _{s=1}^q \omega (\alpha _s)t_j^{-\alpha _s} N^{-\min \{r\sigma , 2-\alpha _s\}}\right] \right\} \\&\quad \le C\varGamma (1-\beta )\max _{j=1,\ldots ,n}\left\{ \max \{t_j^{\alpha _1},t_j^{\alpha _q}\}(h^k+h_\alpha ^2)\right\} + C\varGamma (1-\beta )N^{-\min \{r\sigma , 2-\alpha _q\}}\\&\quad \le C \varGamma (1-\beta )\left( h^k+h_\alpha ^2+N^{-\min \{r\sigma , 2-\alpha _q\}}\right) , \end{aligned}$$

where Lemmas 3.2 and 3.3, (2.3), and \(\Vert \nabla \zeta ^0\Vert =\Vert \nabla (R_h u^0-u_h^0)\Vert =0\) are used. Combining this bound and (2.3) with (4.3), we get (4.6). \(\square \)

Remark 4.1

The orders of convergence displayed in Theorem 4.1 indicate that the rates of convergence in space, distributed variable, and time are \(h^k\), \(h_\alpha ^2\), and \(N^{-\min \{r\sigma , 2-\alpha _q\}}\), respectively.

The error bound obtained in Theorem 4.1 clearly blows up as \(\beta \rightarrow 1^-\), because of the factor \(\varGamma (1-\beta )\). This phenomenon also appears in [32]. In the next section we improve this result by making it \(\beta \)-robust.

5 \(\beta \)-robust \(H^1\)-norm Error Analysis of the L1 FEM

In this section, we present a \(\beta \)-robust discrete Grönwall inequality, which is an improvement of [18, Lemma 8]. Applying this new discrete Grönwall inequality, we obtain a \(\beta \)-robust \(H^1\)-norm error estimate for the computed solution. Based on this result, a superconvergence result follows immediately.

As in [39, (4.6)], define the positive real numbers \(\theta _{n,j}\), for \(n=1,2,\dots , N\) and \(j=1,2,\dots , n-1\), by

$$\begin{aligned} \theta _{n,n}=1,\quad \theta _{n,j}=\sum _{k=1}^{n-j}\frac{1}{d_{n-k,1}} (d_{n,k}-d_{n,k+1})\theta _{n-k,j}, \end{aligned}$$
(5.1)

where \(d_{n,k}\) is defined in (3.3). Observe that (3.5) implies \(\theta _{n,j}>0\) for all n and j.

Lemma 5.1

[10, Lemma 5.1] For \(n=1,2,\ldots , N\) and \(1\le k\le n\), one has

$$\begin{aligned} \sum _{j=k}^nd_{j,j+1-k}\theta _{n,j}=d_{n,1}. \end{aligned}$$
(5.2)
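The recursion (5.1) and the identity (5.2) are straightforward to check numerically. The sketch below is an illustration only: it builds \(\theta _{n,j}\) from the coefficients (3.3) with the hypothetical weight \(\omega \equiv 1\) and illustrative parameter values, and then measures the largest relative deviation in (5.2).

```python
import math

def graded_mesh(T, N, r):
    return [T * (n / N) ** r for n in range(N + 1)]

def l1_coeff(t, n, i, alpha):
    tau = t[n - i + 1] - t[n - i]
    num = (t[n] - t[n - i]) ** (1.0 - alpha) - (t[n] - t[n - i + 1]) ** (1.0 - alpha)
    return num / (tau * math.gamma(2.0 - alpha))

def dist_coeffs(t, n, nodes, weights):
    """[d_{n,1}, ..., d_{n,n}] from (3.3)."""
    return [sum(w * l1_coeff(t, n, i, a) for a, w in zip(nodes, weights))
            for i in range(1, n + 1)]

def theta_table(d, N):
    """theta[n][j] from (5.1); d[n][i] = d_{n,i} with 1-based indices."""
    theta = [[0.0] * (N + 1) for _ in range(N + 1)]
    for n in range(1, N + 1):
        theta[n][n] = 1.0
        for j in range(1, n):
            # Only rows n - k < n are needed, and these are already filled.
            theta[n][j] = sum(
                (d[n][k] - d[n][k + 1]) / d[n - k][1] * theta[n - k][j]
                for k in range(1, n - j + 1)
            )
    return theta

# Illustrative data (hypothetical): omega == 1.
N = 10
beta, q = 0.9, 6
h_alpha = beta / q
nodes = [(s - 0.5) * h_alpha for s in range(1, q + 1)]
weights = [h_alpha for _ in nodes]
t = graded_mesh(1.0, N, 2.0)

# Pad with None so that d[n][i] matches the 1-based notation d_{n,i}.
d = [None] + [[None] + dist_coeffs(t, n, nodes, weights) for n in range(1, N + 1)]
theta = theta_table(d, N)

# Largest relative deviation in (5.2) over all 1 <= k <= n <= N.
worst = max(
    abs(sum(d[j][j + 1 - k] * theta[n][j] for j in range(k, n + 1)) / d[n][1] - 1.0)
    for n in range(1, N + 1)
    for k in range(1, n + 1)
)
print(worst)
```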

Lemma 5.2

Let \(\gamma \in (0,1)\) be a constant. Then for \(n=1,2,\ldots , N\), one has

$$\begin{aligned} \sum _{j=1}^n \left( \sum _{s=1}^qh_\alpha \omega (\alpha _s)\frac{\varGamma (1+\gamma )}{\varGamma (1+\gamma -\alpha _s)}t_j^{\gamma -\alpha _s}\right) \theta _{n,j}\le d_{n,1} t_n^{\gamma }. \end{aligned}$$
(5.3)

Proof

Imitating [10, Lemma 5.3], one has

$$\begin{aligned} \frac{\varGamma (1+\gamma )}{\varGamma (1+\gamma -\alpha _s)}t_j^{\gamma -\alpha _s}\le \sum _{k=1}^j d_{j,j+1-k}^{(\alpha _s)}\left( t_k^{\gamma }-t_{k-1}^{\gamma }\right) . \end{aligned}$$

Thus

$$\begin{aligned}&\sum _{s=1}^q h_\alpha \omega (\alpha _s)\frac{\varGamma (1+\gamma )}{\varGamma (1+\gamma -\alpha _s)}t_j^{\gamma -\alpha _s}\\&\quad \le \sum _{s=1}^qh_\alpha \omega (\alpha _s)\sum _{k=1}^j d_{j,j+1-k}^{(\alpha _s)}\left( t_k^{\gamma }-t_{k-1}^{\gamma }\right) \\&\quad =\sum _{k=1}^j\sum _{s=1}^q h_\alpha \omega (\alpha _s)d_{j,j+1-k}^{(\alpha _s)}\left( t_k^{\gamma }-t_{k-1}^{\gamma }\right) \\&\quad =\sum _{k=1}^j d_{j,j+1-k}\left( t_k^{\gamma }-t_{k-1}^{\gamma }\right) . \end{aligned}$$

Multiply this inequality by \(\theta _{n,j}\) then sum from \(j=1\) to n. This yields

$$\begin{aligned}&\sum _{j=1}^n\left( \sum _{s=1}^q h_\alpha \omega (\alpha _s)\frac{\varGamma (1+\gamma )}{\varGamma (1+\gamma -\alpha _s)}t_j^{\gamma -\alpha _s}\right) \theta _{n,j}\\&\quad \le \sum _{j=1}^n\theta _{n,j}\sum _{k=1}^j d_{j,j+1-k}\left( t_k^{\gamma }-t_{k-1}^{\gamma }\right) \\&\quad =\sum _{k=1}^n\left( t_k^{\gamma }-t_{k-1}^{\gamma }\right) \left( \sum _{j=k}^n\theta _{n,j}d_{j,j+1-k}\right) \\&\quad =d_{n,1}\sum _{k=1}^n\left( t_k^{\gamma }-t_{k-1}^{\gamma }\right) \\&\quad =d_{n,1}t_n^{\gamma }, \end{aligned}$$

by changing the order of summation then invoking Lemma 5.1. \(\square \)

Corollary 5.1

Setting \(l_N=1/\ln N\), one has

$$\begin{aligned} \frac{1}{d_{n,1}}\sum _{j=1}^n \left( \sum _{s=1}^qh_\alpha \omega (\alpha _s)t_j^{-\alpha _s}\right) \theta _{n,j}\le \frac{e^r\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{\varGamma (1+l_N)}. \end{aligned}$$

Proof

Applying \(1\le j^{rl_N}\) and \(N^{rl_N}=e^r\), one has

$$\begin{aligned}&\frac{1}{d_{n,1}}\sum _{j=1}^n \left( \sum _{s=1}^qh_\alpha \omega (\alpha _s)t_j^{-\alpha _s}\right) \theta _{n,j}\\&\quad \le \frac{1}{d_{n,1}}\sum _{j=1}^n \theta _{n,j}\left( \sum _{s=1}^qh_\alpha \omega (\alpha _s)t_j^{-\alpha _s}j^{rl_N}\right) \\&\quad =\frac{N^{rl_N}}{T^{l_N}} \frac{1}{d_{n,1}}\sum _{j=1}^n \theta _{n,j}\left( \sum _{s=1}^qh_\alpha \omega (\alpha _s)t_j^{l_N-\alpha _s}\right) \\&\quad \le \frac{e^r\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{T^{l_N}\varGamma (1+l_N)} \frac{1}{d_{n,1}}\sum _{j=1}^n \theta _{n,j}\left( \sum _{s=1}^qh_\alpha \omega (\alpha _s)\frac{\varGamma (1+l_N)}{\varGamma (1+l_N-\alpha _s)}t_j^{l_N-\alpha _s}\right) . \end{aligned}$$

Choosing \(\gamma =l_N\) in Lemma 5.2 yields

$$\begin{aligned}&\frac{1}{d_{n,1}}\sum _{j=1}^n \left( \sum _{s=1}^qh_\alpha \omega (\alpha _s)t_j^{-\alpha _s}\right) \theta _{n,j}\\&\quad \le \frac{e^r\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{T^{l_N}\varGamma (1+l_N)}t_n^{l_N}\le \frac{e^r\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{\varGamma (1+l_N)}, \end{aligned}$$

where \(t_n^{l_N}\le T^{l_N}\) is used. Thus the result follows. \(\square \)
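The factor on the right-hand side of Corollary 5.1 remains bounded as \(\beta \rightarrow 1^-\), in contrast to the factor \(\varGamma (1-\beta )\) in Theorem 4.1. Indeed, since \(\alpha _s<1\) and \(\varGamma \) is decreasing on \((0,1+l_N]\) when \(N\ge 9\), one has \(\varGamma (1+l_N-\alpha _s)\le \varGamma (l_N)\), and \(\varGamma (l_N)/\varGamma (1+l_N)=1/l_N=\ln N\), so the factor is at most \(e^r\ln N\). The following sketch compares the two quantities; the values of N, r, q and \(\beta \) and the function name are illustrative assumptions.

```python
import math

def robust_factor(beta, q, N, r):
    """e^r * max_s Gamma(1 + l_N - alpha_s) / Gamma(1 + l_N), with l_N = 1 / ln N."""
    l_N = 1.0 / math.log(N)
    h_alpha = beta / q
    top = max(math.gamma(1.0 + l_N - (s - 0.5) * h_alpha) for s in range(1, q + 1))
    return math.exp(r) * top / math.gamma(1.0 + l_N)

# Illustrative parameters (not from the paper).
N, r, q = 1000, 2.0, 50
betas = (0.9, 0.99, 0.999, 1.0)

table = []
for beta in betas:
    # Gamma(1 - beta) is the factor of Theorem 4.1; it is unbounded as beta -> 1.
    blow_up = math.gamma(1.0 - beta) if beta < 1.0 else float("inf")
    table.append((beta, blow_up, robust_factor(beta, q, N, r)))

for row in table:
    print(row)

upper = math.exp(r) * math.log(N)  # the bound e^r * ln N discussed above
```

The price of robustness is therefore at most a logarithmic factor in N.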

Corollary 5.2

Setting \(l_N=1/\ln N\), one has

$$\begin{aligned} \frac{1}{d_{n,1}}\sum _{j=1}^n\theta _{n,j}\le \frac{e^r\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{\varGamma (1+l_N)}\frac{2}{c_0}\max \{t_n^{\alpha _1}, t_n^{\alpha _q}\}. \end{aligned}$$

Proof

From Lemma 3.2 we have

$$\begin{aligned}&\frac{1}{d_{n,1}}\sum _{j=1}^n \left( \sum _{s=1}^qh_\alpha \omega (\alpha _s)t_j^{-\alpha _s}\right) \theta _{n,j}\\&\quad \ge \frac{1}{d_{n,1}}\sum _{j=1}^n \frac{c_0}{2}\min \{t_j^{-\alpha _1}, t_j^{-\alpha _q}\}\theta _{n,j} \nonumber \\&\quad \ge \frac{c_0}{2}\min \{t_n^{-\alpha _1}, t_n^{-\alpha _q}\}\frac{1}{d_{n,1}}\sum _{j=1}^n\theta _{n,j}. \end{aligned}$$

Thus the result follows from Corollary 5.1 and \(\frac{1}{\min \{t_n^{-\alpha _1}, t_n^{-\alpha _q}\}}\le \max \{t_n^{\alpha _1}, t_n^{\alpha _q}\}\). \(\square \)

Next we will prove a new nonstandard Grönwall inequality, which is an improvement of [18, Lemma 8].

Lemma 5.3

Assume that the sequences \(\{\xi ^n\}_{n=1}^\infty , \{\eta ^n\}_{n=1}^\infty \) are nonnegative and the grid function \(\{\,v^n : \, n=0,1,\dots , N\}\) satisfies \(v^0 \ge 0\) and

$$\begin{aligned} \left( \mathbb {D}_N^{\varvec{\alpha }} v^n\right) v^n\le \xi ^n v^n+(\eta ^n)^2 \ \text { for } \ n=1,2,\dots ,N. \end{aligned}$$
(5.4)

Then

$$\begin{aligned} v^n\le v^0+\frac{1}{d_{n,1}}\sum _{j=1}^n \theta _{n,j}(\xi ^j+\eta ^j)+\max _{1\le j\le n}\left\{ \eta ^j\right\} \ \text { for }\ n=1,2,\dots ,N. \end{aligned}$$
(5.5)

Proof

For \(n=1,\dots , N\), set

$$\begin{aligned} \eta ^n_*:= \max _{1\le j\le n}\left\{ \eta ^j\right\} , \quad g^n:=\xi ^n +\eta ^n. \end{aligned}$$

Our proof uses induction on n. First consider the case \(n=1\). If \(v^1\le \eta ^1_*\), then as \(v^0\), \(\theta _{n,j}\), \(\xi ^j\) and \(\eta ^j\) are all non-negative, the result for \(n=1\) follows immediately. Otherwise \(v^1> \eta ^1_*\), which implies \(v^1> \eta ^1 \ge 0\). Dividing the inequality (5.4) with \(n=1\) by \(v^1\) and using \((\eta ^1)^2/v^1\le \eta ^1\) then gives us

$$\begin{aligned} \mathbb {D}_N^{\varvec{\alpha }} v^1 \le \xi ^1 +\eta ^1=g^1, \end{aligned}$$

i.e.,

$$\begin{aligned} d_{1,1}v^1-d_{1,1}v^0\le g^1 \end{aligned}$$

by (3.4). Rearranging this inequality then invoking (5.1), one has

$$\begin{aligned} v^1\le v^0+\frac{1}{d_{1,1}}g^1&=v^0+\frac{1}{d_{1,1}}\sum _{j=1}^1\theta _{1,1}g^1\\&\le v^0+\frac{1}{d_{1,1}}\sum _{j=1}^1\theta _{1,1}g^1+\eta ^1_*. \end{aligned}$$

Thus the result is true for \(n=1\).

Fix \(k\in \{2,\ldots ,N\}\). Assume that (5.5) is valid for \(n=1,2,\ldots ,k-1\). If \(v^k\le \eta ^k_*\), then as \(v^0\), \(\theta _{n,j}\), \(\xi ^j\) and \(\eta ^j\) are all non-negative, the result for \(n=k\) follows immediately. Otherwise \(v^k> \eta ^k_*\), which implies \(v^k> \eta ^k \ge 0\). Dividing the inequality (5.4) with \(n=k\) by \(v^k\) and using \((\eta ^k)^2/v^k\le \eta ^k\) then gives us

$$\begin{aligned} \mathbb {D}_N^{\varvec{\alpha }} v^k \le \xi ^k +\eta ^k=g^k, \end{aligned}$$

i.e.,

$$\begin{aligned} d_{k,1}v^k-d _{k,k}v^0+\sum _{j=1}^{k-1}(d_{k,j+1}-d_{k,j})v^{k-j}\le g^k, \end{aligned}$$

by (3.4). This is equivalent to

$$\begin{aligned} v^k \le \frac{1}{d _{k,1}}\left[ g^k +d _{k,k}v^0+\sum _{j=1}^{k-1}(d_{k,j}-d_{k,j+1})v^{k-j}\right] . \end{aligned}$$

Combining this inequality with the inductive hypothesis yields

$$\begin{aligned} v^k&\le \frac{1}{d _{k,1}}\left\{ g^k +d _{k,k}v^0 +\sum _{j=1}^{k-1}(d_{k,j}-d_{k,j+1})\left[ v^0+\frac{1}{d _{k-j,1}}\sum _{s=1}^{k-j}\theta _{k-j,s}g ^s+\eta ^{k-j}_*\right] \right\} \nonumber \\&\le \frac{1}{d_{k,1}}\left\{ g^k +d _{k,k}v^0+\sum _{j=1}^{k-1}(d_{k,j}-d_{k,j+1})\bigg [v^0+\frac{1}{d _{k-j,1}}\sum _{s=1}^{k-j}\theta _{k-j,s}g ^s \bigg ]\right\} \nonumber \\&\quad + \frac{1}{d _{k,1}}\sum _{j=1}^{k-1}(d_{k,j}-d_{k,j+1}) \eta ^{k-j}_*\nonumber \\&\le \frac{1}{d _{k,1}}\left\{ g^k +d _{k,1}v^0+\sum _{j=1}^{k-1}\bigg [(d_{k,j}-d_{k,j+1})\frac{1}{d _{k-j,1}} \sum _{s=1}^{k-j}\theta _{k-j,s}g ^s \bigg ]\right\} \nonumber \\&\quad +\frac{1}{d _{k,1}}(d_{k,1}-d_{k,k})\eta ^{k-1 }_*\nonumber \\&\le \frac{1}{d _{k,1}}\left\{ g^k +d _{k,1}v^0+\sum _{s=1}^{k-1}g ^s\bigg [\sum _{j=1}^{k-s}\frac{1}{d _{k-j,1}}(d_{k,j}-d_{k,j+1})\theta _{k-j,s} \bigg ]\right\} \nonumber \\&\quad +\eta ^{k}_*\nonumber \\&= v^0+\frac{1}{d _{k,1}} \sum _{s=1}^k\theta _{k,s}g^s+\eta ^{k}_*, \end{aligned}$$

where we used the relationship (5.1), \(d_{k,k}>0\), and the fact that \(\eta ^n_*\) is nondecreasing in n. By the principle of induction, the lemma is proved. \(\square \)
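The induction above can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes a single fractional order \(\alpha \), the standard L1 coefficients \(d_{n,j}\) on a graded mesh \(t_n=T(n/N)^r\), and the recursion \(\theta _{k,k}=1\), \(\theta _{k,s}=\sum _{j=1}^{k-s}(d_{k,j}-d_{k,j+1})\theta _{k-j,s}/d_{k-j,1}\) read off from the last step of the display (presumably this is relationship (5.1)). All function names are hypothetical. When \(\mathbb {D}_N v^k=g^k\) holds with equality for every k, each step of the induction is an equality, so \(v^k=v^0+\frac{1}{d_{k,1}}\sum _{s=1}^k\theta _{k,s}g^s\) exactly.

```python
import math
import numpy as np

def l1_coeffs(alpha, t):
    """L1 weights d[n, j], 1 <= j <= n (row/column 0 unused); assumed standard form."""
    N = len(t) - 1
    d = np.zeros((N + 1, N + 1))
    for n in range(1, N + 1):
        for j in range(1, n + 1):
            tau = t[n - j + 1] - t[n - j]
            d[n, j] = ((t[n] - t[n - j]) ** (1 - alpha)
                       - (t[n] - t[n - j + 1]) ** (1 - alpha)) / (math.gamma(2 - alpha) * tau)
    return d

def theta_weights(d):
    """theta[k, s] from the recursion read off from the induction step (assumption)."""
    N = d.shape[0] - 1
    theta = np.zeros((N + 1, N + 1))
    for k in range(1, N + 1):
        theta[k, k] = 1.0
        for s in range(1, k):
            theta[k, s] = sum((d[k, j] - d[k, j + 1]) / d[k - j, 1] * theta[k - j, s]
                              for j in range(1, k - s + 1))
    return theta

def solve_with_equality(d, g, v0):
    """Sequence v^k defined by D_N v^k = g^k, written in the rearranged form of the proof."""
    N = len(g) - 1
    v = np.zeros(N + 1)
    v[0] = v0
    for k in range(1, N + 1):
        history = sum((d[k, j] - d[k, j + 1]) * v[k - j] for j in range(1, k))
        v[k] = (g[k] + d[k, k] * v0 + history) / d[k, 1]
    return v

def lemma_rhs(d, theta, g, v0):
    """v^0 + (1/d_{k,1}) * sum_s theta_{k,s} g^s for k = 1..N (entry 0 is v^0)."""
    N = len(g) - 1
    rhs = np.full(N + 1, float(v0))
    for k in range(1, N + 1):
        rhs[k] += np.dot(theta[k, 1:k + 1], g[1:k + 1]) / d[k, 1]
    return rhs

N, T, r, alpha = 16, 1.0, 2.0, 0.6
t = T * (np.arange(N + 1) / N) ** r                      # graded mesh (assumption)
d = l1_coeffs(alpha, t)
theta = theta_weights(d)
g = np.concatenate(([0.0], 1.0 + np.cos(np.arange(1, N + 1))))  # arbitrary non-negative data
v = solve_with_equality(d, g, v0=0.5)
print(np.max(np.abs(v - lemma_rhs(d, theta, g, 0.5))))   # difference at rounding level
```

The monotonicity \(d_{k,j}\ge d_{k,j+1}\), which keeps every \(\theta _{k,s}\) non-negative, can be inspected on the same arrays.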

We now use Lemma 5.3 to derive a \(\beta \)-robust stability result for the fully discrete L1 FEM.

Theorem 5.1

The solution \(u_h^n\) of the discrete problem (3.9) satisfies

$$\begin{aligned} \Vert \nabla u_h^n\Vert \le \Vert \nabla u_h^0\Vert +\frac{C\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{\varGamma (1+l_N)}\max _{1\le j\le n}\left\{ \Vert f^j\Vert \right\} , \end{aligned}$$
(5.6)

where \(l_N\) is defined in Corollary 5.2.

Proof

Fix \(n\in \{1,2,\ldots , N\}\). Multiplying (3.9) by \(-\varDelta _h u_h^n\) and integrating over \(\varOmega \), one has

$$\begin{aligned} -\left( \mathbb {D}_N^{\varvec{\alpha }}u_h^n, \varDelta _hu_h^n\right) +\kappa \Vert \varDelta _h u_h^n\Vert ^2=-(P_hf^n,\varDelta _h u_h^n)\le \frac{1}{4\kappa }\Vert f^n\Vert ^2+\kappa \Vert \varDelta _h u_h^n\Vert ^2. \end{aligned}$$

Cancel the term \(\kappa \Vert \varDelta _h u_h^n\Vert ^2\), which appears on both sides, then use the definition (2.4) of \(\varDelta _h\) to get

$$\begin{aligned} \left( \mathbb {D}_N^{\varvec{\alpha }}(\nabla u_h^n),\nabla u_h^n\right) \le \frac{1}{4\kappa }\Vert f^n\Vert ^2. \end{aligned}$$
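In more detail, the two steps rest on Young's inequality \(ab\le \frac{1}{4\kappa }a^2+\kappa b^2\) and on discrete integration by parts. The following worked lines are a sketch under the assumptions that \(P_h\) is the \(L^2\) projection onto the finite element space and that (2.4) is the usual definition \((\varDelta _h v,w)=-(\nabla v,\nabla w)\) for finite element functions v and w:

$$\begin{aligned} -(P_hf^n,\varDelta _h u_h^n)&=-(f^n,\varDelta _h u_h^n)\le \Vert f^n\Vert \,\Vert \varDelta _h u_h^n\Vert \le \frac{1}{4\kappa }\Vert f^n\Vert ^2+\kappa \Vert \varDelta _h u_h^n\Vert ^2,\\ -\left( \mathbb {D}_N^{\varvec{\alpha }}u_h^n, \varDelta _hu_h^n\right)&=\left( \nabla \mathbb {D}_N^{\varvec{\alpha }}u_h^n, \nabla u_h^n\right) =\left( \mathbb {D}_N^{\varvec{\alpha }}(\nabla u_h^n), \nabla u_h^n\right) . \end{aligned}$$

The last equality holds because \(\mathbb {D}_N^{\varvec{\alpha }}\) is a linear combination of the time levels \(u_h^0,\ldots ,u_h^n\) and therefore commutes with \(\nabla \).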

Applying Lemma 4.1 yields

$$\begin{aligned} (\mathbb {D}_N^{\varvec{\alpha }}\Vert \nabla u_h^n\Vert )~\Vert \nabla u_h^n\Vert \le \frac{1}{4\kappa }\Vert f^n\Vert ^2. \end{aligned}$$
(5.7)

By Lemma 5.3, one has

$$\begin{aligned} \Vert \nabla u_h^n\Vert \le \Vert \nabla u_h^0\Vert +\frac{1}{2\sqrt{\kappa }}\frac{1}{d_{n,1}}\sum _{j=1}^n \theta _{n,j}\Vert f^j\Vert +\frac{1}{2\sqrt{\kappa }}\max _{1\le j\le n}\left\{ \Vert f^j\Vert \right\} . \end{aligned}$$

Applying Corollary 5.2, we obtain (5.6), which proves the theorem. \(\square \)

We now prove the main result of the paper, which establishes convergence of our method in \(L^\infty (H^1)\) with an optimal and \(\beta \)-robust convergence rate.

Theorem 5.2

(Error estimate) Let \(u^n\) and \(u_h^n\) be the solutions of (1.1) and (3.9), respectively. Assume that the hypotheses (1.3) hold with \(k+1\le p\), \(\omega (\alpha )\in C^2[0,\beta ]\), and \(D_t^\alpha u(\cdot ,\cdot )\in C^2[0,\beta ]\). Then for \(n=1,2,\dots ,N\), one has

$$\begin{aligned} \Vert \nabla u^n-\nabla u^n_h\Vert&\le \frac{C\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{\varGamma (1+l_N)}\left( N^{-\min \{2-\alpha _q,r\sigma \}}+h^{k}+h_\alpha ^2\right) , \end{aligned}$$
(5.8)
$$\begin{aligned} \Vert \nabla R_hu^n-\nabla u^n_h\Vert&\le \frac{C\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{\varGamma (1+l_N)}\left( N^{-\min \{2-\alpha _q,r\sigma \}}+h^{k+1}+h_\alpha ^2\right) . \end{aligned}$$
(5.9)

where \(l_N=1/\ln N\) and C is a constant independent of h and N.

Proof

Fix \(n\in \{1,2,\ldots , N\}\). Multiplying (4.5) by \(-\varDelta _h \zeta ^n\) and integrating over \(\varOmega \), we arrive at

$$\begin{aligned} -\left( \mathbb {D}_N^{\varvec{\alpha }} \zeta ^n, \varDelta _h \zeta ^n\right) +\kappa \Vert \varDelta _h \zeta ^n\Vert ^2=-\left( P_h\mathscr {D}^\omega _t\rho ^n-R_h R^n,\varDelta _h \zeta ^n\right) +(R_h\varphi ^n,\varDelta _h \zeta ^n).\nonumber \\ \end{aligned}$$
(5.10)

Recalling the definition (2.4) of \(\varDelta _h\) and the projection \(P_h\) yields

$$\begin{aligned} \left( \mathbb {D}_N^{\varvec{\alpha }} (\nabla \zeta ^n), \nabla \zeta ^n\right) +\kappa \Vert \varDelta _h \zeta ^n\Vert ^2 =\left( -\mathscr {D}^\omega _t\rho ^n+R_hR^n,\varDelta _h\zeta ^n\right) -\left( \nabla R_h\varphi ^n, \nabla \zeta ^n\right) . \end{aligned}$$

Applying Lemma 4.1 and the Cauchy-Schwarz inequality, one has

$$\begin{aligned} (\mathbb {D}_N^{\varvec{\alpha }} \Vert \nabla \zeta ^n\Vert )~\Vert \nabla \zeta ^n\Vert \le \frac{1}{4\kappa }\Vert \mathscr {D}^\omega _t\rho ^n-R_hR^n\Vert ^2+\Vert \nabla R_h\varphi ^n\Vert ~ \Vert \nabla \zeta ^n\Vert . \end{aligned}$$

Invoking (2.2) and (2.3), we get

$$\begin{aligned} \left( \mathbb {D}_N^{\varvec{\alpha }} \Vert \nabla \zeta ^n\Vert \right) ~ \Vert \nabla \zeta ^n\Vert&\le \frac{1}{4\kappa }(\Vert \mathscr {D}^\omega _t\rho ^n\Vert +\Vert R_hR^n\Vert )^2+\Vert \nabla R_h\varphi ^n\Vert ~ \Vert \nabla \zeta ^n\Vert \nonumber \\&\le Ch^{2(k+1)}+C\Vert \nabla R_hR^n\Vert ^2+\Vert \nabla R_h\varphi ^n\Vert ~ \Vert \nabla \zeta ^n\Vert \nonumber \\&\le C(h^{2(k+1)}+h_\alpha ^4)+\Vert \nabla \varphi ^n\Vert ~ \Vert \nabla \zeta ^n\Vert , \end{aligned}$$
(5.11)

where the inequality \(\Vert \nabla R_h w\Vert \le \Vert \nabla w\Vert \,\forall w\in H_0^1(\varOmega )\) is used. Observe that (5.11) is a particular case of (5.4). Thus we can invoke Lemma 5.3 to get

$$\begin{aligned} \Vert \nabla \zeta ^n\Vert \le \Vert \nabla \zeta ^0\Vert +\frac{C}{d _{n,1}}\sum _{j=1}^{n}\theta _{n,j}\left( \Vert \nabla \varphi ^j\Vert +h^{k+1}+h_\alpha ^2\right) +C(h^{k+1}+h_\alpha ^2). \end{aligned}$$
(5.12)

By Lemma 3.3, we get \(\Vert \nabla \varphi ^j\Vert \le C\sum _{s=1}^q h_\alpha \omega (\alpha _s)t_j^{-\alpha _s} N^{-\min \{2-\alpha _s,r\sigma \}}\). Substituting this inequality into (5.12) and invoking Corollaries 5.1 and 5.2 yields

$$\begin{aligned} \Vert \nabla \zeta ^n\Vert&\le \Vert \nabla \zeta ^0\Vert +\frac{C}{ d _{n,1}}\sum _{j=1}^{n}\left( \sum _{s=1}^q h_\alpha \omega (\alpha _s)t_j^{-\alpha _s} N^{-\min \{2-\alpha _s,r\sigma \}}\right) \theta _{n,j}\\&\quad +\frac{C}{d _{n,1}}\sum _{j=1}^{n}\theta _{n,j}\left( h^{k+1}+h_\alpha ^2\right) +C(h^{k+1}+h_\alpha ^2)\\&\le CN^{-\min \{2-\alpha _q,r\sigma \}}\frac{1}{ d _{n,1}}\sum _{j=1}^{n}\left( \sum _{s=1}^q h_\alpha \omega (\alpha _s)t_j^{-\alpha _s} \right) \theta _{n,j}\\&\quad +\frac{C}{d _{n,1}}\sum _{j=1}^{n}\theta _{n,j}\left( h^{k+1}+h_\alpha ^2\right) +C(h^{k+1}+h_\alpha ^2)\\&\le \frac{e^r\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{\varGamma (1+l_N)}C\left( N^{-\min \{2-\alpha _q,r\sigma \}}+h^{k+1}+h_\alpha ^2\right) , \end{aligned}$$

where we used \(\Vert \nabla \zeta ^0\Vert =\Vert \nabla (R_h u^0-u_h^0)\Vert =0\). This is the bound (5.9). Combining it with (2.3) and (4.3), we get (5.8). \(\square \)

Remark 5.1

No blow-up appears in the error estimate given in Theorem 5.2 as \(\beta \rightarrow 1^-\), unlike the bounds in Theorem 4.1 and [32, Theorem 3.1].
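The remark can be illustrated numerically. The sketch below evaluates the factor \(\max _s\varGamma (1+l_N-\alpha _s)/\varGamma (1+l_N)\) with \(l_N=1/\ln N\), assuming only that the nodes \(\alpha _s\) lie in \((0,1]\). Since \(\varGamma (1+l_N)=l_N\varGamma (l_N)\), the factor equals \(\ln N\) at \(\alpha _s=1\) and is smaller for \(\alpha _s<1\) once \(N\ge 9\). For comparison only, the script also prints \(\varGamma (1-\alpha )\), which is unbounded as \(\alpha \rightarrow 1^-\). This is a hypothetical stand-in for a non-robust constant, and it is not claimed that the constant of Theorem 4.1 has exactly this form. The function names are illustrative.

```python
import math

def robust_factor(alphas, N):
    """max_s Gamma(1 + l_N - alpha_s) / Gamma(1 + l_N) with l_N = 1 / ln N."""
    l_N = 1.0 / math.log(N)
    return max(math.gamma(1.0 + l_N - a) for a in alphas) / math.gamma(1.0 + l_N)

def comparison_factor(alpha):
    """Gamma(1 - alpha): hypothetical non-robust constant, unbounded as alpha -> 1."""
    return math.gamma(1.0 - alpha)

N = 1000
for beta in (0.9, 0.99, 0.999):
    # The first factor stays below ln N; the second grows without bound.
    print(beta, robust_factor([beta], N), comparison_factor(beta), math.log(N))
```

In this setting the \(\beta \)-robust factor costs at most a logarithm of N, whatever the value of \(\beta \).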

Corollary 5.3

Assume the hypotheses of Theorem 5.2 are satisfied. Then

$$\begin{aligned} \Vert u^n-u^n_h\Vert \le \frac{C\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{\varGamma (1+l_N)}\left( N^{-\min \{2-\alpha _q,r\sigma \}}+h^{k+1}+h_\alpha ^2\right) . \end{aligned}$$
(5.13)

Proof

Applying the Poincaré inequality, one has \(\Vert R_hu^n-u^n_h\Vert \le C\Vert \nabla R_hu^n-\nabla u^n_h\Vert \). Combining this result with (5.9) and (2.3), we get (5.13). \(\square \)

6 Superconvergence Analysis

In this section, we present a superconvergence result for the distributed order time-fractional diffusion equation (1.1) in two dimensions, where the bilinear element \((k=1)\) is used in the finite element space.

To obtain the superconvergence result, we introduce two operators. Let \(\pi _h:H^2(\varOmega )\rightarrow V_{0h}\) be the interpolation operator satisfying \(\pi _hv(a_i)=v(a_i)\), where \(a_i\) \((i=1,2,3,4)\) are the four vertices of \(K_m\). By [37, (8)], we get the \(H^1\)-norm estimate

$$\begin{aligned} \Vert R_hw-\pi _hw\Vert _1\le Ch^2\Vert w\Vert _3, ~~\forall ~w\in H_0^1(\varOmega )\cap H^3(\varOmega ), \end{aligned}$$
(6.1)

which will play an important role in the superclose and superconvergence analysis.

Now we adopt the interpolation postprocessing operator \(\pi _{2h}\) from [28], which satisfies

$$\begin{aligned}&\pi _{2h}\pi _hw=\pi _{2h}w,~~\forall ~w\in H^2(\varOmega ), \end{aligned}$$
(6.2a)
$$\begin{aligned}&\Vert w-\pi _{2h}w\Vert _1\le Ch^2|w|_3, ~~\forall ~w\in H^3(\varOmega ),\end{aligned}$$
(6.2b)
$$\begin{aligned}&\Vert \pi _{2h}w_h\Vert _1\le C\Vert w_h\Vert _1, ~~\forall ~w_h\in V_{0h}. \end{aligned}$$
(6.2c)
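Properties (6.2a) and (6.2b) can be illustrated in a simplified setting. The sketch below is a one-dimensional analogue, not the two-dimensional operator of [28]. It assumes that the postprocessing is a quadratic interpolation on macro-elements formed by two neighbouring cells (in two dimensions this would typically be a biquadratic interpolation on four neighbouring cells), and it uses the nodal interpolant of \(w(x)=\sin (\pi x)\) in place of a finite element solution. Because the postprocessing only reads nodal values, applying it to \(\pi _hw\) or to w gives the same result, which mirrors (6.2a). The function names are illustrative.

```python
import numpy as np

def h1_errors(M, w=lambda x: np.sin(np.pi * x), dw=lambda x: np.pi * np.cos(np.pi * x)):
    """H^1-seminorm errors of the linear interpolant and of its quadratic postprocessing."""
    assert M % 2 == 0, "macro-elements need an even number of cells"
    x = np.linspace(0.0, 1.0, M + 1)
    h = 1.0 / M
    W = w(x)                                        # nodal values shared by both interpolants
    gp, gw = np.polynomial.legendre.leggauss(4)     # Gauss nodes and weights on [-1, 1]

    err_lin = 0.0
    for i in range(M):                              # piecewise linear: constant slope per cell
        xq = x[i] + 0.5 * h * (gp + 1.0)
        slope = (W[i + 1] - W[i]) / h
        err_lin += 0.5 * h * np.sum(gw * (dw(xq) - slope) ** 2)

    err_post = 0.0
    for m in range(0, M, 2):                        # quadratic on macro-element [x_m, x_{m+2}]
        x0, x1, x2 = x[m], x[m + 1], x[m + 2]
        xq = x0 + h * (gp + 1.0)
        dq = (W[m] * (2 * xq - x1 - x2) / ((x0 - x1) * (x0 - x2))
              + W[m + 1] * (2 * xq - x0 - x2) / ((x1 - x0) * (x1 - x2))
              + W[m + 2] * (2 * xq - x0 - x1) / ((x2 - x0) * (x2 - x1)))
        err_post += h * np.sum(gw * (dw(xq) - dq) ** 2)

    return np.sqrt(err_lin), np.sqrt(err_post)

def observed_rates(Ms=(16, 32, 64)):
    """Rows: successive refinements; columns: linear rate, postprocessed rate."""
    errs = np.array([h1_errors(M) for M in Ms])
    return np.log2(errs[:-1] / errs[1:])

print(observed_rates())
```

In this analogue the linear interpolant converges at first order in the \(H^1\)-seminorm while the postprocessed function converges at second order, the one-dimensional counterpart of (6.2b).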

Next we will state the global superconvergence result.

Corollary 6.1

Assume that the hypotheses of Theorem 5.2 are satisfied with \(p=3\) and that the bilinear element \((k=1)\) is used in the finite element space. If the domain \(\varOmega \) is rectangular with sides parallel to the coordinate axes, then

$$\begin{aligned} \Vert \nabla \pi _hu^n-\nabla u_h^n\Vert&\le \frac{C\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{\varGamma (1+l_N)}\left( N^{-\min \{2-\alpha _q,r\sigma \}}+h^{2}+h_\alpha ^2\right) , \end{aligned}$$
(6.3)
$$\begin{aligned} \Vert \nabla u^n-\nabla \pi _{2h}u_h^n\Vert&\le \frac{C\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{\varGamma (1+l_N)}\left( N^{-\min \{2-\alpha _q,r\sigma \}}+h^{2}+h_\alpha ^2\right) . \end{aligned}$$
(6.4)

Proof

By the triangle inequality, we arrive at \(\Vert \nabla \pi _hu^n-\nabla u_h^n\Vert \le \Vert \nabla \pi _hu^n-\nabla R_hu^n\Vert +\Vert \nabla R_hu^n-\nabla u_h^n\Vert \). Then the bound (6.3) follows immediately from (6.1) and (5.9).

Furthermore, combining the bound (6.3) with (6.2) yields

$$\begin{aligned} \Vert \nabla u^n-\nabla \pi _{2h}u_h^n\Vert&\le \Vert \nabla u^n-\nabla \pi _{2h}\pi _hu^n\Vert +\Vert \nabla \pi _{2h}\pi _hu^n-\nabla \pi _{2h}u_h^n\Vert \\&=\Vert \nabla u^n-\nabla \pi _{2h}u^n\Vert +\Vert \nabla \pi _{2h}(\pi _hu^n-u_h^n)\Vert \\&\le Ch^2+C\Vert \nabla \pi _hu^n-\nabla u_h^n\Vert \\&\le \frac{C\max _{1\le s\le q} \varGamma (1+l_N-\alpha _s)}{\varGamma (1+l_N)}\left( N^{-\min \{2-\alpha _q,r\sigma \}}+h^{2}+h_\alpha ^2\right) . \end{aligned}$$

Thus the proof is complete. \(\square \)

Remark 6.1

Under the conditions of Corollary 6.1, the spatial error in the \(H^1\)-norm between the exact and numerical solutions given in (5.8) is O(h) for the bilinear element (polynomial degree \(k=1\)). According to the superconvergence result (6.4), however, postprocessing the numerical solution improves this \(H^1\)-norm error to \(O(h^2)\).

7 Numerical Experiments

In this section we present numerical results for the initial-boundary value problem (1.1), with solutions that mimic the behaviour described in (1.3) with \(\sigma =\beta \).

Define the errors \(E_0^{M,N}\), \(E_1^{M,N}\), and \(E_2^{M,N}\) for the computed solutions by

$$\begin{aligned} E_0^{M,N}&:=\max _{0\le n\le N}\Vert \nabla u^n-\nabla u_h^n\Vert ,\\ E_1^{M,N}&:=\max _{0\le n\le N}\Vert \nabla \pi _hu^n-\nabla u_h^n\Vert ,~~ E_2^{M,N}:=\max _{0\le n\le N}\Vert \nabla u^n-\nabla \pi _{2h} u_h^n\Vert . \end{aligned}$$

Example 7.1

Consider the problem (1.1) with \(\kappa =0.1\), \(T=1\), \(\omega (\alpha )=\varGamma (\beta +1-\alpha )/\varGamma (1+\beta )\), \(\varOmega =(0,1)\times (0,1)\), and the function f is chosen such that the exact solution of this problem is \(u(x,y,t)= t^\beta \sin (\pi x)\sin (\pi y)\).
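For this choice of \(\omega \) the source term can be written in closed form. The derivation is a sketch based on (1.1a), (1.2), and the standard identity \(D_t^\alpha t^\beta =\frac{\varGamma (1+\beta )}{\varGamma (1+\beta -\alpha )}t^{\beta -\alpha }\); it is not quoted from the paper. It gives \(\omega (\alpha )D_t^\alpha t^\beta =t^{\beta -\alpha }\), hence \(\int _0^\beta t^{\beta -\alpha }\,d\alpha =(t^\beta -1)/\ln t\) (with limits 0 at \(t=0\) and \(\beta \) at \(t=1\)), and \(-\kappa \varDelta u=2\kappa \pi ^2u\). The function names below are hypothetical.

```python
import math

def distributed_time_factor(t, beta):
    """(t^beta - 1) / ln t, i.e. the integral over [0, beta] of t^(beta - a) da."""
    if t == 0.0:
        return 0.0                       # limit as t -> 0+
    if t == 1.0:
        return beta                      # limit as t -> 1
    log_t = math.log(t)
    return math.expm1(beta * log_t) / log_t

def source_f(x, y, t, beta, kappa=0.1):
    """f = D_t^omega u - kappa * Laplace(u) for u = t^beta sin(pi x) sin(pi y)."""
    space = math.sin(math.pi * x) * math.sin(math.pi * y)
    return (distributed_time_factor(t, beta) + 2.0 * kappa * math.pi ** 2 * t ** beta) * space

def midpoint_check(t, beta, q=2000):
    """Independent check: midpoint quadrature of t^(beta - a) over a in [0, beta]."""
    h = beta / q
    return h * sum(t ** (beta - (s + 0.5) * h) for s in range(q))

print(distributed_time_factor(0.5, 0.5), midpoint_check(0.5, 0.5))
print(source_f(0.5, 0.5, 1.0, beta=0.3))
```

The midpoint check is included only to guard against algebra slips in the closed form.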

To solve Example 7.1 numerically, we use a uniform rectangular partition of \(\varOmega \) with \(M+1\) nodes in each spatial direction and bilinear elements in space. Taking \(r=(2-\beta )/\beta \), one obtains the optimal rates of convergence in Theorem 5.2 and Corollary 6.1, viz., \(O\left( h+h_\alpha ^2+N^{-(2-\beta )}\right) \) for \(\Vert \nabla u^n- \nabla u^n_h\Vert \) and \(O\left( h^{2}+h_\alpha ^2+N^{-(2-\beta )}\right) \) for \(\Vert \nabla \pi _h u^n-\nabla u_h^n\Vert \) and \(\Vert \nabla u^n-\nabla \pi _{2h}u_h^n\Vert \).
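The choice of r can be made concrete with a small sketch. It assumes the graded mesh \(t_n=T(n/N)^r\) commonly used with L1 schemes (the mesh is defined earlier in the paper and is not restated here) and evaluates the temporal exponent \(\min \{2-\alpha _q,r\sigma \}\) of Theorem 5.2 with \(\sigma =\beta \). The function names are illustrative.

```python
import numpy as np

def graded_mesh(N, beta, T=1.0):
    """Mesh t_n = T (n/N)^r with r = (2 - beta) / beta (assumed form of the graded mesh)."""
    r = (2.0 - beta) / beta
    return T * (np.arange(N + 1) / N) ** r, r

def temporal_exponent(alpha_q, r, sigma):
    """Exponent min{2 - alpha_q, r * sigma} appearing in (5.8) and (5.9)."""
    return min(2.0 - alpha_q, r * sigma)

for beta in (0.3, 0.5, 0.7):
    t, r = graded_mesh(8, beta)
    # With alpha_q <= beta and sigma = beta the minimum is 2 - beta.
    print(beta, r, temporal_exponent(beta, r, beta), t[1], t[-1] - t[-2])
```

Since \(\alpha _q\le \beta \) gives \(2-\alpha _q\ge 2-\beta =r\sigma \), a larger r would not improve the exponent further; it would only make the mesh coarser near \(t=T\).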

First, we verify the temporal accuracy of our fully discrete L1 FEM (3.9). Table 1 shows the \(E_0^{M,N}\) errors for \(\beta =0.3, 0.5, 0.7\). Here \(M=\lceil N^{2-\beta }\rceil \) and \(q=100\) are taken so that the temporal error dominates. The displayed orders of convergence indicate that the rate of convergence is \(N^{-(2-\beta )}\), as predicted by (5.8) of Theorem 5.2. Tables 2 and 3 display the \(E_1^{M,N}\) and \(E_2^{M,N}\) errors and their associated orders of convergence for \(\beta =0.3, 0.5, 0.7\), with \(M=\lceil N^{1-\beta /2}\rceil \) and \(q=100\), so that the temporal error dominates both the distributed-variable error and the spatial error. Again the displayed orders indicate the rate \(N^{-(2-\beta )}\), as predicted by Corollary 6.1.
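The bookkeeping behind such tables can be sketched as follows. The code picks M so that the spatial error matches the temporal one (O(h) for \(E_0^{M,N}\), \(O(h^2)\) for \(E_1^{M,N}\) and \(E_2^{M,N}\)) and computes observed orders from consecutive errors via \(\log (E^{N_i}/E^{N_{i+1}})/\log (N_{i+1}/N_i)\). The error values used here are synthetic placeholders of the form \(CN^{-(2-\beta )}\), not results from the paper, and the function names are hypothetical.

```python
import math

def spatial_nodes(N, beta, error="E0"):
    """M = ceil(N^(2-beta)) for E0 (O(h) in space), ceil(N^(1-beta/2)) for E1, E2 (O(h^2))."""
    exponent = 2.0 - beta if error == "E0" else 1.0 - beta / 2.0
    return math.ceil(N ** exponent)

def observed_orders(Ns, errors):
    """Observed convergence orders between consecutive values of N."""
    return [math.log(errors[i] / errors[i + 1]) / math.log(Ns[i + 1] / Ns[i])
            for i in range(len(Ns) - 1)]

beta = 0.5
Ns = [10, 20, 40, 80]
synthetic_errors = [3.0 * N ** (-(2.0 - beta)) for N in Ns]   # placeholder data only
print([spatial_nodes(N, beta) for N in Ns])
print(observed_orders(Ns, synthetic_errors))
```

With measured errors in place of the synthetic list, orders close to \(2-\beta \) would confirm the temporal rate.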

Next we test the accuracy in the spatial direction. Table 4 shows the \(E_0^{M,N}\), \(E_1^{M,N}\), and \(E_2^{M,N}\) errors and their associated orders of convergence for \(\beta =0.3, 0.5, 0.7\). Here \(N=200\) and \(q=100\) are taken so that the spatial error dominates the results. We observe O(h) convergence for \(E_0^{M,N}\) and \(O(h^2)\) convergence for \(E_1^{M,N}\) and \(E_2^{M,N}\), again as predicted by Theorem 5.2 and Corollary 6.1.

Finally, we check the convergence order with respect to the distributed variable. Table 5 shows the \(E_1^{M,N}\) error and the associated order of convergence for \(\beta =0.3,0.5,0.7\), where \(N=1000\) and \(M=200\) are taken so that the temporal and spatial errors are negligible.
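The expected \(O(h_\alpha ^2)\) behaviour can be previewed on the integrand of Example 7.1 alone. The sketch below assumes a composite midpoint rule with q subintervals on \([0,\beta ]\), so \(h_\alpha =\beta /q\); the paper's quadrature is defined earlier and may differ. It uses the fact that for this example \(\omega (\alpha )D_t^\alpha t^\beta =t^{\beta -\alpha }\), whose exact integral over \([0,\beta ]\) is \((t^\beta -1)/\ln t\) for \(0<t<1\). The function names are illustrative.

```python
import math

def midpoint_distributed(t, beta, q):
    """Midpoint rule for the integral of t^(beta - a) over a in [0, beta]."""
    h_alpha = beta / q
    return h_alpha * sum(t ** (beta - (s - 0.5) * h_alpha) for s in range(1, q + 1))

def exact_distributed(t, beta):
    """Closed form (t^beta - 1) / ln t, valid for 0 < t < 1."""
    log_t = math.log(t)
    return math.expm1(beta * log_t) / log_t

def quadrature_orders(t=0.5, beta=0.7, qs=(4, 8, 16, 32)):
    """Observed orders in h_alpha when q is doubled."""
    errs = [abs(midpoint_distributed(t, beta, q) - exact_distributed(t, beta)) for q in qs]
    return [math.log2(errs[i] / errs[i + 1]) for i in range(len(errs) - 1)]

print(quadrature_orders())
```

The observed orders are close to 2, consistent with the \(h_\alpha ^2\) term in (5.8) and (5.9).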

These numerical results demonstrate the sharpness of our theoretical convergence bounds in Theorem 5.2 and Corollary 6.1.

Table 1 Example 7.1: \(E_0^{M,N}\) errors and rates of convergence in temporal direction
Table 2 Example 7.1: \(E_1^{M,N}\) errors and rates of convergence in temporal direction
Table 3 Example 7.1: \(E_2^{M,N}\) errors and rates of convergence in temporal direction
Table 4 Example 7.1: \(E_0^{M,N}\), \(E_1^{M,N}\), and \(E_2^{M,N}\) errors and convergence rates in spatial direction
Table 5 Example 7.1: \(E_1^{M,N}\) errors and convergence rates for the distributed variable

Example 7.2

Consider the problem (1.1) with \(\kappa =0.1\), \(T=1\), \(\omega (\alpha )=\varGamma (\beta +1-\alpha )/\varGamma (1+\beta )\), \(\varOmega =(0,1)\times (0,1)\), and \(\phi (x,y)=x^{2.5}(x-1)y^{2.5}(y-1)\). The function f is chosen such that the exact solution of this problem is \(u(x,y,t)= (1+t^\beta )x^{2.5}(x-1)y^{2.5}(y-1)\), which is nonsmooth in the spatial direction.

In this example, we test only the convergence of \(E_1^{M,N}\) and \(E_2^{M,N}\); the choices of M, N, and q are the same as in Example 7.1. Tables 6 and 7 show that \(E_1^{M,N}\) and \(E_2^{M,N}\) converge at the rate \(O(N^{-(2-\beta )})\) in the temporal direction. Table 8 shows \(O(h^2)\) convergence of \(E_1^{M,N}\) and \(E_2^{M,N}\) in the spatial direction.

Table 6 Example 7.2: \(E_1^{M,N}\) errors and rates of convergence in temporal direction
Table 7 Example 7.2: \(E_2^{M,N}\) errors and rates of convergence in temporal direction
Table 8 Example 7.2: \(\beta =0.3\); \(E_1^{M,N}\), \(E_2^{M,N}\) errors, and convergence rates in spatial direction