1 Introduction

In this article we consider an epitaxial thin film growth model. The equation is the gradient flow associated with the following energy functional

$$\begin{aligned} E(u) \equiv \int _{\varOmega }\left( - \frac{1}{2} \ln (1+|\nabla u|^2) +\frac{\varepsilon ^2}{2} |\varDelta u|^2\right) \text{ d } \mathbf x , \end{aligned}$$
(1)

where \(\varOmega \subset {\mathbb {R}}^2\) is a bounded domain, \(\varepsilon \) is a positive constant, and \(u :\varOmega \rightarrow {\mathbb {R}}\) is the height function. We denote by \(\partial _n u\) the exterior normal derivative of u on \(\partial \varOmega \). To eliminate the boundary integral terms in the variational derivative of the energy, we impose the following natural boundary conditions:

$$\begin{aligned} \partial _n u = \partial _n\varDelta u = 0, \quad \text{ on }~\partial \varOmega \times (0,T). \end{aligned}$$
(2)

Then we have the following variational derivative of the energy

$$\begin{aligned} \delta _{u}E = \nabla \cdot \left( \frac{\nabla u}{1+|\nabla u|^2}\right) + \varepsilon ^2 \varDelta ^2 u. \end{aligned}$$
(3)
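For completeness, the computation behind (3) can be sketched as follows. The sketch assumes that u is smooth enough for the integrations by parts and that the perturbation v satisfies \(\partial _n v = 0\) on \(\partial \varOmega \); the parameter \(\eta \) and the surface measure \(\text{ d }S\) are notation introduced only here.

$$\begin{aligned} \frac{\text{ d }}{\text{ d }\eta } E(u+\eta v)\Big |_{\eta =0}&= \int _{\varOmega }\left( -\frac{\nabla u\cdot \nabla v}{1+|\nabla u|^2} +\varepsilon ^2 \varDelta u\, \varDelta v\right) \text{ d } \mathbf x \\&= \int _{\varOmega }\left( \nabla \cdot \left( \frac{\nabla u}{1+|\nabla u|^2}\right) + \varepsilon ^2 \varDelta ^2 u\right) v \,\text{ d } \mathbf x - \int _{\partial \varOmega } \frac{\partial _n u}{1+|\nabla u|^2}\, v \,\text{ d }S \\&\quad + \varepsilon ^2\int _{\partial \varOmega }\left( \varDelta u\, \partial _n v - \partial _n \varDelta u\, v\right) \text{ d }S. \end{aligned}$$

The boundary integrals vanish because \(\partial _n u = \partial _n \varDelta u = 0\) by (2) and \(\partial _n v = 0\) by assumption, which leaves the volume integral and hence (3).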

Herein we consider the \(L^2\) gradient flow

$$\begin{aligned} \partial _tu = -\delta _{u}E = -\nabla \cdot \left( \frac{\nabla u}{1+|\nabla u|^2}\right) - \varepsilon ^2 \varDelta ^2 u, \end{aligned}$$
(4)

with the following initial conditions

$$\begin{aligned} \begin{array}{l} u(\mathbf x ,0)=u_0(\mathbf x ),\quad \text{ in }~\varOmega , \\ (u_0,1) =0, \end{array} \end{aligned}$$
(5)

where \((\cdot ,\cdot )\) denotes the \(L^2\) inner product. Following most other references, we refer to (4) as the no-slope-selection equation. Introducing \(w=-\varDelta u\), we can rewrite Eq. (4) as

$$\begin{aligned} \partial _t u+\nabla \cdot \left( \frac{\nabla u}{1+{|\nabla u|}^2}\right) -\varepsilon ^2\varDelta w&= 0,\quad \text {in}~\varOmega \times (0,T], \end{aligned}$$
(6)
$$\begin{aligned} w+\varDelta u&= 0, \quad \text {in}~\varOmega \times (0,T]. \end{aligned}$$
(7)

From (2) and (5), we have the following boundary and initial conditions

$$\begin{aligned}&\partial _nu = \partial _n w = 0, \quad \text {on}~\partial \varOmega \times (0,T], \end{aligned}$$
(8)
$$\begin{aligned}&u(\mathbf x ,0) = u_0(\mathbf x ), \quad \text {in}~\varOmega . \end{aligned}$$
(9)
$$\begin{aligned}&(u,1) =(w,1)=0. \end{aligned}$$
(10)

Meanwhile, we see that the energy functional can be written as

$$\begin{aligned} E(u,w) \equiv \int _{\varOmega }\left( - \frac{1}{2} \ln (1+|\nabla u|^2) +\frac{\varepsilon ^2}{2} |w|^2\right) \text{ d }{} \mathbf x . \end{aligned}$$
(11)

Also, note that the first term in the energy functional represents the Ehrlich–Schwoebel (ES) effect [14]; we therefore denote it by \(E^{ES}(u)\):

$$\begin{aligned} E^{ES}(u)\equiv \int _{\varOmega } -\frac{1}{2} \ln \left( 1+|\nabla u|^2\right) ~\text{ d }{} \mathbf x . \end{aligned}$$

In [15], the global-in-time well-posedness of two nonlinear models of epitaxial thin film growth, with or without slope selection, was established. The gradient bound and the energy asymptotic law for the epitaxial growth equation with or without slope selection, as \(\varepsilon \rightarrow 0\), have been studied in [13, 16]. In addition, the large-system asymptotic form of the minimum energy and the magnitude of gradients of energy-minimizing surfaces for epitaxial growth models, with infinite or finite Ehrlich–Schwoebel (ES) barrier, are analysed in [14]. In particular, for the case of a finite ES effect (corresponding to the model in this article), the well-posedness of the initial-boundary-value problem is proved and bounds for the scaling laws of interface width, surface slope and energy are obtained.

In terms of numerical simulation of the epitaxial thin film growth model, there have been many efforts to devise and analyse schemes for both the slope-selection and no-slope-selection equations; see the related references [6, 20, 23, 27], etc. In particular, numerical schemes with high order accuracy and energy stability have attracted a great deal of attention, due to the long-time nature of the gradient flow coarsening process. In the paper of Li and Liu [15], a classical second order accurate semi-implicit numerical scheme, combined with a Galerkin spectral approximation in space, is used for solving the equations, while a theoretical justification of the numerical energy stability is not available. Among the energy stable numerical approaches, the idea of convex splitting is worthy of discussion. For the epitaxial thin film growth models, the first such work was reported in [26], in which the authors studied unconditionally energy stable schemes, based on the convex–concave decomposition of the energy, motivated by Eyre’s pioneering work [7]. Also see the other related works on energy stable schemes for molecular beam epitaxy (MBE) models [11, 17, 19, 21, 22, 31], etc. Meanwhile, there are two obvious shortcomings of the schemes reported in [26]: only first order accuracy (in time), and the high degree of nonlinearity of the numerical scheme, due to the implicit treatment of the nonlinear term. In particular, a direct convex splitting solver for the no-slope-selection equation (4) is even more challenging, since the nonlinear term appears in the denominator. Subsequently, an efficient linear, unconditionally stable, unconditionally solvable scheme was proposed in [3] for (4) to overcome this difficulty, based on an alternate, and more advantageous, convex–concave energy decomposition; this alternate decomposition places the nonlinear part of the chemical potential in the concave part instead of the convex part. In turn, the implicit part of the scheme is completely linear, which greatly improves the numerical efficiency in comparison with the one in [26]. In fact, such a linear stabilization approach was reported in an earlier work [27], and a theoretical justification of this linear stability has been given in a more recent work [17]. Moreover, the linear operator involved in the scheme, which is positive elliptic with constant coefficients, can be efficiently inverted by FFT.

Of course, the linear scheme reported in [3] is only first-order accurate in time. Many efforts have been devoted in the past few years to developing second order accurate, energy stable schemes for epitaxial thin film growth models, with or without slope selection. For example, a second-order convex splitting scheme, in the modified Crank–Nicolson version, is proposed and discussed in [25]; a careful analysis indicates its unconditional energy stability and unique nonlinear solvability. An alternate second order accurate scheme, in the backward differentiation formula (BDF) version, has been proposed and analysed in a more recent work [8]. On the other hand, these second order schemes are highly nonlinear, and their numerical implementation becomes highly challenging. For the slope-selection model, the nonlinear term retains a polynomial form, so that either a nonlinear conjugate gradient or a preconditioned steepest descent (PSD) solver can be efficiently applied. However, for the no-slope-selection model (4), the numerical difficulty associated with the high degree of nonlinearity is much more prominent, because the nonlinear terms appear as rational functions of the gradient. To overcome this difficulty, a linear iteration solver is proposed in [5] to implement the highly nonlinear second order numerical scheme in [25] associated with the no-slope-selection equation (4). In more detail, a second order accurate \(O (\tau ^2)\) artificial diffusion term, in the form of a Douglas–Dupont regularization, is introduced into the numerical scheme. As a result, the highly nonlinear numerical scheme can be solved very efficiently by a linear iteration algorithm, and a geometric convergence rate is assured for this linear iteration under a constraint on the artificial diffusion coefficient.

Meanwhile, although only a linear equation needs to be solved at each iteration stage of the linear algorithm proposed in [5], the overall computational cost at each time step is still several times that of a single linear solve, as implied by the geometric convergence. A question naturally arises: could one derive a second order accurate, energy stable numerical scheme for the no-slope-selection model (4), with only one linear equation (with constant coefficients) involved at each time step? In this article, we propose and analyse such a numerical scheme. In more detail, a second order backward differentiation formula (BDF) is applied to approximate the temporal derivative, while the surface diffusion term is treated implicitly. The nonlinear chemical potential, in turn, is approximated by an explicit extrapolation formula at time step \(t^{n+1}\), with second order temporal accuracy. Moreover, a second order accurate \(O (\tau ^2)\) artificial term, \(A \tau \varDelta ^2 (u^{n+1} - u^n)\), is added to the numerical scheme for the sake of the stability analysis. As a result, the numerical scheme is linear, with constant coefficients, at each time step, due to the explicit extrapolation used in the nonlinear term. Furthermore, a careful energy estimate indicates that the energy stability can be justified for the proposed numerical scheme at a theoretical level, provided that the constant A satisfies \(A \ge \frac{25}{16}\). Therefore, all the desired properties are established for the proposed numerical scheme.

A mixed finite element approximation is taken in space, based on a mixed weak formulation of the no-slope-selection model (6)–(7). In this approach, the numerical solutions for both the phase variable u and the chemical potential variable w belong to the same finite element space \(X_h\), which is a piecewise polynomial subspace of \(H^1\). In combination with the second order temporal approximation, the resulting numerical scheme preserves the properties of unique solvability and unconditional energy stability. With the help of this uniform-in-time energy bound, we are able to establish the convergence analysis, with an error estimate of \(O (\tau ^2 + h^q)\) accuracy in the \(\ell ^\infty (0,T; L^2) \cap \ell ^2 (0,T; H_h^2)\) norm. Since the nonlinear term and all its derivatives have a direct \(\ell ^\infty \) bound, this convergence is unconditional; no scaling law is needed between \(\tau \) and h to ensure its validity.

Moreover, the \(h^q\) convergence order in space is not optimal; this loss of accuracy comes from the gradient structure of the no-slope-selection equation (4). Meanwhile, many articles have obtained full order spatial convergence for fourth order elliptic equations on regular rectangular meshes, such as [18, 28]. Building on the preliminary estimates above, we are able to apply similar techniques and obtain an optimal spatial convergence order for the proposed numerical scheme on regular rectangular meshes.

The rest of the article is organized as follows. In Sect. 2, we present the fully discrete scheme. The unique solvability and unconditional long time energy stability are proved in Sects. 2.1 and 2.2, respectively, and the \(O (\tau ^2 + h^q)\) convergence analysis is presented in Sect. 2.3. Subsequently, the optimal convergence analysis is provided in Sect. 3. Besides, some numerical results are presented in Sect. 4. Finally, the concluding remarks are given in Sect. 5.

2 The Numerical Scheme

Referring to [1], we denote the standard norms for the Sobolev spaces \(W^{m,p}(\varOmega )\) by \(\Vert \cdot \Vert _{m,p}\). Let \(H^m(\varOmega )\) denote \(W^{m,2}(\varOmega )\). We replace \(\Vert \cdot \Vert _{q,2}\) by \(\Vert \cdot \Vert _q\), and \(\Vert \cdot \Vert _{L^2}\) by \(\Vert \cdot \Vert \). Also, \(L_0^2(\varOmega ) \equiv \left\{ \varphi \in L^2(\varOmega ) ~|~ (\varphi ,1)=0 \right\} \). Now we introduce the Sobolev space \(X=\{v\in H^q(\varOmega ) ~|~ (v,1)=0\}\).

We denote by \(L^2(0,T;X)\) the set of all square-integrable functions from [0, T] to X. Similarly, we use the notation \(L^2(0,T;H^{-q}({\varOmega }))\), where \(H^{-q}\) is the dual space of \(H^q\). Then the weak form of (6)–(7) is to find u and \(w\in L^2(0,T;X)\), with \(u_t \in L^2(0,T;H^{-q}({\varOmega }))\), satisfying

$$\begin{aligned}&(\partial _t u,v)+\varepsilon ^2(\nabla w,\nabla v)-\left( \frac{\nabla u}{1+|\nabla u|^2},\nabla v\right) =0,\quad \forall ~ v\in X. \end{aligned}$$
(12)
$$\begin{aligned}&(w,\psi )-(\nabla u,\nabla \psi )=0,\quad \forall ~ \psi \in X. \end{aligned}$$
(13)

Taking \(v=\partial _t u\) in (12), differentiating (13) with respect to t and taking \(\psi =\varepsilon ^2 w\), and adding the two resulting equations, we have

$$\begin{aligned} \Vert \partial _t u\Vert ^2+\frac{\varepsilon ^2}{2}\frac{\text{ d }}{\text{ d }t}\Vert w\Vert ^2-\frac{1}{2}\frac{\text{ d }}{\text{ d }t}\left( \ln \left( 1+|\nabla u|^2\right) ,1\right) =0. \end{aligned}$$
(14)

Integrating (14) from \(t_0\) to \(t_1\) for any \(0\le t_0< t_1\), we have

$$\begin{aligned} \int ^{t_1}_{t_0}\Vert \partial _tu(s)\Vert ^2~\text{ d }s+E(u(t_1),w(t_1))=E(u(t_0),w(t_0)) , \end{aligned}$$
(15)

which shows that the system (6)–(7) is energy stable.
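The two identities behind (14) can be spelled out as follows. This is a sketch that assumes enough regularity in time for (13) to be differentiated with respect to t.

$$\begin{aligned} \left( \frac{\nabla u}{1+|\nabla u|^2},\nabla \partial _t u\right)&= \frac{1}{2}\frac{\text{ d }}{\text{ d }t}\left( \ln \left( 1+|\nabla u|^2\right) ,1\right) , \\ \varepsilon ^2\left( \nabla w,\nabla \partial _t u\right)&= \varepsilon ^2\left( \partial _t w, w\right) = \frac{\varepsilon ^2}{2}\frac{\text{ d }}{\text{ d }t}\Vert w\Vert ^2. \end{aligned}$$

The first line is the chain rule applied to \(\ln (1+|\nabla u|^2)\). The second line uses the time derivative of (13), \((\partial _t w,\psi )-(\nabla \partial _t u,\nabla \psi )=0\), with \(\psi =w\). Substituting both into (12) with \(v=\partial _t u\) gives (14).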

Let \(\tau =\frac{T}{N}\) be the time step size. For the mesh \({\mathcal {T}}_h=\{K\}\) on \(\varOmega \) with the associated finite-dimensional function space \({\mathcal {P}}\), we let either \({\mathcal {T}}_h\) be a quasi-uniform triangulation with \({\mathcal {P}}={\mathcal {P}}_q(K)\) or \({\mathcal {T}}_h\) be a regular rectangular mesh with \({\mathcal {P}}= Q_q\), where h stands for the discretization parameter, \(Q_q \equiv \text {span}\{x^iy^j: 0\le i,j\le q\}\) and \({\mathcal {P}}_q(K)\) is the set of polynomials of degree \(\le q\). Define the piecewise polynomial space \(X_h\equiv \{v\in X\cap C^0(\varOmega )\mid v|_K\in {\mathcal {P}}(K),\forall ~K\in {\mathcal {T}}_h\}\subset X \).

In order to set the initialization step of our scheme, we introduce the Ritz projection \(R_h: L_0^2(\varOmega )\rightarrow X_h\),

$$\begin{aligned}&(\nabla (R_h\varphi -\varphi ),\nabla \chi )=0,\quad (R_h\varphi -\varphi ,1)=0, \quad \forall ~\chi \in X_h. \end{aligned}$$
(16)

Now we propose the fully discrete numerical scheme. Let \(u_h^0 = R_h u_0\), \(w_h^0 = R_h(-\varDelta u_0)\). Denote the numerical solution of u and w at time \(t_n\) by \(u_h^n\) and \(w_h^n\) respectively. The initialization step is defined as below: given \(u_h^0\), \(w_h^0\), find \(u_h^1, w_h^1\in X_h\), such that for any \(v_h\) and \(\psi _h \in X_h\),

$$\begin{aligned}&\left( \frac{u_h^1-u_h^0}{\tau }, v_h\right) +\varepsilon ^2\left( \nabla w_h^1,\nabla v_h\right) -\left( \frac{\nabla u_h^0}{1+|\nabla u_h^0|^2},\nabla v_h\right) {+} A^{(0)} \left( \nabla \left( u_h^1 {-} u_h^0\right) , \nabla v_h \right) {=}0, \end{aligned}$$
(17)
$$\begin{aligned}&\left( w_h^1,\psi _h\right) - \left( \nabla u_h^1,\nabla \psi _h\right) =0. \end{aligned}$$
(18)

The unique solvability follows from the positive-definiteness of the linear operators involved. In addition, the unconditional energy stability, \(E(u_h^1, w_h^1) \le E (u_h^0, w_h^0)\), follows from an analysis similar to that in [3], provided that \(A^{(0)} \ge 1\). Moreover, the first order temporal accuracy of the first time step does not affect the overall second order accuracy, as will be shown in later sections.

For \(n\ge 1\), given \(u_h^{n-1}\), \(u_h^n\) and \(w_h^n\in X_h\), find \(u_h^{n+1}\), \(w_h^{n+1} \in X_h\), such that for any \(v_h\) and \(\psi _h \in X_h\),

$$\begin{aligned}&\left( \frac{3u_h^{n+1}-4u_h^n+u_h^{n-1}}{2\tau },v_h\right) +\varepsilon ^2\left( \nabla w_h^{n+1},\nabla v_h\right) \nonumber \\&\quad -\left( \frac{\nabla \left( 2u_h^n-u_h^{n-1}\right) }{1+\left| \nabla \left( 2u_h^n-u_h^{n-1}\right) \right| ^2}, \nabla v_h\right) +A\tau \left( \nabla \left( w_h^{n+1}-w_h^n\right) ,\nabla v_h\right) =0, \end{aligned}$$
(19)
$$\begin{aligned}&\quad \left( w_h^{n+1},\psi _h\right) -\left( \nabla u_h^{n+1},\nabla \psi _h\right) =0. \end{aligned}$$
(20)

Note that we have used an explicit extrapolation formula for the nonlinear term, and that an artificial term \(A \tau (\nabla (w_h^{n+1}-w_h^n), \nabla v_h)\) is added to the numerical scheme. In turn, the unconditional unique solvability is assured by the fact that all the implicit terms are associated with linear elliptic operators with positive eigenvalues.

The idea of the modified BDF method has been similarly applied to the Cahn–Hilliard equation [29] and the slope-selection (SS) epitaxial thin film model [8]. For the no-slope-selection (NSS) model, however, the BDF approach here differs markedly from these existing works: the nonlinear term is treated fully explicitly, whereas it is treated fully implicitly in [8, 29]. This explicit treatment does not cause any stability difficulty, due to a subtle feature of the nonlinearity in the NSS model, as demonstrated in the analysis below.
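To make the structure of one time step concrete, the following is a minimal sketch of (19)–(20) in strong form. It relies on assumptions that are not part of this article: a one-dimensional periodic domain of length L and a Fourier pseudo-spectral discretization, used in place of the mixed finite element space \(X_h\) with Neumann boundary conditions. The function name `nss_bdf2_step` and all parameter names are hypothetical. After eliminating \(w=-\varDelta u\), each step reduces to one constant-coefficient linear solve, which in Fourier space is a pointwise division.

```python
import numpy as np


def nss_bdf2_step(u_n, u_nm1, tau, eps, A, L=2.0 * np.pi):
    """One step of the BDF2 scheme with explicit extrapolation (1D, periodic).

    Illustrative only: Fourier collocation replaces the mixed finite element
    discretization analysed in the article.
    """
    N = u_n.size
    k = 2.0 * np.pi * np.fft.fftfreq(N, d=L / N)  # angular wavenumbers
    k4 = k ** 4                                   # Fourier symbol of Delta^2

    # Explicit second order extrapolation 2u^n - u^{n-1} for the nonlinear term.
    u_ext = 2.0 * u_n - u_nm1
    ux = np.real(np.fft.ifft(1j * k * np.fft.fft(u_ext)))
    flux = ux / (1.0 + ux ** 2)
    div_flux_hat = 1j * k * np.fft.fft(flux)

    u_n_hat = np.fft.fft(u_n)
    u_nm1_hat = np.fft.fft(u_nm1)

    # (3/(2 tau) + (eps^2 + A tau) k^4) u^{n+1}
    #     = (4 u^n - u^{n-1}) / (2 tau) - div(flux) + A tau k^4 u^n
    rhs = ((4.0 * u_n_hat - u_nm1_hat) / (2.0 * tau)
           - div_flux_hat
           + A * tau * k4 * u_n_hat)
    lhs = 1.5 / tau + (eps ** 2 + A * tau) * k4
    return np.real(np.fft.ifft(rhs / lhs))
```

In this sketch the zero Fourier mode of the update involves only the means of \(u^n\) and \(u^{n-1}\), so a zero-mean initial state stays zero-mean up to rounding. Whether the energy estimate proved below carries over to such a collocation setting is not addressed by this example.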

2.1 Unique Solvability

We first establish the unique solvability of the scheme.

Theorem 1

The scheme (19)–(20) is unconditionally uniquely solvable.

Proof

We rewrite scheme (19) and (20) as below:

$$\begin{aligned} \left\{ \begin{array}{l} \frac{3}{2\tau }\left( u_h^{n+1},v_h\right) +\varepsilon ^2 \left( \nabla w_h^{n+1},\nabla v_h\right) +A\tau \left( \nabla w_h^{n+1},\nabla v_h\right) =f^{n, n-1}(v_h),\\ \left( w_h^{n+1},\psi _h\right) -\left( \nabla u_h^{n+1},\nabla \psi _h\right) =0,\\ \end{array} \right. \end{aligned}$$
(21)

with

$$\begin{aligned} f^{n, n-1}(v_h)&\equiv \frac{1}{2\tau }\left( 4u_h^n-u_h^{n-1},v_h\right) + \left( \frac{\nabla \left( 2u_h^n-u_h^{n-1}\right) }{1+\left| \nabla \left( 2u_h^n-u_h^{n-1}\right) \right| ^2},\nabla v_h\right) \\&\quad +A\tau \left( \nabla w_h^n,\nabla v_h\right) . \end{aligned}$$

It is clear that \(f^{n, n-1}(v_h)\) is a continuous linear functional of \(v_h\). Let \(\mathbf u =[u_h^{n+1},w_h^{n+1}]\) and \(\mathbf q =[\psi _h,v_h]\), and we define the bilinear form

$$\begin{aligned} a(\mathbf u ,\mathbf q )=&\frac{3}{2\tau }\left( u_h^{n+1},v_h\right) +\varepsilon ^2 \left( \nabla w_h^{n+1},\nabla v_h\right) +A\tau \left( \nabla w_h^{n+1},\nabla v_h\right) \\&+\left( w_h^{n+1},\psi _h\right) -\left( \nabla u_h^{n+1},\nabla \psi _h\right) . \end{aligned}$$

Thus, Eq. (21) is equivalent to finding \(\mathbf u \in X_h\times X_h\) such that

$$\begin{aligned} a(\mathbf u ,\mathbf q ) = f(\mathbf q ),~~\forall ~\mathbf q \in X_h\times X_h, \end{aligned}$$

where \( f(\mathbf q )=f^{n, n-1}(v_h)\) is a continuous linear functional on \(X_h\times X_h\).

It is easy to verify that \(|a(\mathbf u ,\mathbf q )|\) is bounded above by \(C_{\varepsilon ,\tau }\Vert \mathbf u \Vert _1\Vert \mathbf q \Vert _1\), where \(C_{\varepsilon , \tau }\) is a positive constant that depends only on \(\varepsilon , \tau \). Thus we only need to verify coercivity. Since \((w_h^{n+1},u_h^{n+1})=\Vert \nabla u_h^{n+1}\Vert ^2\) (take \(\psi _h=u_h^{n+1}\) in the second equation of (21)), we have, for any \(\tau >0\),

$$\begin{aligned} a(\mathbf u ,\mathbf u ) = ~&\frac{3}{2\tau }\left( u_h^{n+1}, w_h^{n+1}\right) +\left( \varepsilon ^2+A\tau \right) \left\| \nabla w_h^{n+1}\right\| ^2+\left( w_h^{n+1},u_h^{n+1}\right) -\left\| \nabla u_h^{n+1}\right\| ^2\\ = ~&\frac{3}{2\tau }\left\| \nabla u_h^{n+1}\right\| ^2+ \left( \varepsilon ^2+A\tau \right) \left\| \nabla w_h^{n+1}\right\| ^2 \\ \ge ~&c_{\varepsilon ,\tau }\left\| \mathbf u \right\| _1^2, \end{aligned}$$

where we have used Poincaré’s inequality in the last step, and \(c_{\varepsilon , \tau }\) is a positive constant depending only on \(\varepsilon , \tau \). By the Lax–Milgram theorem in [2], (21) admits a unique solution \(\mathbf u =[u_h^{n+1},w_h^{n+1}]\in X_h\times X_h\). \(\square \)

2.2 Long Time Energy Stability

Here we define the discrete energy functional as

$$\begin{aligned} \begin{aligned} \tilde{E}\left( u_h^{n+1},u_h^n,w_h^{n+1}\right)&\equiv \int _{\varOmega } -\frac{1}{2} \ln \left( 1+\left| \nabla u_h^{n+1}\right| ^2\right) \,\text{ d }{} \mathbf x +\frac{\varepsilon ^2}{2}\left\| w_h^{n+1}\right\| ^2\\&\quad +\frac{1}{4\tau }\left\| u_h^{n+1}-u_h^n\right\| ^2+\left\| \nabla \left( u_h^{n+1}-u_h^n\right) \right\| ^2, \quad n\ge 0. \end{aligned} \end{aligned}$$
(22)
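As an illustration of how (22) might be evaluated in practice, the sketch below computes the modified energy on a one-dimensional periodic grid with spectral derivatives and trapezoidal quadrature. The periodic Fourier setting, the function name `modified_energy` and its parameters are assumptions made for this example; the article itself works with the finite element space \(X_h\).

```python
import numpy as np


def modified_energy(u_new, u_old, tau, eps, L=2.0 * np.pi):
    """Discrete analogue of the modified energy (22) on a 1D periodic grid.

    Illustrative only: spectral derivatives and trapezoidal quadrature
    stand in for the finite element integrals.
    """
    N = u_new.size
    h = L / N
    k = 2.0 * np.pi * np.fft.fftfreq(N, d=h)  # angular wavenumbers

    def ddx(f):
        return np.real(np.fft.ifft(1j * k * np.fft.fft(f)))

    ux = ddx(u_new)
    w = np.real(np.fft.ifft(k ** 2 * np.fft.fft(u_new)))  # w = -u_xx
    du = u_new - u_old

    e_es = -0.5 * h * np.sum(np.log1p(ux ** 2))   # Ehrlich-Schwoebel part
    e_bend = 0.5 * eps ** 2 * h * np.sum(w ** 2)  # (eps^2 / 2) ||w||^2
    # History terms: ||du||^2 / (4 tau) + ||grad du||^2
    e_hist = h * np.sum(du ** 2) / (4.0 * tau) + h * np.sum(ddx(du) ** 2)
    return e_es + e_bend + e_hist
```

When `u_old` equals `u_new`, the two history terms vanish and the value reduces to a discrete version of the original energy (11).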

Now we have the following energy stability estimate.

Theorem 2

Given \(A\ge \frac{25}{16}\), the second-order numerical scheme (19)–(20) has the energy-decay property

$$\begin{aligned} \tilde{E}\left( u_h^{n+1},u_h^n,w_h^{n+1}\right) \le \tilde{E}\left( u_h^n,u_h^{n-1},w_h^n\right) ,\quad n\ge 1. \end{aligned}$$
(23)

Proof

Firstly we deduce from (18) and (20) that: for \(n\ge 0\),

$$\begin{aligned} \left( w_h^{n+1}-w_h^n,\psi _h\right) -\left( \nabla \left( u_h^{n+1}-u_h^n\right) ,\nabla \psi _h\right) =0,~~\forall ~\psi _h\in X_h. \end{aligned}$$
(24)

Taking \(v_h = u_h^{n+1}-u_h^n\) in (19), \(\psi _h = \varepsilon ^2 w_h^{n+1}\) in (24) and adding them up,

$$\begin{aligned} \begin{aligned} 0&= \left( \frac{3u_h^{n+1}-4u_h^n+u_h^{n-1}}{2\tau },u_h^{n+1}-u_h^n \right) +A\tau \left( \nabla \left( w_h^{n+1}-w_h^n\right) , \nabla \left( u_h^{n+1}-u_h^n\right) \right) \\&\quad +\left( w_h^{n+1}-w_h^n,\varepsilon ^2 w_h^{n+1}\right) -\left( \frac{\nabla \left( 2u_h^n-u_h^{n-1}\right) }{1 +\left| \nabla \left( 2u_h^n-u_h^{n-1}\right) \right| ^2},\nabla \left( u_h^{n+1}-u_h^n\right) \right) . \end{aligned} \end{aligned}$$
(25)

Below we estimate the four terms on the right side of (25), successively. As for the first term, we apply the Cauchy–Schwarz inequality:

$$\begin{aligned}&\left( \frac{3u_h^{n+1}-4u_h^n+u_h^{n-1}}{2\tau },u_h^{n+1}-u_h^n \right) \ge \frac{1}{\tau }\left( \frac{5}{4} \left\| u_h^{n+1}-u_h^n\right\| ^2-\frac{1}{4} \left\| u_h^n-u_h^{n-1}\right\| ^2\right) . \end{aligned}$$
(26)
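The constants in (26) can be checked by a short computation. Write \(d^{n+1}=u_h^{n+1}-u_h^n\) and \(d^n=u_h^n-u_h^{n-1}\) (shorthand introduced only for this check), so that \(3u_h^{n+1}-4u_h^n+u_h^{n-1}=3d^{n+1}-d^n\). Then

$$\begin{aligned} \left( \frac{3d^{n+1}-d^n}{2\tau },d^{n+1}\right) = \frac{1}{\tau }\left( \frac{5}{4}\left\| d^{n+1}\right\| ^2-\frac{1}{4}\left\| d^n\right\| ^2\right) +\frac{1}{4\tau }\left\| d^{n+1}-d^n\right\| ^2, \end{aligned}$$

as can be verified by expanding the last square. Dropping this nonnegative term gives (26).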

Taking \(\psi _h = w_h^{n+1}-w_h^n\) in (24) yields

$$\begin{aligned}&A\tau \left( \nabla \left( w_h^{n+1}-w_h^n\right) , \nabla \left( u_h^{n+1}-u_h^n\right) \right) =A\tau \left\| w_h^{n+1}-w_h^n\right\| ^2. \end{aligned}$$
(27)

The third term can be directly computed as

$$\begin{aligned} \begin{aligned} \left( w_h^{n+1}-w_h^n,\varepsilon ^2 w_h^{n+1}\right)&=\frac{\varepsilon ^2}{2} \left( \left\| w_h^{n+1}\right\| ^2- \left\| w_h^n\right\| ^2+\left\| w_h^{n+1}-w_h^n\right\| ^2\right) \\&\ge \frac{\varepsilon ^2}{2} \left( \left\| w_h^{n+1}\right\| ^2 -\left\| w_h^n\right\| ^2\right) . \end{aligned} \end{aligned}$$
(28)

As for the fourth term \((\mathrm{I})\equiv -\left( \frac{\nabla (2u_h^n-u_h^{n-1})}{1+|\nabla (2u_h^n-u_h^{n-1})|^2},\nabla (u_h^{n+1}-u_h^n)\right) \), we notice that \(\ln (1+x)\le x\) for \(x>-1\), so that

$$\begin{aligned} -\ln \left( 1+\left| \nabla u_h^{n+1}\right| ^2\right) +\ln \left( 1+|\nabla u_h^n|^2\right) =\ln \left( \frac{1+\left| \nabla u_h^n\right| ^2}{1+\left| \nabla u_h^{n+1}\right| ^2}\right) \le \frac{\left| \nabla u_h^n\right| ^2 -\left| \nabla u_h^{n+1}\right| ^2}{1+ \left| \nabla u_h^{n+1}\right| ^2}. \end{aligned}$$

Therefore, recalling that \(E^{ES}(u_h)= \int _{\varOmega } -\frac{1}{2} \ln (1+|\nabla u_h|^2) ~\text{ d }{} \mathbf x \), we get

$$\begin{aligned} E^{ES}\left( u_h^{n+1}\right) -E^{ES}\left( u_h^n\right)&= \int _{\varOmega } \left( -\frac{1}{2} \ln \left( 1+\left| \nabla u_h^{n+1}\right| ^2\right) +\frac{1}{2} \ln \left( 1+\left| \nabla u_h^n\right| ^2\right) \right) \,\text{ d }{} \mathbf x \\&\le \frac{1}{2}\int _{\varOmega }\frac{\left| \nabla u_h^n\right| ^2-\left| \nabla u_h^{n+1}\right| ^2}{1+\left| \nabla u_h^{n+1}\right| ^2}\,\text{ d }{} \mathbf x \\&= - \frac{1}{2} \int _\varOmega \frac{\nabla \left( u_h^{n+1}-u_h^n\right) \cdot \nabla \left( u_h^{n+1}+u_h^n\right) }{1+\left| \nabla u_h^{n+1}\right| ^2} \,\text{ d }{} \mathbf x . \end{aligned}$$

As a consequence, the following inequality is valid:

$$\begin{aligned} \begin{aligned}&(\mathrm{I})-\left[ E^{ES}\left( u_h^{n+1}\right) -E^{ES} \left( u_h^n\right) \right] \\&\quad \ge -\left( \frac{\nabla \left( 2u_h^n-u_h^{n-1}\right) }{1+\left| \nabla \left( 2u_h^n-u_h^{n-1}\right) \right| ^2},\nabla \left( u_h^{n+1}-u_h^n\right) \right) \\&\qquad + \frac{1}{2} \int _\varOmega \frac{\nabla \left( u_h^{n+1}-u_h^n\right) \cdot \nabla \left( u_h^{n+1}+u_h^n\right) }{1+\left| \nabla u_h^{n+1}\right| ^2} \,\text{ d }{} \mathbf x \\&\quad \ge \int _{\varOmega } \Big (\frac{-\nabla \Big (u_h^{n+1}-u_h^n\Big )\cdot \nabla \Big (2u_h^n-u_h^{n-1}\Big )}{1+\Big |\nabla \Big (2u_h^n-u_h^{n-1}\Big )\Big |^2}\\&\qquad \qquad +\frac{\frac{1}{2}\nabla \Big (u_h^{n+1}-u_h^n\Big )\cdot \nabla \Big (u_h^{n+1}+u_h^n\Big )}{1+\Big |\nabla u_h^{n+1}\Big |^2}\Big )\,\text{ d }{} \mathbf x \equiv \int _{\varOmega } (\mathrm{II}) \,\text{ d }{} \mathbf x . \end{aligned} \end{aligned}$$
(29)

Next we separate the term \((\mathrm{II})\) into two parts and estimate them respectively:

$$\begin{aligned} ({\mathrm{II}})&=\Big (\frac{-\nabla \Big (u_h^{n+1}-u_h^n\Big )\cdot \nabla \Big (2u_h^n-u_h^{n-1}\Big )}{1+\left| \nabla \Big (2u_h^n-u_h^{n-1}\Big )\right| ^2} -\frac{-\nabla \Big (u_h^{n+1}-u_h^n\Big )\cdot \nabla \Big (2u_h^n-u_h^{n-1}\Big )}{1+\Big |\nabla u_h^{n+1}\Big |^2}\Big )\\&\quad \; +\Big (\frac{-\nabla \Big (u_h^{n+1}-u_h^n\Big )\cdot \nabla \Big (2u_h^n-u_h^{n-1}\Big )}{1+\Big |\nabla u_h^{n+1}\Big |^2}+\frac{\frac{1}{2}\nabla \Big (u_h^{n+1}-u_h^n\Big )\cdot \nabla \Big (u_h^{n+1}+u_h^n\Big )}{1+\Big |\nabla u_h^{n+1}\Big |^2}\Big ) \\&\equiv \mathrm{g_1}+\mathrm{g_2}. \end{aligned}$$

As for the first part,

$$\begin{aligned} \mathrm{g_1}&\ge -\left| \nabla \left( u_h^{n+1}-u_h^n\right) \right| \left| \nabla \left( u_h^{n+1}-2u_h^n+u_h^{n-1}\right) \right| \; \\&\quad \times \frac{\left| \nabla \left( 2u_h^n-u_h^{n-1}\right) \right| \left| \nabla \left( u_h^{n+1}+2u_h^n-u_h^{n-1}\right) \right| }{\left( 1+\left| \nabla \left( 2u_h^n-u_h^{n-1}\right) \right| ^2\right) \left( 1+\left| \nabla u_h^{n+1}\right| ^2\right) } \\&\ge -\left| \nabla \left( u_h^{n+1}-u_h^n\right) \right| \left| \nabla \left( u_h^{n+1}-2u_h^n+u_h^{n-1}\right) \right| \;\\&\quad \times \frac{\left| \nabla \left( 2u_h^n-u_h^{n-1}\right) \right| \left( \left| \nabla u_h^{n+1}\right| +\left| \nabla \left( 2u_h^n-u_h^{n-1}\right) \right| \right) }{\left( 1+\left| \nabla \left( 2u_h^n-u_h^{n-1}\right) \right| ^2\right) \left( 1+\left| \nabla u_h^{n+1}\right| ^2\right) }. \end{aligned}$$

For any real number \(a, b\ge 0\), we have \(a(a+b)< (1+a^2)(1+b^2)\). Applying this property to the inequality above, we obtain

$$\begin{aligned} \begin{aligned} \mathrm{g_1}&\ge -\left| \nabla \left( u_h^{n+1}-u_h^n\right) \right| \left| \nabla \left( u_h^{n+1}-2u_h^n+u_h^{n-1}\right) \right| \\&\ge -\left| \nabla \left( u_h^{n+1}-u_h^n\right) \right| ^2 -\left| \nabla \left( u_h^{n+1}-u_h^n\right) \right| \left| \nabla \left( u_h^n-u_h^{n-1}\right) \right| \\&\ge -\frac{3}{2}\left| \nabla \left( u_h^{n+1}-u_h^n\right) \right| ^2-\frac{1}{2}\left| \nabla \left( u_h^n-u_h^{n-1}\right) \right| ^2. \end{aligned} \end{aligned}$$
(30)
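The elementary inequality \(a(a+b)<(1+a^2)(1+b^2)\) used above, applied with \(a=|\nabla (2u_h^n-u_h^{n-1})|\) and \(b=|\nabla u_h^{n+1}|\), can be verified by completing the square:

$$\begin{aligned} (1+a^2)(1+b^2)-a(a+b) = 1-ab+a^2b^2+b^2 = \left( 1-\frac{ab}{2}\right) ^2+\frac{3}{4}a^2b^2+b^2 > 0. \end{aligned}$$

The right-hand side is strictly positive because the first two squares cannot vanish simultaneously: if \(ab=0\), the first square equals 1.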

Using the Cauchy–Schwarz inequality on \(\mathrm{g_2}\) yields

$$\begin{aligned} \begin{aligned} \mathrm{g_2}&= \frac{1}{1+\left| \nabla u_h^{n+1}\right| ^2} \left[ \frac{1}{2}\left| \nabla \left( u_h^{n+1}-u_h^n\right) \right| ^2-\nabla \left( u_h^{n+1}-u_h^n\right) \cdot \nabla \left( u_h^n-u_h^{n-1}\right) \right] \\&\ge \frac{1}{1+\left| \nabla u_h^{n+1}\right| ^2} \left( -\frac{1}{2}\left| \nabla \left( u_h^n-u_h^{n-1}\right) \right| ^2\right) \\&\ge -\frac{1}{2}\left| \nabla \left( u_h^n-u_h^{n-1}\right) \right| ^2. \end{aligned} \end{aligned}$$
(31)

Substituting (30) and (31) into \(({\mathrm{II}})\), we have

$$\begin{aligned} ({\mathrm{II}}) \ge -\frac{3}{2}\left| \nabla \left( u_h^{n+1}-u_h^n\right) \right| ^2-\left| \nabla \left( u_h^n-u_h^{n-1}\right) \right| ^2. \end{aligned}$$
(32)

Therefore, (29) could be rewritten as

$$\begin{aligned} ({\mathrm{I}})-\left[ E^{ES}\left( u_h^{n+1}\right) -E^{ES}\left( u_h^n\right) \right] \ge -\frac{3}{2}\left\| \nabla \left( u_h^{n+1}-u_h^n\right) \right\| ^2 -\left\| \nabla \left( u_h^n-u_h^{n-1}\right) \right\| ^2. \end{aligned}$$
(33)

Substituting the estimates (26)–(28) and (33) for the four terms into (25), we obtain

$$\begin{aligned} \begin{aligned} 0&\ge \frac{1}{4\tau } \left( \left\| u_h^{n+1}-u_h^n\right\| ^2 -\left\| u_h^n-u_h^{n-1}\right\| ^2\right) +\frac{1}{\tau }\left\| u_h^{n+1}-u_h^n\right\| ^2+A\tau \left\| w_h^{n+1}-w_h^n\right\| ^2 \\&\quad +\frac{\varepsilon ^2}{2} \left( \left\| w_h^{n+1}\right\| ^2 -\left\| w_h^n\right\| ^2\right) +E^{ES}\left( u_h^{n+1}\right) -E^{ES}\left( u_h^n\right) -\frac{3}{2} \left\| \nabla \left( u_h^{n+1}-u_h^n\right) \right\| ^2 \\&\quad -\left\| \nabla \left( u_h^n-u_h^{n-1}\right) \right\| ^2. \end{aligned} \end{aligned}$$
(34)

Meanwhile, a combination of Young’s inequality, the Cauchy–Schwarz inequality and (24) (with \(\psi _h=u_h^{n+1}-u_h^n\)) implies that

$$\begin{aligned}&\frac{1}{\tau }\left\| u_h^{n+1}-u_h^n\right\| ^2 +A\tau \left\| w_h^{n+1}-w_h^n\right\| ^2 \\&\quad \ge 2A^{\frac{1}{2}} \left\| u_h^{n+1}-u_h^n\right\| \left\| w_h^{n+1}-w_h^n\right\| \\&\quad \ge 2A^{\frac{1}{2}}\left\| \nabla \left( u_h^{n+1}-u_h^n\right) \right\| ^2 \ge \frac{5}{2}\left\| \nabla \left( u_h^{n+1}-u_h^n\right) \right\| ^2, \end{aligned}$$

where we have used the assumption that \(A\ge \frac{25}{16}\). Therefore, we can rewrite the inequality (34) as

$$\begin{aligned} 0&\ge \frac{1}{4\tau }\left( \left\| u_h^{n+1}-u_h^n\right\| ^2 -\left\| u_h^n-u_h^{n-1}\right\| ^2\right) \\&\quad +\left( \left\| \nabla \left( u_h^{n+1}-u_h^n\right) \right\| ^2-\left\| \nabla \left( u_h^n-u_h^{n-1}\right) \right\| ^2\right) \\&\quad +\frac{\varepsilon ^2}{2} \left( \left\| w_h^{n+1}\right\| ^2-\left\| w_h^n\right\| ^2\right) +E^{ES}\left( u_h^{n+1}\right) -E^{ES}\left( u_h^n\right) \\&= \tilde{E}\left( u_h^{n+1},u_h^n,w_h^{n+1}\right) -\tilde{E}\left( u_h^n,u_h^{n-1},w_h^n\right) . \end{aligned}$$

which is the conclusion we need. \(\square \)

Remark 1

For the no-slope-selection model (4), there have been other second order numerical schemes, in which the energy stability is defined over an “alternate” energy functional; see the related works of [21, 31], etc. In these numerical approaches, the energy functional is numerically defined, and the nonlinear energy density is based on an alternate numerical variable. In turn, such an energy stability does not justify an \(H^2\) bound for the numerical solution, at a theoretical level. In comparison, the energy stability analysis derived in this section is based on the original phase variable u, so that the desired bound is available for our proposed numerical scheme. This subtle property will greatly facilitate the convergence analysis in later sections.

Remark 2

The energy stability analysis is established in the framework of Galerkin approximation, which comes from the finite element spatial approximation. On the other hand, the nonlinear integrals can hardly be computed exactly in practice, because the integrand is a rational function of the gradient. Instead, a numerical approximation to these nonlinear integrals, which corresponds to the collocation approach, has to be taken into consideration.

In the case of a uniform spatial mesh, the Fourier collocation spectral scheme has been analysed in an existing work [3], and unconditional energy stability has been proved for the first order linear splitting scheme. An extension to the second order accurate linear iteration algorithm has been reported in [5]. For the second order linear scheme proposed in this article, the energy stability analysis for the collocation approximation is expected to be available, and the details are left to future work.

2.3 The \(\ell ^\infty (0,T; L^2) \cap \ell ^2 (0,T; H_h^2)\) Convergence Analysis

We present the convergence analysis in this section. First of all, referring to [2], the following estimate holds for the Ritz projection \(R_h\) : \(\forall ~\varphi \in H^{q+1}(\varOmega )\cap X\),

$$\begin{aligned} \Vert \varphi -R_h\varphi \Vert +h\Vert \nabla (\varphi -R_h\varphi )\Vert \le Ch^{q+1}\Vert \varphi \Vert _{q+1}. \end{aligned}$$
(35)

The following discrete Gronwall inequality [24] is needed in the error analysis.

Lemma 1

Assume that \(\tau>0\), \(B>0\), and that \(\{a_n\}\), \(\{b_n\}\), \(\{\gamma _n\}\) are non-negative sequences such that

$$\begin{aligned} a_m+\tau \sum ^m_{n=1} b_n\le \tau \sum ^{m-1}_{n=1}\gamma _na_n+B, ~m\ge 1. \end{aligned}$$

Then,

$$\begin{aligned} a_m+\tau \sum ^m_{n=1} b_n\le B\exp \left( \tau \sum ^{m-1}_{n=1}\gamma _n\right) , ~m\ge 1. \end{aligned}$$
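As a quick numerical sanity check of Lemma 1, the following sketch builds a sequence that saturates the hypothesis (equality instead of inequality) and compares the left-hand side with the exponential bound. The helper name `gronwall_check` and the sample values of \(\tau\), B, \(\gamma _n\) and \(b_n\) are illustrative assumptions and are not part of the analysis.

```python
import math

def gronwall_check(tau, B, gamma, b):
    """Return (lhs, bound) for m = 1..N, where the sequence a_m saturates the
    hypothesis of the discrete Gronwall inequality (hypothetical helper)."""
    N = len(gamma)
    a, lhs, bound = [], [], []
    for m in range(1, N + 1):
        # gamma[n - 1], b[n - 1], a[n - 1] store gamma_n, b_n, a_n.
        rhs = B + tau * sum(gamma[n] * a[n] for n in range(m - 1))
        sum_b = tau * sum(b[:m])
        a_m = rhs - sum_b  # equality in the hypothesis
        if a_m < 0:
            raise ValueError("sample data must keep a_m non-negative")
        a.append(a_m)
        lhs.append(a_m + sum_b)
        bound.append(B * math.exp(tau * sum(gamma[:m - 1])))
    return lhs, bound

lhs, bound = gronwall_check(tau=0.01, B=1.0, gamma=[2.0] * 100, b=[0.1] * 100)
# Allow a tiny tolerance for floating-point rounding.
print(all(l <= u + 1e-12 for l, u in zip(lhs, bound)))
```

Even in this worst case the left-hand side grows like \(B\prod _n(1+\tau \gamma _n)\), which stays below \(B\exp (\tau \sum _n\gamma _n)\).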

Besides, we will also use the inverse estimate in [2, p.111, Lemma 4.5.3].

Lemma 2

Given a quasi-uniform triangulation \({\mathcal {T}}_h\,(h\le 1)\) on domain \(\varOmega \subset {\mathbb {R}}^n\) and the related finite dimensional function subspace \(X_h\subset W^{l,p}\cap W^{m,q}\) with \(1\le p\le \infty , 1\le q \le \infty \) and \(0\le m\le l\), there exists a constant \(\tilde{C}\) such that for any \(\chi \in X_h\), \(K\in {\mathcal {T}}_h\), we have

$$\begin{aligned} \Vert \chi \Vert _{W^{l,p}(K)} \le \tilde{C}h^{m-l+n/p-n/q}\Vert \chi \Vert _{W^{m,q}(K)}, \end{aligned}$$

where \(\tilde{C}\) is independent of h and \(\chi \).

The following lemma is also needed in the analysis.

Lemma 3

Given functions \(\varphi _1\in X\), \(\varphi _2\in X\) and \(v\in X\), let us define function \(\varPhi : [0,1]\rightarrow {\mathbb {R}}\)

$$\begin{aligned} \varPhi _{\varphi _1,\varphi _2, v} (s)=\left( \frac{\nabla (\varphi _1+s(\varphi _2-\varphi _1))}{1+ |\nabla (\varphi _1+s(\varphi _2-\varphi _1))|^2},\nabla v\right) , \end{aligned}$$
(36)

then we have

$$\begin{aligned} |\varPhi _{\varphi _1,\varphi _2, v} (1)-\varPhi _{\varphi _1,\varphi _2, v} (0)| \le \Vert \nabla (\varphi _2-\varphi _1)\Vert \Vert \nabla v\Vert . \end{aligned}$$

Proof

For brevity, write \(\mathbf g_s \equiv \nabla (\varphi _1+s(\varphi _2-\varphi _1))\). By the chain rule, the derivative of \(\varPhi \) with respect to s is

$$\begin{aligned} \varPhi _{\varphi _1,\varphi _2, v}' (s) = \left( \frac{\nabla (\varphi _2-\varphi _1)}{1+|\mathbf g_s|^2},\nabla v\right) -\left( \frac{2\left( \mathbf g_s\cdot \nabla (\varphi _2-\varphi _1)\right) \mathbf g_s}{(1+|\mathbf g_s|^2)^2},\nabla v\right) . \end{aligned}$$

At each point, the symmetric matrix \(\frac{1}{1+|\mathbf g_s|^2}\mathbf I-\frac{2\mathbf g_s\mathbf g_s^T}{(1+|\mathbf g_s|^2)^2}\) acting on \(\nabla (\varphi _2-\varphi _1)\) has the eigenvalues \(\frac{1}{1+|\mathbf g_s|^2}\) and \(\frac{1-|\mathbf g_s|^2}{(1+|\mathbf g_s|^2)^2}\), both of which are bounded by 1 in absolute value. Thus, by the Cauchy–Schwarz inequality, we have the estimate

$$\begin{aligned} |\varPhi _{\varphi _1,\varphi _2, v}' (s)| \le \Vert \nabla (\varphi _2-\varphi _1)\Vert \Vert \nabla v\Vert . \end{aligned}$$

Applying the above estimate, we have

$$\begin{aligned} |\varPhi _{\varphi _1,\varphi _2, v} (1)-\varPhi _{\varphi _1,\varphi _2, v} (0)|&\le \int _0^1 |\varPhi _{\varphi _1,\varphi _2, v}' (s)|~\text{ d }s \le \Vert \nabla (\varphi _2-\varphi _1)\Vert \Vert \nabla v\Vert , \end{aligned}$$

which is the conclusion we need. \(\square \)
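Lemma 3 rests on the pointwise fact that the map \(\mathbf g\mapsto \mathbf g/(1+|\mathbf g|^2)\) on \({\mathbb {R}}^2\) is Lipschitz continuous with constant 1. The sketch below probes this numerically on random pairs of vectors; the function names, the sampling box and the number of samples are illustrative assumptions, and a random test of this kind can only support the bound, not prove it.

```python
import math
import random

def flux(g):
    """Map g -> g / (1 + |g|^2) for a 2-vector g."""
    gx, gy = g
    d = 1.0 + gx * gx + gy * gy
    return (gx / d, gy / d)

def max_lipschitz_ratio(num_samples=10000, scale=5.0, seed=0):
    """Largest observed |flux(p) - flux(q)| / |p - q| over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(num_samples):
        p = (rng.uniform(-scale, scale), rng.uniform(-scale, scale))
        q = (rng.uniform(-scale, scale), rng.uniform(-scale, scale))
        dist = math.hypot(p[0] - q[0], p[1] - q[1])
        if dist < 1e-6:
            continue  # skip nearly coincident pairs (round-off dominated)
        fp, fq = flux(p), flux(q)
        worst = max(worst, math.hypot(fp[0] - fq[0], fp[1] - fq[1]) / dist)
    return worst

print(max_lipschitz_ratio())
```

The observed ratio should not exceed 1, in agreement with the bound \(\Vert \nabla (\varphi _2-\varphi _1)\Vert \Vert \nabla v\Vert \) of the lemma.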

We denote by \((u, w)\) the exact solution pair to the original equation (4), and all the upper bounds for the exact solution are denoted as \(C_0\). We say that the solution pair is in the regularity class \({\mathcal {C}}\) if and only if

$$\begin{aligned} \begin{aligned}&u \in ~L^\infty \left( 0,T;H^{q+1}\right) \cap H^1\left( 0,T; H^{q+1}\right) \cap H^2(0,T;H^1)\\&\qquad \cap W^{1,\infty }\left( 0,T;H^2\right) \cap W^{2,\infty }\left( 0,T; L^2\right) \cap H^3\left( 0,T;L^2\right) , \\&w \in ~L^\infty \left( 0,T;H^{q+1}\right) \cap H^1\left( 0,T; H^1\right) . \end{aligned} \end{aligned}$$
(37)

The following theorem is the main result of this section.

Theorem 3

Suppose that the exact solution pair \((u, w)\) is in the regularity class \({\mathcal {C}}\), for a fixed final time \(T>0\). Denote \(u(t_m)\) by \(u^m\) and let \(u^m_h\) be the solution to the fully discrete numerical scheme (17)–(20) at time \(t_m=m\tau \), for \(1\le m\le N\) with \(N\tau = T\). Assume that

$$\begin{aligned} 0<\tau< \frac{\varepsilon ^2}{2\alpha ^2},\quad 0<h\le 1, \end{aligned}$$

where \(\alpha >3\) is a constant, then we have the following error estimate

$$\begin{aligned} \left\| u^m-u^m_h\right\| +\left( \tau \sum _{n=1}^m \left\| w^n-w^n_h\right\| ^2\right) ^{\frac{1}{2}} \le C_{\varepsilon ,T} \left( h^q+\tau ^2\right) . \end{aligned}$$
(38)

Proof

The error functions are defined as

$$\begin{aligned} \begin{aligned} e_u^n&\equiv \rho _u^n+\sigma _u^n\equiv \left( u^n-R_h u^n\right) +\left( R_h u^n-u_h^n\right) =u^n-u_h^n,\\ e_w^n&\equiv \rho _w^n+\sigma _w^n\equiv \left( w^n-R_h w^n\right) +\left( R_h w^n-w_h^n\right) =w^n-w_h^n. \end{aligned} \end{aligned}$$
(39)

Subtracting the numerical scheme formulation (19)–(20) from the weak form (12)–(13), we obtain the following error equations:

$$\begin{aligned}&\left( \delta _\tau ^{n+1} e_u,v_h\right) +\varepsilon ^2\left( \nabla e_w^{n+1},\nabla v_h\right) +A\tau \left( \nabla \left( e_w^{n+1}-e_w^n\right) ,\nabla v_h\right) \\&\quad =\left( {{\mathcal {R}}}_1^{n+1},v_h \right) +A\tau \left( {\mathcal {R}}_2^{n+1},\nabla v_h\right) +\left( {\mathcal {N}}^{n+1},\nabla v_h\right) , ~\forall ~v_h\in X_h,\\&\left( e_w^{n+1},\psi _h\right) -\left( \nabla e_u^{n+1},\nabla \psi _h\right) =0, ~\forall ~\psi _h\in X_h, \end{aligned}$$

for any \(n \ge 1\), where

$$\begin{aligned}&\delta _\tau ^{n+1} v =\frac{3v^{n+1}-4v^n+v^{n-1}}{2\tau },~{\mathcal {R}}_1^{n+1} =\delta _\tau ^{n+1} u-\partial _t u^{n+1}, \\&{\mathcal {R}}_2^{n+1}=\nabla \left( w^{n+1}-w^n\right) , ~{\mathcal {N}}^{n+1}=\frac{\nabla u^{n+1}}{1+|\nabla u^{n+1}|^2}-\frac{\nabla \left( 2u_h^n-u_h^{n-1}\right) }{1+\left| \nabla \left( 2u_h^n-u_h^{n-1}\right) \right| ^2}. \end{aligned}$$

Notice that the definition (16) of \(R_h\) indicates that: \((\nabla \rho _u^{n+1},\nabla \chi )=0\), \(\forall ~\chi \in X_h\). Thus the error equations can be rewritten as: For \(n\ge 1\), and for any \(v_h\in X_h\), \(\psi _h\in X_h\),

$$\begin{aligned} \begin{aligned}&\left( \delta _\tau ^{n+1}\sigma _u,v_h\right) +\varepsilon ^2\left( \nabla \sigma _w^{n+1},\nabla v_h\right) +A\tau \left( \nabla \left( \sigma _w^{n+1}-\sigma _w^n\right) ,\nabla v_h\right) \\&\quad = -\left( \delta _\tau ^{n+1}\rho _u,v_h\right) +\left( {\mathcal {R}}_1^{n+1},v_h \right) +A\tau \left( {\mathcal {R}}_2^{n+1},\nabla v_h\right) +\left( {\mathcal {N}}^{n+1},\nabla v_h\right) , \end{aligned} \end{aligned}$$
(40)

and

$$\begin{aligned} \left( \sigma _w^{n+1},\psi _h\right) -\left( \nabla \sigma _u^{n+1},\nabla \psi _h\right) =-\left( \rho _w^{n+1},\psi _h\right) . \end{aligned}$$
(41)

With a slight modification, we obtain the error equations for the initialization step: For \(n=0\), for any \(v_h\in X_h\), \(\psi _h\in X_h\),

$$\begin{aligned}&\left( \delta _\tau ^1 \sigma _u,v_h\right) + \varepsilon ^2\left( \nabla \sigma _w^1,\nabla v_h\right) + A^{(0)} \left( \nabla \left( \sigma _u^1 - \sigma _u^0\right) , \nabla v_h\right) \\&\qquad = - \left( \delta _\tau ^1 \rho _u,v_h\right) +\left( {\mathcal {R}}_1^1,v_h \right) + A^{(0)} \left( {\mathcal {R}}_2^1, \nabla v_h\right) + \left( {\mathcal {N}}^1,\nabla v_h \right) , \end{aligned}$$
(42)
$$\begin{aligned}&\left( \sigma _w^1,\psi _h\right) -\left( \nabla \sigma _u^1,\nabla \psi _h\right) = -\left( \rho _w^1,\psi _h\right) , \end{aligned}$$
(43)

where

$$\begin{aligned} \delta _\tau ^1 v =\frac{v^1-v^0}{\tau },~{\mathcal {R}}_1^1 =\delta _\tau ^1 u-\partial _t u^1,~{\mathcal {R}}_2^1=\nabla \left( u^1-u^0\right) ,~{\mathcal {N}}^1 = \frac{\nabla u^1}{1+|\nabla u^1|^2}-\frac{\nabla u_h^0}{1+|\nabla u_h^0|^2}. \end{aligned}$$

Now we focus on the case when \(n\ge 1\). Taking \(v_h=\sigma _u^{n+1}\) in (40) and \(\psi _h=\varepsilon ^2\sigma _w^{n+1}\) in (41), and adding up the two equations, leads to

$$\begin{aligned} \begin{aligned}&\left( \delta _{\tau }^{n+1}\sigma _{u},\sigma _u^{n+1} \right) +\varepsilon ^2\left\| \sigma _w^{n+1}\right\| ^2+A\tau \left( \sigma _w^{n+1}-\sigma _w^n,\sigma _w^{n+1}\right) \\&\quad = -A\tau \left( \sigma _w^{n+1}-\sigma _w^n,\rho _w^{n+1}\right) -\varepsilon ^2 \left( \rho _w^{n+1},\sigma _w^{n+1}\right) -\left( \delta _{\tau }^{n+1}\rho _{u},\sigma _u^{n+1}\right) +\left( {\mathcal {R}}_1^{n+1},\sigma _u^{n+1}\right) \\&\qquad + A\tau \left( {\mathcal {R}}_2^{n+1},\nabla \sigma _u^{n+1}\right) +\left( {\mathcal {N}}^{n+1},\nabla \sigma _u^{n+1}\right) , \end{aligned} \end{aligned}$$
(44)

where we have used the transformation

$$\begin{aligned} A\tau \left( \nabla \left( \sigma _w^{n+1}-\sigma _w^n\right) ,\nabla \sigma _u^{n+1}\right)&=A\tau \left( \nabla \left( \sigma _w^{n+1}-\sigma _w^n\right) ,\nabla e_u^{n+1}\right) \\&=A\tau \left( \sigma _w^{n+1}-\sigma _w^n,e_w^{n+1}\right) \\&=A\tau \left( \sigma _w^{n+1}-\sigma _w^n,\sigma _w^{n+1}\right) +A\tau \left( \sigma _w^{n+1}-\sigma _w^n,\rho _w^{n+1}\right) . \end{aligned}$$

First we focus on the terms on the left-hand side. In order to estimate the first term, we recall the G-norm introduced in [4]. Let \(\mathbf p ^{k+1}\equiv [\sigma _u^k,\sigma _u^{k+1}]^T\), and

$$\begin{aligned} \left\| \mathbf p ^{k+1}\right\| _{\mathbf G}^2\equiv \left( \mathbf p ^{k+1}, \mathbf G \mathbf p ^{k+1}\right) ,\quad \mathbf G = \begin{pmatrix} \frac{1}{2} &{} -1 \\ -1 &{} \frac{5}{2} \end{pmatrix} . \end{aligned}$$
(45)

Applying this notation to the first term, we have

$$\begin{aligned} \left( \delta _{\tau }^{n+1}\sigma _u,\sigma _u^{n+1} \right) =\frac{1}{2\tau } \left( \left\| \mathbf p ^{n+1}\right\| _\mathbf G ^2 -\left\| \mathbf p ^n\right\| _\mathbf G ^2\right) +\frac{1}{4\tau }\left\| \sigma _u^{n+1}-2\sigma _u^n+\sigma _u^{n-1}\right\| ^2. \end{aligned}$$
(46)

The third term can be represented as

$$\begin{aligned} A\tau \left( \sigma _w^{n+1}-\sigma _w^n,\sigma _w^{n+1}\right) = \frac{A\tau }{2} \left( \left\| \sigma _w^{n+1}\right\| ^2 -\left\| \sigma _w^n\right\| ^2+ \left\| \sigma _w^{n+1} -\sigma _w^n\right\| ^2\right) . \end{aligned}$$
(47)
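Both (46) and (47) are purely algebraic identities, so they can be checked on scalar data, for which the \(L^2\) inner product reduces to a product of numbers. The following is a minimal sketch under that simplification; the function names and the sample values are hypothetical.

```python
def g_norm_sq(p_old, p_new):
    """G-norm squared of p = [p_old, p_new]^T with G = [[1/2, -1], [-1, 5/2]]."""
    return 0.5 * p_old**2 - 2.0 * p_old * p_new + 2.5 * p_new**2

def bdf2_identity_gap(s_new, s_mid, s_old, tau):
    """Left-hand side minus right-hand side of (46) for scalar data."""
    lhs = (3.0 * s_new - 4.0 * s_mid + s_old) / (2.0 * tau) * s_new
    rhs = ((g_norm_sq(s_mid, s_new) - g_norm_sq(s_old, s_mid)) / (2.0 * tau)
           + (s_new - 2.0 * s_mid + s_old) ** 2 / (4.0 * tau))
    return lhs - rhs

def polarization_gap(w_new, w_old):
    """Left-hand side minus right-hand side of (47), without the factor A*tau."""
    return (w_new - w_old) * w_new - 0.5 * (
        w_new**2 - w_old**2 + (w_new - w_old) ** 2)

# Both gaps should vanish up to floating-point rounding.
print(bdf2_identity_gap(1.3, -0.7, 2.1, 0.1), polarization_gap(0.4, -1.5))
```

Since the algebra only uses bilinearity and symmetry, the same computation carries over to the \(L^2\) inner product. The helper `g_norm_sq` also illustrates the lower bound \(\Vert \mathbf p\Vert _{\mathbf G}^2\ge \frac{1}{2}|p_{\text{new}}|^2\), because \(\frac{1}{2}a^2-2ab+\frac{5}{2}b^2=\frac{1}{2}(a-2b)^2+\frac{1}{2}b^2\).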

Now we estimate the terms on the right-hand side. As for the first two terms, applying the property of Ritz projection in (35) and the Cauchy–Schwarz inequality yields

$$\begin{aligned} \begin{aligned}&-A\tau \left( \sigma _w^{n+1}-\sigma _w^n,\rho _w^{n+1}\right) \le \frac{A\tau }{2}\left\| \sigma _w^{n+1}-\sigma _w^n\right\| ^2+C\tau h^{2(q+1)}\left\| w^{n+1}\right\| _{1+q}^2,\\&\quad -\varepsilon ^2\left( \rho _w^{n+1},\sigma _w^{n+1}\right) \le \frac{\varepsilon ^2}{2}\left\| \sigma _w^{n+1} \right\| ^2+C\varepsilon ^2h^{2(q+1)}\left\| w^{n+1}\right\| _{1+q}^2. \end{aligned} \end{aligned}$$
(48)

To analyse the third and fourth terms, we use Taylor expansion and the Cauchy–Schwarz inequality:

$$\begin{aligned} \begin{aligned} -\left( \delta _{\tau }^{n+1}\rho _u,\sigma _u^{n+1}\right)&=-\left( \frac{3\rho _u^{n+1}-4\rho _u^n+\rho _u^{n-1}}{2\tau }, \sigma _u^{n+1}\right) \\&\le \frac{C_1}{2}\left\| \sigma _u^{n+1}\right\| ^2+C\frac{h^{2(q+1)}}{\tau }\int _{t_{n-1}}^{t_{n+1}} \left\| \partial _t u\right\| _{q+1}^2~\text{ d }t,\\ \left( {\mathcal {R}}_1^{n+1},\sigma _u^{n+1} \right)&= \left( \frac{3u^{n+1}-4u^n+u^{n-1}}{2\tau }-\partial _t u^{n+1},\sigma _u^{n+1}\right) \\&\le \frac{C_1}{2}\left\| \sigma _u^{n+1}\right\| ^2 +C\tau ^3\int _{t_{n-1}}^{t_{n+1}} \left\| \partial _{ttt} u\right\| ^2~\text{ d }t. \end{aligned} \end{aligned}$$
(49)
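The \(\tau ^3\int \Vert \partial _{ttt}u\Vert ^2\) bound in (49) reflects the second order consistency of the backward differentiation operator \(\delta _\tau ^{n+1}\). The sketch below measures the pointwise truncation error \({\mathcal {R}}_1^{n+1}\) for a smooth scalar function and estimates the observed order by halving \(\tau \). The test function \(u(t)=e^t\), the evaluation time and the step sizes are assumptions chosen only for illustration.

```python
import math

def bdf2_truncation_error(u, du, t_new, tau):
    """|(3u(t_new) - 4u(t_new - tau) + u(t_new - 2 tau)) / (2 tau) - u'(t_new)|."""
    approx = (3.0 * u(t_new) - 4.0 * u(t_new - tau)
              + u(t_new - 2.0 * tau)) / (2.0 * tau)
    return abs(approx - du(t_new))

def observed_order(tau, t_new=1.0):
    """Observed order from halving tau, for the sample function u(t) = exp(t)."""
    e_coarse = bdf2_truncation_error(math.exp, math.exp, t_new, tau)
    e_fine = bdf2_truncation_error(math.exp, math.exp, t_new, tau / 2.0)
    return math.log2(e_coarse / e_fine)

print(observed_order(0.01))
```

An observed order close to 2 is consistent with the leading truncation term \(-\frac{\tau ^2}{3}\partial _{ttt}u\) obtained from the Taylor expansion.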

Applying Taylor expansion and the Cauchy–Schwarz inequality, for the fifth term we get

$$\begin{aligned} \begin{aligned} A\tau \left( {\mathcal {R}}_2^{n+1},\nabla \sigma _u^{n+1}\right)&= A\tau \left( \nabla \left( w^{n+1}-w^n\right) ,\nabla \sigma _u^{n+1}\right) \\&\le \frac{C_2}{6}\left\| \nabla \sigma _u^{n+1}\right\| ^2+C\tau ^3\int _{t_n}^{t_{n+1}}\left\| \nabla \partial _t w\right\| ^2 ~\text{ d }t. \end{aligned} \end{aligned}$$
(50)

Notice that

$$\begin{aligned} \left\| \nabla \sigma _u^{n+1}\right\| ^2 =\left( \nabla \sigma _u^{n+1},\nabla e_u^{n+1}\right) =\left( \sigma _u^{n+1},e_w^{n+1}\right) \le \left\| \sigma _u^{n+1}\right\| \left( \left\| \sigma _w^{n+1}\right\| +\left\| \rho _w^{n+1}\right\| \right) . \end{aligned}$$

Therefore, using again the Cauchy–Schwarz inequality, we have:

$$\begin{aligned}&\frac{C_2}{6}\left\| \nabla \sigma _u^{n+1}\right\| ^2 \nonumber \\&\quad \le \frac{C_2}{6} \left\| \sigma _u^{n+1}\right\| \left( \left\| \sigma _w^{n+1}\right\| +\left\| \rho _w^{n+1}\right\| \right) \nonumber \\&\quad \le \frac{C_2}{6}\left( \frac{1}{4C_3} \left\| \sigma _u^{n+1}\right\| ^2 + 2C_3\left\| \sigma _w^{n+1}\right\| ^2 + 2C_3Ch^{2(q+1)}\left\| w^{n+1}\right\| _{1+q}^2\right) , \end{aligned}$$
(51)

which leads to

$$\begin{aligned} \begin{aligned} A\tau \left( {\mathcal {R}}_2^{n+1},\nabla \sigma _u^{n+1}\right)&\le \frac{C_2}{24C_3}\left\| \sigma _u^{n+1}\right\| ^2 +\frac{C_2C_3}{3}\left\| \sigma _w^{n+1}\right\| ^2\\&\quad +Ch^{2(q+1)}\left\| w^{n+1}\right\| _{1+q}^2 +C\tau ^3\int _{t_n}^{t_{n+1}}\left\| \nabla \partial _t w\right\| ^2 ~\text{ d }t. \end{aligned} \end{aligned}$$
(52)

As for the nonlinear term, we recall the function \(\varPhi \) defined in (36) and Lemma 3, and arrive at

$$\begin{aligned} \begin{aligned}&\left( {\mathcal {N}}^{n+1},\nabla \sigma _u^{n+1}\right) \\&\quad = \varPhi _{\left( 2u_h^n-u_h^{n-1}\right) ,u^{n+1}, \sigma _u^{n+1}} (1)-\varPhi _{\left( 2u_h^n-u_h^{n-1}\right) ,u^{n+1}, \sigma _u^{n+1}} (0)\\&\quad \le \left\| \nabla \left( u^{n+1}-2u_h^n+u_h^{n-1}\right) \right\| \left\| \nabla \sigma _u^{n+1}\right\| \\&\quad \le \Big (\Big \Vert \nabla \Big (u^{n+1}-2u^n+u^{n-1}\Big )\Big \Vert {+}2\Big \Vert \nabla \sigma _u^n\Big \Vert {+}2 \Big \Vert \nabla \rho _u^n\Big \Vert {+}\Big \Vert \nabla \sigma _u^{n-1}\Big \Vert {+}\Big \Vert \nabla \rho _u^{n-1}\Big \Vert \Big )\Big \Vert \nabla \sigma _u^{n+1}\Big \Vert . \end{aligned} \end{aligned}$$

Now we use the Cauchy–Schwarz inequality and estimate \(\Vert \nabla \sigma _u^{n+1}\Vert ^2\) as in (51):

$$\begin{aligned} \begin{aligned} \left( {\mathcal {N}}^{n+1},\nabla \sigma _u^{n+1}\right)&\le \frac{5C_2}{24C_3}\left\| \sigma _u^{n+1}\right\| ^2 +\frac{27}{2C_2C_5}\left\| \sigma _u^n\right\| ^2 +\frac{27}{8C_2C_4}\left\| \sigma _u^{n-1}\right\| ^2\\&\quad +\frac{5C_2C_3}{6} \left\| \sigma _w^{n+1}\right\| ^2 +\frac{54C_5}{C_2}\left\| \sigma _w^n\right\| ^2 +\frac{27C_4}{2C_2}\left\| \sigma _w^{n-1}\right\| ^2\\&\quad +Ch^{2(q+1)}\left( \left\| w^{n+1}\right\| _{1+q}^2 +\left\| w^n\right\| _{1+q}^2+\left\| w^{n-1}\right\| _{1+q}^2\right) \\&\quad +C\tau ^3\int _{t_n}^{t_{n+1}}\left\| \nabla \partial _{tt} u\right\| ^2 ~\text{ d }t+Ch^{2q}\left( \left\| u^n\right\| _{1+q}^2 +\left\| u^{n-1}\right\| _{1+q}^2\right) . \end{aligned} \end{aligned}$$
(53)

Substituting estimates (46)–(53) into the error Eq. (44), we obtain

$$\begin{aligned} \begin{aligned}&\frac{1}{2\tau } \left( \left\| \mathbf p ^{n+1}\right\| _\mathbf G ^2 -\left\| \mathbf p ^n\right\| _\mathbf G ^2\right) +\frac{\varepsilon ^2}{2}\left\| \sigma _w^{n+1}\right\| ^2+\frac{A\tau }{2} \left( \left\| \sigma _w^{n+1}\right\| ^2-\left\| \sigma _w^n\right\| ^2\right) \\&\quad \le \left( C_1+\frac{C_2}{4C_3}\right) \left\| \sigma _u^{n+1}\right\| ^2+\frac{27}{2C_2C_5} \left\| \sigma _u^n\right\| ^2+\frac{27}{8C_2C_4}\left\| \sigma _u^{n-1}\right\| ^2\\&\qquad +\frac{7C_2C_3}{6}\left\| \sigma _w^{n+1} \right\| ^2+\frac{54C_5}{C_2}\left\| \sigma _w^n\right\| ^2 +\frac{27C_4}{2C_2}\left\| \sigma _w^{n-1}\right\| ^2\\&\qquad +C\left( \tau h^{2(q+1)}+h^{2(q+1)}\right) \Vert w\Vert _{L^\infty {(0,T;H^{q+1})}}^2 +C\frac{h^{2(q+1)}}{\tau }\int _{t_{n-1}}^{t_{n+1}} \left\| \partial _t u\right\| _{q+1}^2~\text{ d }t\\&\qquad +C\tau ^3\int _{t_{n-1}}^{t_{n+1}} \left( \left\| \partial _{ttt}u\right\| ^2 +\left\| \nabla \partial _{tt}u\right\| ^2+\left\| \nabla \partial _t w\right\| ^2\right) ~\text{ d }t+Ch^{2q} \Vert u\Vert _{L^\infty {(0,T;H^{1+q})}}^2. \end{aligned} \end{aligned}$$

Summing up from \(n=1\) to \(n=m\) and multiplying by \(2\tau \) on both sides, we get

$$\begin{aligned} \begin{aligned}&\left\| \mathbf p ^{m+1}\right\| _\mathbf G ^2 -\left\| \mathbf p ^1\right\| _\mathbf G ^2 +2\tau \varepsilon ^2\sum _{n=1}^m \left\| \sigma _w^{n+1}\right\| ^2 +A\tau ^2\left( \left\| \sigma _w^{m+1}\right\| ^2 -\left\| \sigma _w^1\right\| ^2\right) \\&\quad \le \left( 2C_1+\frac{C_2}{2C_3}\right) \tau \left\| \sigma _u^{m+1}\right\| ^2 +\left( 2C_1+\frac{C_2}{2C_3} +\frac{27}{C_2C_5}+\frac{27}{4C_2C_4}\right) \tau \sum _{n=1}^{m-1}\left\| \sigma _u^{n+1}\right\| ^2\\&\qquad +\left( \frac{7C_2C_3}{3}+\frac{108C_5}{C_2} +\frac{27C_4}{C_2}\right) \tau \sum _{n=1}^m \left\| \sigma _w^{n+1}\right\| ^2 +\left( \frac{27}{C_2C_5}+\frac{27}{4C_2C_4}\right) \tau \left\| \sigma _u^1\right\| ^2\\&\qquad +\left( \frac{108C_5}{C_2}+\frac{27C_4}{C_2}\right) \tau \left\| \sigma _w^1\right\| ^2+C_{\varepsilon ,T}\left( \tau h^{2(q+1)}+h^{2(q+1)}+h^{2q}+\tau ^4\right) . \end{aligned} \end{aligned}$$

It is easy to verify that \(\Vert \mathbf p ^{m+1}\Vert _\mathbf G ^2\ge \frac{1}{2}\Vert \sigma _u^{m+1}\Vert ^2\) and, since \(\sigma _u^0=0\), \(\Vert \mathbf p ^1\Vert _\mathbf G ^2=\frac{5}{2}\Vert \sigma _u^1\Vert ^2\). Taking \(C_1=\frac{C_2^2}{\varepsilon ^2}\), \(C_3=\frac{\varepsilon ^2}{4C_2}\), \(C_4 = \frac{C_2\varepsilon ^2}{108}\), \(C_5=\frac{C_2\varepsilon ^2}{432}\), we have

$$\begin{aligned} \begin{aligned}&\frac{\varepsilon ^2-8C_2^2\tau }{2\varepsilon ^2}\left\| \sigma _u^{m+1}\right\| ^2 +\frac{11\tau \varepsilon ^2}{12}\sum _{n=1}^m \left\| \sigma _w^{n+1}\right\| ^2\\&\quad \le C_{\varepsilon ,T}\left( h^{2q}+\tau ^4\right) +\left( 4C_1+\frac{27}{C_2C_5}+\frac{27}{4C_2C_4}\right) \tau \sum _{n=1}^{m-1}\left\| \sigma _u^{n+1}\right\| ^2\\&\qquad +\left( \frac{5}{2}+\frac{27\tau }{C_2C_5} +\frac{27\tau }{4C_2C_4}\right) \left\| \sigma _u^1\right\| ^2+\left( A\tau ^2+\frac{\tau \varepsilon ^2}{2} \right) \left\| \sigma _w^1\right\| ^2. \end{aligned} \end{aligned}$$
(54)
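The coefficients in (54) follow from the stated choices of \(C_1\), \(C_3\), \(C_4\), \(C_5\) by direct arithmetic. The following sketch reproduces that arithmetic exactly with rational numbers; the function name and the sample values of \(\varepsilon ^2\) and \(C_2\) are arbitrary assumptions, since the identities hold for any positive values.

```python
from fractions import Fraction

def constants_check(eps2, C2):
    """Evaluate the coefficients appearing before (54) for given eps^2 and C_2."""
    C1 = C2**2 / eps2
    C3 = eps2 / (4 * C2)
    C4 = C2 * eps2 / 108
    C5 = C2 * eps2 / 432
    sigma_u_coeff = 2 * C1 + C2 / (2 * C3)
    sigma_w_coeff = Fraction(7, 3) * C2 * C3 + 108 * C5 / C2 + 27 * C4 / C2
    sigma_w1_coeff = 108 * C5 / C2 + 27 * C4 / C2
    return sigma_u_coeff, sigma_w_coeff, sigma_w1_coeff

eps2, C2 = Fraction(1, 10), Fraction(3)  # sample values, chosen arbitrarily
u_c, w_c, w1_c = constants_check(eps2, C2)
print(u_c == 4 * C2**2 / eps2, w_c == Fraction(13, 12) * eps2, w1_c == eps2 / 2)
```

With these values, the coefficient of \(\tau \Vert \sigma _u^{m+1}\Vert ^2\) equals \(4C_2^2/\varepsilon ^2=4C_1\), which gives the factor \(\frac{\varepsilon ^2-8C_2^2\tau }{2\varepsilon ^2}\); the coefficient of \(\tau \sum \Vert \sigma _w^{n+1}\Vert ^2\) equals \(\frac{13}{12}\varepsilon ^2\), which leaves \(\frac{11}{12}\tau \varepsilon ^2\) after subtraction from the \(2\tau \varepsilon ^2\) term on the left-hand side; and the coefficient of \(\tau \Vert \sigma _w^1\Vert ^2\) equals \(\frac{\varepsilon ^2}{2}\).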

In order to estimate \(\Vert \sigma _w^1\Vert \) and \(\Vert \sigma _u^1\Vert \), we take \(v_h = \sigma _u^1, \psi _h=\varepsilon ^2\sigma _w^1\) in (42)–(43) and add up:

$$\begin{aligned} \begin{aligned}&\frac{1}{\tau }\left\| \sigma _u^1\right\| ^2 +\varepsilon ^2\left\| \sigma _w^1\right\| ^2 + A^{(0)}\left\| \nabla \sigma _u^1\right\| ^2\\&\quad = -\left( \delta _\tau ^1 \rho _u,\sigma _u^1\right) + \left( {\mathcal {R}}_1^1,\sigma _u^1\right) +A^{(0)}\left( {\mathcal {R}}_2^1, \nabla \sigma _u^1\right) +\left( {\mathcal {N}}^1,\nabla \sigma _u^1\right) -\varepsilon ^2\left( \rho _w^1,\sigma _w^1\right) . \end{aligned} \end{aligned}$$
(55)

The right-hand side terms can be estimated in a similar way to the case \(n \ge 1\). For the first two terms, we use Taylor expansion and the Cauchy–Schwarz inequality:

$$\begin{aligned} -\left( \delta _\tau ^1 \rho _u,\sigma _u^1\right)&\le \frac{\tilde{C_1}}{3\tau } \Vert \sigma _u^1\Vert ^2+Ch^{2(q+1)}\int _{0}^{\tau } \Vert \partial _t u\Vert _{q+1}^2~\text{ d }t,\\ \left( {\mathcal {R}}_1^1,\sigma _u^1\right)&= \left( -\frac{1}{\tau }\int _{0}^{\tau } t~\partial _{tt} u ~\text{ d }t,\sigma _u^1\right) \\&\le \frac{\tilde{C_1}}{3\tau }\Vert \sigma _u^1\Vert ^2+C\tau ^2\int _{0}^{\tau } \Vert \partial _{tt} u\Vert ^2~\text{ d }t \\&\le \frac{\tilde{C_1}}{3\tau }\Vert \sigma _u^1\Vert ^2 +C\tau ^3\Vert u\Vert _{W^{2,\infty }(0,T;L^2)}^2. \end{aligned}$$

For the third term we make use of integration by parts, the regularity of the exact solution and the Cauchy–Schwarz inequality:

$$\begin{aligned} A^{(0)}\left( {\mathcal {R}}_2^1, \nabla \sigma _u^1\right)&= A^{(0)}\left( \nabla \left( u^1-u^0\right) , \nabla \sigma _u^1\right) = -A^{(0)}\left( \varDelta \left( u^1-u^0\right) , \sigma _u^1\right) \\&\le 2A^{(0)}\tau \Vert u\Vert _{W^{1,\infty } (0,T;H^2)}\Vert \sigma _u^1\Vert \le \frac{\tilde{C_1}}{3\tau }\Vert \sigma _u^1\Vert ^2 + C\tau ^3. \end{aligned}$$

As for the nonlinear term, applying Lemma 3 and the technique used in the above inequality yields:

$$\begin{aligned} \begin{aligned} \left( {\mathcal {N}}^1,\nabla \sigma _u^1\right)&= \left( \frac{\nabla u^1}{1+|\nabla u^1|^2} - \frac{\nabla u^0}{1+|\nabla u^0|^2}, \nabla \sigma _u^1 \right) + \left( \frac{\nabla u^0}{1+|\nabla u^0|^2} - \frac{\nabla u_h^0}{1+|\nabla u_h^0|^2}, \nabla \sigma _u^1 \right) \\&\le -\left( \nabla \cdot \left( \frac{\nabla u^1}{1+|\nabla u^1|^2} - \frac{\nabla u^0}{1+|\nabla u^0|^2}\right) , \sigma _u^1 \right) + \Vert \nabla \rho _u^0\Vert \Vert \nabla \sigma _u^1\Vert \\&\le C\tau \Vert \sigma _u^1\Vert + Ch^q\Vert u_0\Vert _{q+1}\Vert \nabla \sigma _u^1\Vert \\&\le \frac{\tilde{C_2}}{\tau }\Vert \sigma _u^1\Vert ^2+ \frac{C\tau ^3}{\tilde{C_2}} + \frac{Ch^{2q}}{\tilde{C_3}}\Vert u_0\Vert _{q+1}^2 + \tilde{C_3}\Vert \nabla \sigma _u^1\Vert ^2. \end{aligned} \end{aligned}$$
(56)

For the last term we use the Cauchy–Schwarz inequality again:

$$\begin{aligned} -\varepsilon ^2\left( \rho _w^1,\sigma _w^1\right)&\le \tilde{C_4}\varepsilon ^2\Vert \sigma _w^1\Vert ^2+Ch^{2(q+1)}\Vert w^1\Vert _{q+1}^2. \end{aligned}$$

Substituting the above estimates into (55), taking \(\tilde{C_1} = \tilde{C_2} =\tilde{C_4} =\frac{1}{4},~\tilde{C_3}=\frac{A^{(0)}}{2}\) and multiplying by \(\tau \) on both sides of (55) yields

$$\begin{aligned} \frac{1}{2}\Vert \sigma _u^1\Vert ^2 +\frac{3\tau \varepsilon ^2}{4}\Vert \sigma _w^1\Vert ^2 + \frac{A^{(0)}\tau }{2}\Vert \nabla \sigma _u^1\Vert ^2 \le C_{\varepsilon }\left( \tau h^{2(q+1)} + \tau ^{4} +\tau h^{2q}\right) . \end{aligned}$$
(57)

Substituting (57) into (54), we get

$$\begin{aligned} \begin{aligned}&\frac{\varepsilon ^2-2\alpha ^2\tau }{2\varepsilon ^2} \left\| \sigma _u^{m+1}\right\| ^2 +\frac{\tau \varepsilon ^2}{4}\sum _{n=1}^m \left\| \sigma _w^{n+1}\right\| ^2\\&\quad \le C_{\varepsilon ,T}\left( h^{2q}+\tau ^4\right) +\left( 4C_1+\frac{27}{C_2C_5} +\frac{27}{4C_2C_4}\right) \tau \sum _{n=1}^{m-1} \left\| \sigma _u^{n+1}\right\| ^2. \end{aligned} \end{aligned}$$
(58)

Applying the Gronwall inequality (Lemma 1), we obtain

$$\begin{aligned} \frac{\varepsilon ^2-2\alpha ^2\tau }{2\varepsilon ^2}\left\| \sigma _u^{m+1}\right\| +\left( \frac{\tau \varepsilon ^2}{4}\sum _{n=1}^m \left\| \sigma _w^{n+1}\right\| ^2\right) ^\frac{1}{2} \le C_{\varepsilon ,T}\left( h^q+\tau ^2\right) . \end{aligned}$$

A combination of the above estimate for \(\Vert \sigma _u^{m+1}\Vert \) and \(\Vert \sigma _w^{n+1}\Vert \) with (35) yields the conclusion we need. \(\square \)

3 Optimal Convergence Analysis

The error estimate for the proposed numerical scheme in the previous section gives an \(h^q\) spatial convergence order. A \((q+1)\)th order estimate has not been theoretically available because of the difficulty in analysing the nonlinear term, while the numerical results shown in Table 1 indicate that the scheme has a \((q+1)\)th convergence order. In this section, we close this gap between the numerical results and the theoretical analysis for rectangular domains aligned with the x–y axes, by using \({\mathcal {Q}}_q\) finite elements on rectangular meshes.

Recall the notations introduced in Sect. 2. Let \({\mathcal {T}}_h = \{K\}\) be a regular rectangular mesh on a rectangular domain \(\varOmega \subset {\mathbb {R}}^2\) aligned with the x–y axes, and let \(\{a_i\}\) and \(\{l_j\}\) denote the element vertices and edges, respectively. We set the shape function space \({\mathcal {P}}={\mathcal {Q}}_q=\text{span}\{x^i y^j: ~0\le i,j\le q\}\), and define the finite element space \(X_h\equiv \{v\in X\cap C^0(\varOmega )\mid v|_K\in {\mathcal {Q}}_q(K),\forall ~K\in {\mathcal {T}}_h\}\subset X \) with \(X=\{v\in H^q(\varOmega )\mid ~(v,1)=0\}\).

Now we introduce the interpolation operator \(i_h^q: C^0(\bar{\varOmega })\rightarrow X_h\) defined in [9, p.108]:

$$\begin{aligned} i_h^q w(a_i)&= w(a_i), \end{aligned}$$
(59)
$$\begin{aligned} \int _{l_j} \left( i_h^q w - w\right) v ~\text{ d }s&= 0, \quad \forall ~v|_{l_j}\in {\mathcal {P}}_{q-2}, \end{aligned}$$
(60)
$$\begin{aligned} \int _{K} \left( i_h^q w - w\right) v ~\text{ d }{} \mathbf x&= 0, \quad \forall ~v|_{K}\in {\mathcal {Q}}_{q-2}, \end{aligned}$$
(61)

which, as stated in [9, p.108], satisfies the estimate of [9, p.101, Lemma A.4]:

$$\begin{aligned} |i_h^q w-w|_{s,l,\varOmega } \le Ch^{q+1-s}|w|_{q+1,l,\varOmega }, ~0\le s \le q+1, ~1< l<\infty . \end{aligned}$$
(62)

Using the results in [18] the following convergence property for \(i_h^q\) can be obtained:

Lemma 4

Assume that \(a(x,y)\) is Lipschitz continuous in \(\varOmega \). Then for any \(v \in X_h\) and \(w\in H^{q+2}(\varOmega )\), \(q\ge 1\), we have

$$\begin{aligned} \left| (a(x,y) \partial _y\left( i_h^q w-w\right) , \partial _y v)\right| + \left| (a(x,y) \partial _x\left( i_h^q w-w\right) , \partial _x v)\right| \le Ch^{q+1}\Vert w\Vert _{q+2}|v|_1. \end{aligned}$$
(63)

Moreover, if \(w=0\) on \(\partial \varOmega \), then

$$\begin{aligned} \left| \left( a(x,y) \partial _x\left( i_h^q w-w\right) , \partial _y v\right) \right| + \left| \left( a(x,y) \partial _y\left( i_h^q w-w\right) , \partial _x v\right) \right| \le Ch^{q+1}\Vert w\Vert _{q+2}|v|_1. \end{aligned}$$
(64)

Proof

Since (64) comes directly from [18, p. 341, Lemma 3(I)], here we only consider the proof of (63), which follows the proof of [18, Lemma 3(I)].

Firstly we separate \((a(x,y) \partial _x(i_h^q w-w), \partial _x v)\) into two parts:

$$\begin{aligned} \begin{aligned}&\left( a(x,y) \partial _x\left( i_h^q w-w\right) , \partial _x v\right) \\&\quad =\sum _K \int _K a(x,y) \partial _x\left( i_h^q w-w\right) \partial _x v ~\text{ d }{} \mathbf x \\&\quad = \sum _K \int _K a(x_K,y_K) \partial _x \left( i_h^q w-w\right) \partial _x v ~\text{ d }{} \mathbf x \\&\qquad + \sum _K \int _K \left( a(x,y) - a(x_K,y_K)\right) \partial _x\left( i_h^q w-w\right) \partial _x v ~\text{ d }{} \mathbf x \\&\quad \le \sum _K \int _K a(x_K,y_K) \partial _x\left( i_h^q w-w\right) \partial _x v ~\text{ d }{} \mathbf x + Ch\cdot h^q \Vert w\Vert _{q+1}|v|_1 \\&\quad \equiv I_a + Ch^{q+1} \Vert w\Vert _{q+1}|v|_1, \end{aligned} \end{aligned}$$
(65)

in which \((x_K,y_K)\) denotes the center of element K, and the Lipschitz continuity of \(a(x,y)\) has been used. To analyse \(I_a\), we make use of Lemma 1(I) and Lemma 2(I) of [18]:

$$\begin{aligned} \int _K \partial _x\left( i_h^q w-w\right) \partial _x v ~\text{ d }{} \mathbf x&= O\left( h^{q+1}\right) |w|_{q+2,K}|v|_{1,K}, \quad \forall ~v\in X_h, ~q \ge 2, \end{aligned}$$
(66)
$$\begin{aligned} \int _K \partial _x\left( i_h^q w-w\right) \partial _x v ~\text{ d }{} \mathbf x&= O(h^{2})|w|_{3,K}|v|_{1,K}, \quad \forall ~v\in X_h, ~q =1. \end{aligned}$$
(67)

Since all the boundary integral terms appearing in the proof of (66) and (67), i.e. [18, equations (21), (29), (37), (48), (52), (55), (58)], are equal to 0, to estimate \(I_a\) we simply need to sum up the results over all \(K\in {\mathcal {T}}_h\):

$$\begin{aligned} I_a&= \sum _K \int _K a(x_K,y_K) \partial _x\left( i_h^q w-w\right) \partial _x v ~\text{ d }{} \mathbf x \\&\le Ch^{q+1}\sum _K |w|_{q+2,K}|v|_{1,K} \le Ch^{q+1}\Vert w\Vert _{q+2}|v|_1. \end{aligned}$$

Substituting the above result into (65) yields the conclusion. \(\square \)

Referring to Theorem 2.2.3 and Theorem 2.3.3 in [28, p. 29 and p. 47], for w without boundary restrictions, an estimate of the same integral as in (64) is available:

Lemma 5

Assume that \(a(x,y)\in W^{1,\infty }(\varOmega )\). Then for any \(v \in X_h\) and \(w\in H^{q+2}(\varOmega )\), \(q\ge 1\), we have

$$\begin{aligned} \left| \left( a(x,y) \partial _x\left( i_h^q w-w\right) , \partial _y v\right) \right| \le Ch^{q+\frac{1}{2}}\Vert w\Vert _{q+2}|v|_1. \end{aligned}$$
(68)

Below we denote by \(u^n\equiv u(t_n)\) and \(u_h^n\) the values of the exact solution and the numerical solution at \(t_n\), respectively. Define \(\eta _u^n \equiv u^n-i_h^q u^n\); the interpolation error \(\eta _w^n\) of w is defined analogously. We aim to obtain the same error order as (64) without the Dirichlet boundary restriction. Here we consider the special case \(a(x,y)=0\) on \(\partial \varOmega \), which is the case we shall encounter in the following treatment of the nonlinear term. Referring to the proof of (64) in [18] and the proof of (68) in [28], and with a careful treatment of the boundary term, we are able to derive the desired result.

Lemma 6

Given function \(a(x,y)\in W^{1,\infty }(\varOmega )\) with \(a(x,y)=0\) on \(\partial \varOmega \), for any \(v \in X_h\) and \(w\in H^{q+2}(\varOmega )\), \(q\ge 1\), we have

$$\begin{aligned} \left| \left( a(x,y) \partial _x\left( i_h^q w-w\right) , \partial _y v\right) \right| \le Ch^{q+1}\Vert w\Vert _{q+2}|v|_1. \end{aligned}$$
(69)

Proof

As in [28], on each rectangular element K we define \(\bar{a}=\int _K \frac{a}{|K|}~\text{ d }{} \mathbf x \), where |K| is the area of K, so that the original integral can be separated into two parts. Applying \(|\bar{a}-a|\le Ch|a|_{1,\infty }\) and (62) yields

$$\begin{aligned} \begin{aligned} \left( a(x,y)\partial _x\eta _w^n,\partial _y v\right)&= -\sum _K \left( (\bar{a}-a)\partial _x\eta _w^n,\partial _y v\right) +\sum _K \left( \bar{a}\partial _x\eta _w^n,\partial _y v\right) \\&\le Ch^{q+1}|a|_{1,\infty }\Vert w\Vert _{q+1}\Vert \nabla v\Vert + \sum _K \left( \bar{a}\partial _x\eta _w^n,\partial _y v\right) . \end{aligned} \end{aligned}$$
(70)

The second part on the right-hand side can be analysed in the same way as in the proof of (64) in [18]. We separate the Taylor expansion of \(\partial _y v\) about the midpoint of K into three parts and analyse them separately. Only the boundary terms require the Dirichlet boundary condition on w to achieve the \(O(h^{q+1})\) error order:

$$\begin{aligned} \begin{aligned} \sum _K \left( \bar{a}\partial _x\eta _w^n,\partial _y v\right)&\le Ch^{q+1}\Vert w\Vert _{q+2}\Vert \nabla v\Vert \\&\quad +\left( \sum _{\partial K\cap \partial \varOmega _4\ne \emptyset }\int _{l_4} -\sum _{\partial K\cap \partial \varOmega _3\ne \emptyset }\int _{l_3}\right) \left( \bar{a}w_{x^{q+1}}E^q (x)v_{x^q}\right) ~\text{ d }{} \mathbf x . \end{aligned} \end{aligned}$$
(71)

Here \(l_3, ~l_4\) denote the lower and upper boundaries of K, respectively. Similarly, \(\partial \varOmega _3, ~\partial \varOmega _4\) represent the lower and upper boundaries of \(\varOmega \), respectively. The auxiliary function \(E^q (x)=\frac{[(x-x_K)^2-h_K^2]^q}{2^q}\) is \(O(h^{2q})\), in which \(x_K\) is the x-coordinate of the midpoint of element K and \(h_K\) is half of the width of element K in the x-direction. In addition, \(w_{x^{q+1}}\equiv \frac{\partial ^{q+1}w}{\partial x^{q+1}}\). In [28], for w without boundary conditions, applying the trace theorem

$$\begin{aligned} \Vert \gamma w\Vert _{q+1,2,\partial K}\Vert \gamma v\Vert _{q,2,\partial K}\le \Vert w\Vert _{q+\frac{3}{2},2,K}\Vert v\Vert _{q+\frac{1}{2},2,K}, \end{aligned}$$

the inverse estimate

$$\begin{aligned} \Vert v\Vert _{q+\frac{1}{2},2,K} \le h^{\frac{1}{2}-q}\Vert v\Vert _{1,2,K} \end{aligned}$$

and Poincaré’s inequality yields

$$\begin{aligned} \begin{aligned} \left( \sum _{\partial K\cap \partial \varOmega _4\ne \emptyset }\int _{l_4} -\sum _{\partial K\cap \partial \varOmega _3\ne \emptyset }\int _{l_3}\right) \left( \bar{a}w_{x^{q+1}}E^q (x)v_{x^q}\right) ~\text{ d }{} \mathbf x \le Ch^{q+\frac{1}{2}}\Vert w\Vert _{q+2}\Vert \nabla v\Vert . \end{aligned} \end{aligned}$$

However, a careful application of the conditions \(a=0\) on \(\partial \varOmega \) and \(a\in W^{1,\infty }\) shows that \(|\bar{a}| = O(h)\) on boundary rectangular elements. Specifically, without loss of generality, we assume that the edge \(l\subset \partial K \cap \partial \varOmega \) is parallel to the x-axis and denote the y-coordinate on l by \(y_l\); then

$$\begin{aligned} |\bar{a}| = \left| \int _K \frac{a}{|K|}~\text{ d }{} \mathbf x \right| = \left| \int _K \frac{\int _{y_l}^y \partial _y a~\text{ d }s}{|K|}~\text{ d }{} \mathbf x \right| \le \left| |a|_{1,\infty }\int _K \frac{\int _{y_l}^y \text{ d }s}{|K|}~\text{ d }{} \mathbf x \right| \le Ch |a|_{1,\infty }. \end{aligned}$$

Thus

$$\begin{aligned} \begin{aligned} \left( \sum _{\partial K\cap \partial \varOmega _4\ne \emptyset }\int _{l_4} -\sum _{\partial K\cap \partial \varOmega _3\ne \emptyset }\int _{l_3}\right) \left( \bar{a}w_{x^{q+1}}E^q (x)v_{x^q}\right) ~\text{ d }{} \mathbf x \le Ch^{q+\frac{3}{2}}|a|_{1,\infty }\Vert w\Vert _{q+2}\Vert \nabla v\Vert . \end{aligned} \end{aligned}$$

Substituting this estimate into (71), we thus obtain the \(O(h^{q+1})\) estimate. \(\square \)
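The \(O(h)\) bound on \(|\bar{a}|\) used in the proof can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the sample coefficient \(a(x,y)=\sin (\pi y)\) and the helper name `mean_over_boundary_cell` are assumptions chosen only because this \(a\) vanishes on the edge \(y=0\) and has \(|a|_{1,\infty }=\pi \).

```python
import math

def mean_over_boundary_cell(h, n=200):
    """Midpoint-rule average of the sample coefficient a(x, y) = sin(pi*y)
    over the boundary cell [0, h] x [0, h] (hypothetical example).

    a vanishes on the edge y = 0, mimicking a = 0 on the domain boundary.
    Since a does not depend on x, the cell average reduces to a 1-D average in y.
    """
    dy = h / n
    return sum(math.sin(math.pi * (j + 0.5) * dy) for j in range(n)) * dy / h

# The ratio |abar| / h should stay bounded (here by |a|_{1,inf} = pi) as h -> 0.
for h in (1 / 8, 1 / 16, 1 / 32, 1 / 64):
    abar = mean_over_boundary_cell(h)
    print(f"h = {h:.5f}  |abar| = {abar:.6f}  |abar|/h = {abar / h:.4f}")
```

For this particular coefficient the ratio \(|\bar{a}|/h\) approaches \(\pi /2\), which is consistent with the bound \(|\bar{a}|\le Ch|a|_{1,\infty }\).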

Since our exact solutions to Eqs. (6) and (7) satisfy the Neumann boundary conditions \(\partial _n w=\partial _n u = 0\) on \(\partial \varOmega \) and \(\varOmega \) is aligned with the x–y axes, for \(a(x,y)=\frac{\partial _x u^n\partial _y u^n}{(1+|\nabla u^n|^2)^2}\) we have \(a|_{\partial \varOmega }=0\). Thus Lemma 6 leads to the following conclusion needed in the proof of Theorem 4.

Corollary 1

Let \(\varOmega \) be a rectangular domain aligned with x–y axis. Given function \(a(x,y)=\frac{\partial _x u^n\partial _y u^n}{(1+|\nabla u^n|^2)^2}\), with \(u\in L^{\infty }(0,T;W^{2,\infty })\) and \(\partial _n u= 0\) on \(\partial \varOmega \), for any \(v \in X_h\) and \(w\in H^{q+2}(\varOmega )\), \(q\ge 1\), we have

$$\begin{aligned} \left| \left( a(x,y) \partial _x\left( i_h^q w-w\right) , \partial _y v\right) \right| \le Ch^{q+1}\Vert w\Vert _{q+2}|v|_1. \end{aligned}$$
(72)

Here we introduce the standard Lagrange interpolation operator defined in [2, p. 77, (3.3.2)]: For the finite element \((\varOmega , {\mathcal {Q}}_q, {\mathcal {N}})\), the basis \(\{\phi _i\}\) dual to \({\mathcal {N}}\), and \(\{N_i\}\in {\mathcal {N}}\), denote \(I_h:C^0(\bar{\varOmega })\rightarrow Q_q(K)\) the standard interpolation \(I_h v \equiv \sum _{i=1}^{k} N_i(v)\phi _i\). Then (62) and (63) in turn yield:

Lemma 7

For any \(w\in H^{q+2}(\varOmega )\cap W^{q+1,\infty }(\varOmega )\), we have

$$\begin{aligned} \left\| \nabla \left( i_h^q w-w\right) \right\| _{L^{\infty }} \le Ch^{q-1}|w|_{q+1,\infty },\quad \left\| \nabla \left( i_h^q w-R_h w\right) \right\| \le Ch^{q+1}|w|_{q+2}, \end{aligned}$$

where C is a constant independent of w.

Proof

Assume that the maximum element diameter h of the mesh \(T_h\) is sufficiently small. It is well known that the interpolant \(I_h\) has the error estimate \(|I_h w-w|_{i,p}\le Ch^{m-i}|w|_{m,p}\) for any \(1\le p\le \infty \) and \(m-l-n/p > 0\); see [2, p. 105, Theorem 4.4.4]. Also, from the inverse estimate in Lemma 2 we obtain that \(\Vert v\Vert _{1,\infty }\le Ch^{-2}\Vert v\Vert \). Thus we get

$$\begin{aligned} \left\| \nabla \left( i_h^q w-w\right) \right\| _{L^{\infty }}&\le \left\| \nabla \left( i_h^q w-I_h w\right) \right\| _{L^{\infty }} +\left\| \nabla (I_h w-w)\right\| _{L^{\infty }}\\&\le Ch^{-1}|i_h^q w-I_h w|_1+Ch^q |w|_{q+1,\infty } \\&\le Ch^{-1}\left( |i_h^q w-w|_1+|w-I_h w|_1\right) +Ch^q |w|_{q+1,\infty }\\&\le Ch^{q-1} |w|_{q+1}+Ch^q |w|_{q+1,\infty }\\&\le Ch^{q-1}\left( |w|_{q+1}+|w|_{q+1,\infty }\right) . \end{aligned}$$

Since \(\varOmega \) is bounded, we can bound \(|w|_{q+1}\) by \(|w|_{q+1,\infty }\); thus the first inequality is proved. As for the second inequality, we make use of the definition of \(R_h\):

$$\begin{aligned} \left\| \nabla \left( i_h^q w-R_h w\right) \right\| ^2&= \left( \nabla \left( i_h^q w-w\right) , \nabla \left( i_h^q w-R_h w\right) \right) \\&\le Ch^{q+1}|w|_{q+2}|i_h^q w-R_h w|_1. \end{aligned}$$

Thus we obtain \(\Vert \nabla (i_h^q w-R_h w)\Vert \le Ch^{q+1}|w|_{q+2}\). \(\square \)

Now we state the main conclusion of this section.

Theorem 4

Given a rectangular domain \(\varOmega \) that is aligned with the x–y axes, and a regular rectangular mesh \(T_h\) on \(\varOmega \), define the finite element space \(X_h\) as in the beginning of this section and the interpolation operator \(i_h^q\) as in [9]. Assume that the exact solution pair \((u,w)\) and the time step size \(\tau \) satisfy the assumptions in Theorem 3. In addition, assume that \(u\in L^\infty (0,T;H^{q+2})\cap L^{\infty }(0,T;W^{2,\infty })\). Then we have the following error estimate for scheme (19) and (20):

$$\begin{aligned} \left\| u^n-u_h^n\right\| + \left( \tau \sum _{m=1}^n \left\| w^m-w^m_h\right\| ^2\right) ^{\frac{1}{2}} \le C_{\varepsilon ,T}\left( h^{q+1}+\tau ^2\right) . \end{aligned}$$
(73)

Proof

For simplicity, below we denote \(f(|\nabla u|^2)\equiv \frac{1}{1+|\nabla u|^2}\). Recalling the analysis in Sect. 2.3, we only need to improve the analysis of the nonlinear terms \(({\mathcal {N}}^1,\nabla \sigma _u^1)\) and \(({\mathcal {N}}^{n+1},\nabla \sigma _u^{n+1})\). Firstly, separate \(({\mathcal {N}}^1,\nabla \sigma _u^1)\) into two parts as in (56):

$$\begin{aligned} \left( {\mathcal {N}}^1,\nabla \sigma _u^1\right)&\equiv \left( \frac{\nabla u^1}{1+|\nabla u^1|^2}-\frac{\nabla u^0}{1+|\nabla u^0|^2},\nabla \sigma _u^1\right) +\left( \frac{\nabla u^0}{1+|\nabla u^0|^2}-\frac{\nabla u_h^0}{1+|\nabla u_h^0|^2},\nabla \sigma _u^1\right) \\&= -\left( \nabla \cdot \left( \frac{\nabla u^1}{1+|\nabla u^1|^2}-\frac{\nabla u^0}{1+|\nabla u^0|^2}\right) , \sigma _u^1\right) +\left( \tilde{{\mathcal {N}}}^0,\nabla \sigma _u^1\right) \\&\le C\tau ^3 + \frac{C}{\tau }\Vert \sigma _u^1\Vert ^2 + \left( \tilde{{\mathcal {N}}}^{0},\nabla \sigma _u^1\right) , \end{aligned}$$

where \(\tilde{{\mathcal {N}}}^{0} \equiv \frac{\nabla u^0}{1+|\nabla u^0|^2}-\frac{\nabla u_h^0}{1+|\nabla u_h^0|^2}\). Here we consider the general form of \((\tilde{{\mathcal {N}}}^{0},\nabla \sigma _u^1)\):

$$\begin{aligned} \left( \tilde{{\mathcal {N}}}^n,\nabla \sigma _u^{n+1}\right)&\equiv \left( \frac{\nabla u^n}{1+|\nabla u^n|^2}-\frac{\nabla u_h^n}{1+|\nabla u_h^n|^2},\nabla \sigma _u^{n+1}\right) \\&= \left( f\left( |\nabla u^n|^2\right) \nabla e_u^n,\nabla \sigma _u^{n+1}\right) +\left( \left[ f\left( |\nabla u^n|^2\right) -f\left( |\nabla u_h^n|^2\right) \right] \nabla u_h^n,\nabla \sigma _u^{n+1}\right) \\&= \left( f\left( |\nabla u^n|^2\right) \nabla e_u^n,\nabla \sigma _u^{n+1}\right) - \left( \left[ f\left( |\nabla u^n|^2\right) -f\left( |\nabla u_h^n|^2\right) \right] \nabla e_u^n,\nabla \sigma _u^{n+1}\right) \\&\quad +\left( \left[ f\left( |\nabla u^n|^2\right) -f\left( |\nabla u_h^n|^2\right) \right] \nabla u^n,\nabla \sigma _u^{n+1}\right) \\&\equiv ({\mathrm{III}})+ ({\mathrm{IV}})+({\mathrm{V}}). \end{aligned}$$

Separating \(({\mathrm{III}})\) into three parts, applying (63) and Lemma 7, we have

$$\begin{aligned} ({\mathrm{III}})&= \left( f\left( |\nabla u^n|^2\right) \nabla \eta _u^n,\nabla \sigma _u^{n+1}\right) +\left( f\left( |\nabla u^n|^2\right) \nabla \left( i_h^q-R_h\right) u^n,\nabla \sigma _u^{n+1}\right) \\&\qquad +\left( f\left( |\nabla u^n|^2\right) \nabla \sigma _u^n,\nabla \sigma _u^{n+1}\right) \\&\le Ch^{q+1}|u^n|_{q+2}\left\| \nabla \sigma _u^{n+1}\right\| + Ch^{q+1}|u^n|_{q+2}\left\| \nabla \sigma _u^{n+1}\right\| +\left\| \nabla \sigma _u^{n+1}\right\| \left\| \nabla \sigma _u^n\right\| \\&\le Ch^{2(q+1)}+C\left\| \nabla \sigma _u^{n+1}\right\| ^2 +C\left\| \nabla \sigma _u^n\right\| ^2. \end{aligned}$$

Similarly, we separate the \(\nabla e_u^n\) term in \(({\mathrm{IV}})\) into three parts

$$\begin{aligned} ({\mathrm{IV}})&= \left( \frac{\nabla \left( u^n+u_h^n\right) \cdot \nabla e_u^n}{\left( 1+|\nabla u^n|^2\right) \left( 1+|\nabla u_h^n|^2\right) }\nabla \eta _u^n,\nabla \sigma _u^{n+1}\right) \\&\qquad -\left( \left[ f\left( |\nabla u^n|^2\right) -f\left( |\nabla u_h^n|^2\right) \right] \nabla \left( i_h^q-R_h\right) u^n,\nabla \sigma _u^{n+1}\right) \\&\qquad -\left( \left[ f\left( |\nabla u^n|^2\right) -f\left( |\nabla u_h^n|^2\right) \right] \nabla \sigma _u^n,\nabla \sigma _u^{n+1}\right) \\&\le \left( \frac{\nabla \left( u^n+u_h^n\right) \cdot \nabla e_u^n}{\left( 1+|\nabla u^n|^2\right) \left( 1+|\nabla u_h^n|^2\right) }\nabla \eta _u^n,\nabla \sigma _u^{n+1}\right) +2\left\| \nabla \sigma _u^{n+1}\right\| \left\| \nabla \sigma _u^n\right\| \\&\qquad +2\left\| \nabla \left( i_h^q-R_h\right) u^n\right\| \left\| \nabla \sigma _u^{n+1}\right\| . \end{aligned}$$

Again, we separate the first term on the right-hand side into three parts. A combination of the inequality \(\frac{|a|+|b|}{(1+a^2)(1+b^2)}<1\) and (62) yields:

$$\begin{aligned} \begin{aligned} ({\mathrm{IV}})&\le \left( \frac{\nabla \left( u^n+u_h^n\right) \cdot \nabla \eta _u^n}{\left( 1+|\nabla u^n|^2\right) \left( 1+|\nabla u_h^n|^2\right) }\nabla \eta _u^n,\nabla \sigma _u^{n+1}\right) \\&\qquad +\left( \frac{\nabla \left( u^n+u_h^n\right) \cdot \nabla \left( i_h^q-R_h\right) u^n}{\left( 1+|\nabla u^n|^2\right) \left( 1+|\nabla u_h^n|^2\right) }\nabla \eta _u^n,\nabla \sigma _u^{n+1}\right) \\&\qquad +\left( \frac{\nabla \left( u^n+u_h^n\right) \cdot \nabla \sigma _u^n}{\left( 1+|\nabla u^n|^2\right) \left( 1+|\nabla u_h^n|^2\right) }\nabla \eta _u^n,\nabla \sigma _u^{n+1}\right) \\&\qquad +2\left\| \nabla \sigma _u^{n+1}\right\| \left\| \nabla \sigma _u^n\right\| +2 \left\| \nabla \left( i_h^q-R_h\right) u^n\right\| \left\| \nabla \sigma _u^{n+1}\right\| \\&\le \left\| \nabla \eta _u^n\right\| _{L^4}^2\left\| \nabla \sigma _u^{n+1}\right\| +\left( 2+ \left\| \nabla \eta _u^n\right\| _{L^\infty }\right) \left\| \nabla \left( i_h^q-R_h\right) u^n\right\| \left\| \nabla \sigma _u^{n+1}\right\| \\&\qquad +\left( \left\| \nabla \eta _u^n\right\| _{L^\infty }+2\right) \left\| \nabla \sigma _u^{n+1}\right\| \left\| \nabla \sigma _u^n\right\| \\&\le C\left( h^{4q}+h^{2(q+1)}\right) +C\left( 2+h^{q-1}\right) \left\| \nabla \sigma _u^{n+1}\right\| ^2 + C\left\| \nabla \sigma _u^n\right\| ^2\\&\le C\left( h^{4q}+h^{2(q+1)}\right) +C\left\| \nabla \sigma _u^{n+1}\right\| ^2 + C\left\| \nabla \sigma _u^n\right\| ^2, \end{aligned} \end{aligned}$$
(74)

in which we have made use of (62) and 2-D Sobolev embedding (from \(H^{q+2}\) into \(W^{q+1,4}\)) in the third step:

$$\begin{aligned} \Vert \nabla \eta _{u}^n\Vert _{L^4}^2 \le C h^{2(q+1-1)}|u|_{q+1,4} \le C h^{2q}|u|_{q+2,2}. \end{aligned}$$

In addition, we have used the restriction \(h<1\), raised in Sect. 2.3, in the derivation of (74). Since \(4q\ge 2(q+1)\) for \(q\ge 1\), we have obtained \(O(h^{q+1})\) spatial convergence for \(({\mathrm{IV}})\).
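Two elementary facts used in the derivation of (74) can be spot-checked numerically. The sketch below is only an illustration: the sampling grid on \([-5,5]^2\) and the helper name `ratio` are assumptions, and a finite scan is of course not a proof.

```python
def ratio(a, b):
    """Left-hand side of the inequality (|a| + |b|) / ((1 + a^2)(1 + b^2)) < 1."""
    return (abs(a) + abs(b)) / ((1 + a * a) * (1 + b * b))

# Scan a uniform grid on [-5, 5]^2 (assumed sampling window) for the largest value.
grid = [i / 100 for i in range(-500, 501)]
worst = max(ratio(a, b) for a in grid for b in grid)
print("largest sampled ratio:", worst)

# Exponent comparison: h^{4q} is dominated by h^{2(q+1)} when h < 1 and q >= 1.
for q in range(1, 6):
    print(q, 4 * q, 2 * (q + 1), 4 * q >= 2 * (q + 1))
```

A short calculus argument gives the exact maximum \(3\sqrt{3}/8\approx 0.65\), attained at \(|a|=|b|=1/\sqrt{3}\), so the sampled maximum stays well below 1.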

As for term \(({\mathrm{V}})\), we separate it into three parts:

$$\begin{aligned} ({\mathrm{V}})&= -\left( \frac{\nabla \left( u^n+u_h^n\right) \cdot \nabla \eta _u^n}{\left( 1+|\nabla u^n|^2\right) \left( 1+|\nabla u_h^n|^2\right) }\nabla u^n,\nabla \sigma _u^{n+1}\right) \\&\quad -\left( \frac{\nabla \left( u^n+u_h^n\right) \cdot \nabla \left( i_h^q-R_h\right) u^n}{ \left( 1+|\nabla u^n|^2\right) \left( 1+|\nabla u_h^n|^2\right) }\nabla u^n,\nabla \sigma _u^{n+1}\right) \\&\quad -\left( \frac{\nabla \left( u^n+u_h^n\right) \cdot \nabla \sigma _u^n}{\left( 1+|\nabla u^n|^2\right) \left( 1+|\nabla u_h^n|^2\right) }\nabla u^n,\nabla \sigma _u^{n+1}\right) \\&\equiv ({\mathrm{V}}_1)+({\mathrm{V}}_2)+({\mathrm{V}}_3). \end{aligned}$$

The last two terms can be analysed similarly:

$$\begin{aligned} ({\mathrm{V}}_2)&\le \Vert \nabla u^n\Vert _{L^\infty }\Vert \nabla \left( i_h^q-R_h\right) u^n\Vert \Vert \nabla \sigma _u^{n+1}\Vert \le C\Vert \nabla \sigma _u^{n+1}\Vert ^2+Ch^{2(q+1)},\\ ({\mathrm{V}}_3)&\le \Vert \nabla u^n\Vert _{L^\infty }\Vert \nabla \sigma _u^{n+1}\Vert \Vert \nabla \sigma _u^n\Vert \le C\Vert \nabla \sigma _u^{n+1}\Vert ^2+C\Vert \nabla \sigma _u^n\Vert ^2. \end{aligned}$$

In order to apply (63) and Corollary 1 to \(({\mathrm{V}}_1)\), we separate the term twice so as to obtain a continuous coefficient function as \(a(x,y)\) in Corollary 1:

$$\begin{aligned} ({\mathrm{V}}_1)&= -\left( \frac{2\nabla u^n\cdot \nabla \eta _u^n}{\left( 1+|\nabla u^n|^2\right) \left( 1+|\nabla u_h^n|^2\right) }\nabla u^n,\nabla \sigma _u^{n+1}\right) \\&\quad +\left( \frac{\nabla e_u^n\cdot \nabla \eta _u^n}{\left( 1+|\nabla u^n|^2\right) \left( 1+|\nabla u_h^n|^2\right) }\nabla u^n,\nabla \sigma _u^{n+1}\right) \\&=-\left( \frac{2\nabla u^n\cdot \nabla \eta _u^n}{\left( 1+|\nabla u^n|^2\right) ^2}\nabla u^n,\nabla \sigma _u^{n+1}\right) \\&\quad -\left( \frac{2\left( \nabla u^n\cdot \nabla \eta _u^n\right) \left( \nabla \left( u^n+u_h^n\right) \cdot \nabla e_u^n\right) }{\left( 1+|\nabla u^n|^2\right) ^2\left( 1+|\nabla u_h^n|^2\right) }\nabla u^n,\nabla \sigma _u^{n+1}\right) \\&\quad +\left( \frac{\nabla e_u^n\cdot \nabla \eta _u^n}{\left( 1+|\nabla u^n|^2\right) \left( 1+|\nabla u_h^n|^2\right) }\nabla u^n,\nabla \sigma _u^{n+1}\right) . \end{aligned}$$

Note that the last two terms above can be analysed in the same way as (74); for brevity we only present the results here. Also, for the convenience of later analysis, we write the first term componentwise:

$$\begin{aligned} ({\mathrm{V}}_1)&\le \left( a_{12}(x,y)\partial _x\eta _u^n,\partial _y\sigma _u^{n+1}\right) +\left( a_{12}(x,y)\partial _y\eta _u^n,\partial _x\sigma _u^{n+1}\right) +\left( a_1(x,y)\partial _x\eta _u^n,\partial _x\sigma _u^{n+1}\right) \\&\quad +\left( a_2(x,y)\partial _y\eta _u^n,\partial _y\sigma _u^{n+1}\right) +C\left( h^{4q}+h^{2(q+1)}\right) +C\left\| \nabla \sigma _u^{n+1}\right\| ^2+C\left\| \nabla \sigma _u^n\right\| ^2. \end{aligned}$$

Here \(a_1(x,y)=\frac{\partial _x u^n\partial _x u^n}{(1+|\nabla u^n|^2)^2}\), \(a_2(x,y)=\frac{\partial _y u^n\partial _y u^n}{(1+|\nabla u^n|^2)^2}\), and \(a_{12}(x,y)=\frac{\partial _x u^n\partial _y u^n}{(1+|\nabla u^n|^2)^2}\). Notice that \(a_{12}(x,y)\) is exactly the coefficient function in Corollary 1. Therefore, applying Corollary 1 to the first two terms and (63) to the third and fourth term leads to

$$\begin{aligned} ({\mathrm{V}}_1)\le C\left( h^{2(q+1)}+h^{4q}\right) +C\left\| \nabla \sigma _u^{n+1}\right\| ^2+C\left\| \nabla \sigma _u^n\right\| ^2. \end{aligned}$$

Therefore, we have proved that the nonlinear term in the initial step (17) has \((q+1)\)th order spatial convergence.

As for the nonlinear term \(({\mathcal {N}}^{n+1},\nabla \sigma _u^{n+1})\) with \(n\ge 1\), we rewrite the term as follows:

$$\begin{aligned}&\left( {\mathcal {N}}^{n+1},\nabla \sigma _u^{n+1}\right) \\&\quad = \left( \frac{\nabla u^{n+1}}{1+|\nabla u^{n+1}|^2}-\frac{\nabla \left( 2u_h^n-u_h^{n-1}\right) }{1+|\nabla \left( 2u_h^n-u_h^{n-1}\right) |^2},\nabla \sigma _u^{n+1}\right) \\&\quad = \left( f\left( \left| \nabla u^{n+1}\right| ^2\right) \nabla u^{n+1}-f\left( \left| \nabla \left( 2u^n-u^{n-1}\right) \right| ^2\right) \nabla \left( 2u^n-u^{n-1}\right) ,\nabla \sigma _u^{n+1}\right) \\&\qquad +\Big (f\Big (\Big |\nabla \Big (2u^n-u^{n-1}\Big )\Big |^2\Big ) \nabla \Big (2u^n-u^{n-1}\Big )\\&\qquad -f\Big (\Big |\nabla \Big (2u_h^n-u_h^{n-1}\Big )\Big |^2\Big ) \nabla \Big (2u_h^n-u_h^{n-1}\Big ),\nabla \sigma _u^{n+1}\Big )\\&\quad \equiv ({\mathrm{VI}})+({\mathrm{VII}}). \end{aligned}$$

Notice that \(({\mathrm{VI}})\) can be estimated using Lemma 3:

$$\begin{aligned} ({\mathrm{VI}})&\le \left\| \nabla \left( u^{n+1}-2u^n+u^{n-1}\right) \right\| \left\| \nabla \sigma _u^{n+1}\right\| \\&\le C\tau ^3\int _{t_{n-1}}^{t_{n+1}} \left\| \nabla \partial _{tt} u\right\| ^2~\text{ d }t + C\left\| \nabla \sigma _u^{n+1}\right\| ^2. \end{aligned}$$
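The estimate of \(({\mathrm{VI}})\) rests on the fact that the extrapolation defect \(u^{n+1}-2u^n+u^{n-1}\) is of size \(O(\tau ^2)\) for smooth u. A minimal scalar sketch of this scaling follows; the test function \(\mathrm{e}^{-t}\) and the helper name `extrapolation_defect` are assumptions made only for illustration.

```python
import math

def extrapolation_defect(u, t, tau):
    """Return u(t + tau) - (2 u(t) - u(t - tau)), the error of linear extrapolation."""
    return u(t + tau) - 2.0 * u(t) + u(t - tau)

def u_test(t):
    # Smooth scalar stand-in for the exact solution (assumed test function).
    return math.exp(-t)

# Halving tau should reduce the defect by roughly a factor of 4 (second order).
previous = None
for tau in (0.1, 0.05, 0.025, 0.0125):
    defect = abs(extrapolation_defect(u_test, 1.0, tau))
    factor = previous / defect if previous is not None else float("nan")
    print(f"tau = {tau:<7} defect = {defect:.3e}  reduction factor = {factor:.3f}")
    previous = defect
```

The reduction factor close to 4 reflects the leading term \(\tau ^2\partial _{tt}u\) of the defect, which is what produces the \(\tau ^3\int \Vert \nabla \partial _{tt}u\Vert ^2\,\text{d}t\) contribution after squaring.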

And \(({\mathrm{VII}})\) can be analysed as \((\tilde{{\mathcal {N}}}^n,\nabla \sigma _u^{n+1})\) and be bounded as

$$\begin{aligned} ({\mathrm{VII}})\le C\left( \left\| \nabla \sigma _u^{n+1}\right\| ^2 +\left\| \nabla \sigma _u^n\right\| ^2+\left\| \nabla \sigma _u^{n-1}\right\| ^2\right) +C\left( h^{2(q+1)}+h^{4q}\right) . \end{aligned}$$

Therefore, the nonlinear terms in the two schemes both have \((q+1)\)th order spatial convergence when \(q\ge 1\). \(\square \)

Remark 3

In this section we have obtained the optimal convergence analysis for scheme (17)–(20) on rectangular meshes, which is based on the previous works [18, 28]. One should notice that (1) plays a crucial role in obtaining the optimal convergence order. On the other hand, the convergence analysis for the case when \(T_h\) is a triangulation is still ongoing. Recently, Yan et al. have obtained optimal error estimates for a class of linear fourth-order elliptic problems in [30], using a super-closeness relation between the numerical solution and the Ritz projection of the exact solution. We hope to extend these results to the MBE equations.

4 Numerical Results

4.1 Convergence Test

In this subsection we present some numerical tests to check the theoretical convergence of the proposed scheme (19)–(20). Firstly, we set \(\varOmega = [0,1]^2\), \(T=1\) and \(\varepsilon ^2 = 0.05\). The exact solution is given by

$$\begin{aligned} u_e (x,y,t) = \cos ( \pi x) \cos ( \pi y) \mathrm{e}^{-t}. \end{aligned}$$
(75)

Next, in order to satisfy the PDE (4) and boundary conditions (2), we add an artificial, time-dependent forcing term on the right hand side:

$$\begin{aligned}&\partial _tu_e+\nabla \cdot \left( \frac{\nabla u_e}{1+{|\nabla u_e|}^2}\right) -\varepsilon ^2\varDelta w_e=g, \quad (x,y,t)\in \varOmega \times (0,T],\nonumber \\&\quad \text{ with } \, \, \, g = \left( -1+4\pi ^4\varepsilon ^2\right) u_e -\frac{4\pi ^2u_e}{2+\pi ^2\mathrm{e}^{-2t} [1-\cos (2\pi x)\cos (2\pi y)]} \nonumber \\&\quad +\frac{4\pi ^4\mathrm{e}^{-2t}u_e}{\left\{ 2+\pi ^2\mathrm{e}^{-2t}[1-\cos (2\pi x)\cos (2\pi y)]\right\} ^2} [\cos (2\pi x) + \cos (2\pi y) - 2\cos (2\pi x)\cos (2\pi y)] . \end{aligned}$$
(76)
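The forcing term can be checked by hand from a few elementary identities for \(u_e\) in (75). The intermediate quantities below are offered as a sketch for the reader's convenience and follow from direct differentiation:

$$\begin{aligned} \partial _t u_e&= -u_e, \quad \varDelta u_e = -2\pi ^2 u_e, \quad \varDelta ^2 u_e = 4\pi ^4 u_e,\\ |\nabla u_e|^2&= \frac{\pi ^2\mathrm{e}^{-2t}}{2}\left[ 1-\cos (2\pi x)\cos (2\pi y)\right] ,\\ \nabla u_e\cdot \nabla |\nabla u_e|^2&= -\pi ^4\mathrm{e}^{-2t}u_e\left[ \cos (2\pi x)+\cos (2\pi y)-2\cos (2\pi x)\cos (2\pi y)\right] . \end{aligned}$$

These are combined through \(w_e=-\varDelta u_e\) and the product rule \(\nabla \cdot \left( \frac{\nabla u_e}{1+|\nabla u_e|^2}\right) = \frac{\varDelta u_e}{1+|\nabla u_e|^2}-\frac{\nabla u_e\cdot \nabla |\nabla u_e|^2}{(1+|\nabla u_e|^2)^2}\). Note also that \(\partial _n u_e=\partial _n\varDelta u_e=0\) on the boundary of \([0,1]^2\), so that (2) holds.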

In the building of finite elements, we use both \({\mathcal {P}}_1\) and \({\mathcal {P}}_2\) elements on uniform meshes with grid sizes \(h=1/32, 1/64, 1/128, 1/256\). The \(L^2\) errors \(\Vert u_e-u_h\Vert \) are recorded at \(T=1\). As for the time step, we first set \(\tau = h^2\), so that the spatial error dominates \(\Vert u_e-u_h\Vert \), and we can see that both kinds of elements achieve full-order spatial convergence. Table 1 shows the \(L^2\) error and convergence order for our proposed scheme under this condition. As we can see from the table, second- and third-order accuracies are observed for the spatial approximations with \({\mathcal {P}}_1\) and \({\mathcal {P}}_2\) elements, respectively. Next we set \(\tau = h/2\), and the results are shown in Table 2, from which we can observe a clear second-order convergence for the temporal approximation.
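The convergence orders reported in such tests are usually computed from errors on consecutive meshes as \(\log (e_{k-1}/e_k)/\log (h_{k-1}/h_k)\). A minimal sketch of that computation is given below; the function name `observed_orders` is hypothetical, and the error values are synthetic (generated from \(e(h)=3h^2\)), not data from the tables.

```python
import math

def observed_orders(mesh_sizes, errors):
    """Return the observed convergence order between consecutive refinement levels."""
    orders = []
    for k in range(1, len(errors)):
        orders.append(math.log(errors[k - 1] / errors[k])
                      / math.log(mesh_sizes[k - 1] / mesh_sizes[k]))
    return orders

# Synthetic errors (NOT data from the paper): e(h) = 3 h^2 mimics a second-order method.
mesh_sizes = [1 / 32, 1 / 64, 1 / 128, 1 / 256]
errors = [3.0 * h ** 2 for h in mesh_sizes]
print(observed_orders(mesh_sizes, errors))
```

With \(\tau = h^2\) the temporal error \(O(\tau ^2)=O(h^4)\) is negligible, so the observed order reflects the spatial accuracy; with \(\tau = h/2\) and a higher-order spatial error the same formula measures the temporal order.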

Table 1 Scheme convergence with triangular mesh

On the other hand, we test the \(Q_1\) finite element methods introduced in Sect. 3. Parameters and exact solutions are the same as above, except for \(\varepsilon ^2 = 0.01\). Mesh sizes are selected to test both the uniform and quasi-uniform cases. Error norms and convergence orders are shown in Table 2. The \((q+1)\)th-order spatial convergence is clearly observed.

Table 2 Outputs of Q1 MFEM

Note that the \(H^1\) semi-norms \(\Vert \nabla (u(T)-u_h^N)\Vert \) and \(\Vert \nabla (w(T)-w_h^N)\Vert \) of the errors lie between \(O(h^q)\) and \(O(h^{q+1})\). To find which term has caused this super-convergence, we run the same \(Q_1\) mixed finite element method (with \((w_{n+1},\psi _h) - (\nabla u_{n+1},\nabla \psi _h) = 0\)) on two linear parabolic systems:

$$\begin{aligned} \text {Test 1}&: \left( \frac{3u_{n+1}-4u_n+u_{n-1}}{2\tau },v_h\right) + \varepsilon ^2 (\nabla w_{n+1},\nabla v_h) = (g_{n+1},v_h);\\ \text {Test 2}&: \left( \frac{3u_{n+1}-4u_n+u_{n-1}}{2\tau },v_h\right) + \varepsilon ^2 (\nabla w_{n+1},\nabla v_h)\\&\qquad \qquad \qquad +A\tau (\nabla (w_{n+1}-w_n),\nabla v_h)= (g_{n+1},v_h). \end{aligned}$$
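Both test problems use the three-level backward difference \(\frac{3u_{n+1}-4u_n+u_{n-1}}{2\tau }\) for the time derivative. Its second-order accuracy can be illustrated on a scalar function; the sketch below assumes the test function \(\mathrm{e}^{-t}\), and the helper name `bdf2_difference` is hypothetical.

```python
import math

def bdf2_difference(u, t, tau):
    """Three-level backward difference (3u(t) - 4u(t - tau) + u(t - 2 tau)) / (2 tau)."""
    return (3.0 * u(t) - 4.0 * u(t - tau) + u(t - 2.0 * tau)) / (2.0 * tau)

def u_test(t):
    # Smooth scalar stand-in for the solution (assumed test function).
    return math.exp(-t)

def du_test(t):
    # Exact derivative of the test function.
    return -math.exp(-t)

# The truncation error should drop by a factor close to 4 when tau is halved.
previous = None
for tau in (0.02, 0.01, 0.005, 0.0025):
    err = abs(bdf2_difference(u_test, 1.0, tau) - du_test(1.0))
    factor = previous / err if previous is not None else float("nan")
    print(f"tau = {tau:<7} error = {err:.3e}  reduction factor = {factor:.3f}")
    previous = err
```

The leading truncation term is \(\frac{\tau ^2}{3}\partial _{ttt}u\) in magnitude, which is consistent with the \(\tau ^2\) contribution in (73).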

Firstly we run the experiments with \(\varepsilon ^2 = 0.01\). The results for \(\Vert u(T)-u_h^N\Vert \), \(\Vert w(T)-w_h^N\Vert \), \(\Vert \nabla (u(T)-u_h^N)\Vert \) and \(\Vert \nabla (w(T)-w_h^N)\Vert \) on uniform and quasi-uniform meshes are shown in Figs. 1 and 2 and Table 3. For the case with \(\varepsilon ^2 = 0.05\), the results are shown in Table 4. It is observed that the “super-convergence” phenomenon of \(\Vert \nabla (u(T)-u_h^N)\Vert \) and \(\Vert \nabla (w(T)-w_h^N)\Vert \) in Test 2 vanishes in Table 4, when \(A\tau = \frac{25}{16}\tau < 0.05 = \varepsilon ^2\). Therefore, the stabilization term does not lead to a better convergence order. Instead, the difference in convergence order between Test 1 and Test 2 in Table 3 may have been caused by the stabilization error, which is no longer dominant when \(A\tau < \varepsilon ^2\).

Fig. 1 Convergence results on uniform rectangular mesh with \(h = \tau = [\frac{1}{16},\frac{1}{32},\frac{1}{64},\frac{1}{128}]\)

Fig. 2 Convergence results on quasi-uniform rectangular mesh with \(2h_x = h_y = \tau = [\frac{1}{16},\frac{1}{32},\frac{1}{64},\frac{1}{128}]\)

Table 3 Convergence order of Test 1 and Test 2 on rectangular mesh with \(\varepsilon ^2 = 0.01\)
Table 4 Convergence order of Test 1 and Test 2 on rectangular mesh \(\varepsilon ^2 = 0.05\)

4.2 Energy Decay Simulation

In this subsection we aim at simulating the energy decay process of u in the PDE (4). Recall the energy functional (1) proposed at the beginning:

$$\begin{aligned} E(u) \equiv \int _{\varOmega }\left( - \frac{1}{2} \ln (1+|\nabla u|^2) +\frac{\varepsilon ^2}{2} |\varDelta u|^2\right) \text{ d }{} \mathbf x . \end{aligned}$$
(77)

We set the surface diffusion coefficient parameter as \(\varepsilon ^2=0.005\), and the computational domain is taken to be \(\varOmega = (0,12.8)^2\) with time interval [0, 20,000]. For \(\mathbf x \in \varOmega \), let \(u(\mathbf x ,0)\) take a random value in \((-\,0.05,0.05)\). For the spatial discretization, we use a resolution of \(N=256\), and as for the time step size, we set \(\tau =0.004\) (\(t<200\)), \(\tau =0.04\) (\(1000>t\ge 200\)), \(\tau = 0.08\) (\(2000>t\ge 1000\)), \(\tau = 0.16\) (\(t\ge 2000\)). When the time step size changes, we reset the last solution as the initial solution and invoke the initialization scheme again with the new time step size.
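The piecewise-constant time step schedule described above can be written down compactly. The following sketch only encodes the stage boundaries and step sizes quoted in the text; the names `STAGES`, `time_step` and `steps_per_stage` are hypothetical, and the restart with the initialization scheme at each stage change is not modelled.

```python
# (start time, end time, step size) for each stage, as quoted in the text.
STAGES = [
    (0.0, 200.0, 0.004),
    (200.0, 1000.0, 0.04),
    (1000.0, 2000.0, 0.08),
    (2000.0, 20000.0, 0.16),
]

def time_step(t):
    """Return the step size tau used at time t."""
    for start, end, tau in STAGES:
        if start <= t < end:
            return tau
    raise ValueError("t lies outside [0, 20000)")

def steps_per_stage():
    """Number of time steps taken in each stage (rounded to the nearest integer)."""
    return [round((end - start) / tau) for start, end, tau in STAGES]

print(steps_per_stage(), sum(steps_per_stage()))
```

Under this schedule the full run to \(t=20{,}000\) takes 195,000 steps, compared with 5,000,000 steps if \(\tau =0.004\) were kept throughout.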

Snapshots of the solution at \(t = 1,~500,~5000,~12{,}000\) are shown in Fig. 3. In Fig. 4 we show the energy evolution of the scheme (19)–(20) in time interval [0, 12,250].

Fig. 3 Snapshots of u at indicated time

Fig. 4 Plot of the temporal evolution of E for \(\varepsilon ^2 = 0.005\) using \(P_1\) elements

As is shown in Fig. 4, energy decays at a much faster speed at the beginning. In order to obtain the energy decay rate, below we provide the semi-log plot of the temporal evolution of E in [1, 400] using both \(P_1\) and \(P_2\) elements (Figs. 5 and 6).
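A fit of the form \(a\ln (t)+b\) is a linear least-squares problem in the variable \(\ln (t)\). The sketch below shows one way such a fit could be computed; the function name `fit_log_law` is hypothetical and the energy samples are synthetic (generated from an assumed law with \(a=-40\), \(b=-50\)), not the simulation data.

```python
import math

def fit_log_law(times, energies):
    """Least-squares fit of E ~ a*ln(t) + b; returns (a, b)."""
    xs = [math.log(t) for t in times]
    n = len(xs)
    x_mean = sum(xs) / n
    e_mean = sum(energies) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxe = sum((x - x_mean) * (e - e_mean) for x, e in zip(xs, energies))
    a = sxe / sxx
    return a, e_mean - a * x_mean

# Synthetic samples on [1, 400] (NOT the paper's data), used only to exercise the fit.
times = [1.0 + 0.5 * k for k in range(799)]
energies = [-40.0 * math.log(t) - 50.0 for t in times]
a, b = fit_log_law(times, energies)
print(f"a = {a:.4f}, b = {b:.4f}")
```

On a semi-log plot the fitted law appears as a straight line with slope a, which is how the decay rate is read off.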

Fig. 5 Semi-log plot of the temporal evolution of E for \(\varepsilon ^2 = 0.005\) using \(P_1\) elements. The blue line represents the energy obtained by the numerical simulation, while the dashed red line is a least square approximation to the energy data. The fitted line has the form \(a\ln (t)+b\), with \(a = -40.59\), \(b = -47.94\) (Color figure online)

Fig. 6 Semi-log plot of the temporal evolution of E for \(\varepsilon ^2 = 0.005\) using \(P_2\) elements. The blue line represents the energy obtained by the numerical simulation, while the dashed red line is a least square approximation to the energy data. The fitted line has the form \(a\ln (t)+b\), with \(a = -39.7\), \(b = -51.5\) (Color figure online)

Fig. 7 The log–log plot of the temporal evolution of the average height of u with \(\varepsilon ^2 = 0.005\) using \(P_1\) elements. The blue line represents the data obtained by the numerical simulation, while the dashed red line is a least square approximation to the height data. The fitted line has the form \(at^b\), with \(a = 0.336\), \(b = 0.5341\) (Color figure online)

Fig. 8 The log–log plot of the temporal evolution of the average slope of u with \(\varepsilon ^2 = 0.005\) using \(P_1\) elements. The blue line represents the data obtained by the numerical simulation, while the dashed red line is a least square approximation to the slope data. The fitted line has the form \(at^b\), with \(a = 2.348\), \(b = 0.2526\) (Color figure online)

4.3 Other Physical Quantities

In this subsection we look at two more physically interesting quantities, i.e., the characteristic height h(t) and the average slope m(t), of which the expressions are as below:

$$\begin{aligned} h(t)&= \sqrt{\frac{1}{|\varOmega |}\int _\varOmega |u(\mathbf x ,t)-\bar{u}(t)|^2 ~\text{ d }{} \mathbf x }, \quad \text{ with } \, \, \bar{u}(t):=\frac{1}{|\varOmega |}\int _\varOmega u(\mathbf x ,t)~\text{ d }{} \mathbf x .\\ m(t)&=\sqrt{\frac{1}{|\varOmega |}\int _\varOmega |\nabla u(\mathbf x ,t)|^2 ~\text{ d }{} \mathbf x }. \end{aligned}$$
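As an illustration of these two definitions, the sketch below evaluates discrete analogues of h and m from nodal values on a uniform grid. It rests on several assumptions that are not in the original: the integrals are replaced by plain grid averages, the gradient by central differences with periodic wrap-around (chosen for simplicity, whereas the paper uses Neumann boundary conditions and finite elements), and the function name `roughness_and_slope` is hypothetical.

```python
import math

def roughness_and_slope(u, dx):
    """Discrete analogues of h(t) and m(t) for nodal values u[i][j] on a uniform,
    periodic n x n grid with spacing dx (assumed setting, see lead-in)."""
    n = len(u)
    count = n * n
    mean = sum(sum(row) for row in u) / count
    # Root-mean-square deviation from the mean height.
    h = math.sqrt(sum((v - mean) ** 2 for row in u for v in row) / count)
    grad_sq = 0.0
    for i in range(n):
        for j in range(n):
            # Central differences; index -1 and the modulo give periodic wrap-around.
            ux = (u[(i + 1) % n][j] - u[i - 1][j]) / (2.0 * dx)
            uy = (u[i][(j + 1) % n] - u[i][j - 1]) / (2.0 * dx)
            grad_sq += ux * ux + uy * uy
    m = math.sqrt(grad_sq / count)
    return h, m

# Example: u = cos(2*pi*x) on the unit square, for which h = 1/sqrt(2), m = 2*pi/sqrt(2).
n = 64
dx = 1.0 / n
u = [[math.cos(2.0 * math.pi * i * dx) for _ in range(n)] for i in range(n)]
h, m = roughness_and_slope(u, dx)
print(f"h = {h:.4f}, m = {m:.4f}, h/m = {h / m:.4f}")
```

The ratio h/m printed at the end is the discrete counterpart of a characteristic length built from the two quantities.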

For the no-slope-selection model (4), one could obtain \(h\sim O\left( t^{1/2}\right) \), \(m(t) \sim O\left( t^{1/4}\right) \), and \(E\sim O\left( -\ln (t)\right) \) as \(t\rightarrow \infty \). (See [5, 10, 15, 16] and references therein.) This implies that the characteristic (average) length \(\ell (t) := h(t)/m(t) \sim O\left( t^{1/4}\right) \) as \(t\rightarrow \infty \), so that the average length and average slope scale the same with increasing time. Of course, the average mound height h(t) grows faster than the average length \(\ell (t)\), which is expected because there is no preferred slope of the height function.

At a theoretical level, the detailed analyses in [12, 13, 16] have indicated (at best) lower bounds for the energy dissipation and, conversely, upper bounds for the average height. On the other hand, the rates quoted as the upper or lower bounds are typically observed for the averaged values of the quantities of interest. To adequately capture the full range of coarsening behaviors, numerical simulations for the coarsening process require short- and long-time accuracy and stability, in addition to high spatial accuracy for small values of \(\varepsilon \).

Under the same mesh settings as in Sect. 4.2, here we provide the log–log plot for h(t) and m(t) in time interval [1, 19,839]. In fact, these two quantities can be easily measured experimentally. Rigorously, the lower bound for the energy decay rate is of the order of \(- \ln (t)\), the upper bounds for the average height and average slope/average length are of the order of \(t^{1/2}\), \(t^{1/4}\), respectively. Figures 7 and 8 present the log–log plots for the average height versus time, and average slope versus time, respectively. The detailed scaling “exponents” are obtained using least squares fits of the computed data up to \(t=400\). A clear observation of the \(t^{1/2}\) and \(t^{1/4}\) scaling laws can be made, with different coefficients dependent upon \(\varepsilon \), or, equivalently, the domain size, L.
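Scaling exponents of the form \(at^b\) are commonly extracted by a linear least-squares fit in log–log coordinates, since \(\ln (at^b)=\ln a+b\ln t\). A minimal sketch of this procedure follows; the function name `fit_power_law` is hypothetical, and the height and slope samples are synthetic (generated from assumed laws \(0.3\,t^{1/2}\) and \(2\,t^{1/4}\)), not the computed data.

```python
import math

def fit_power_law(times, values):
    """Fit values ~ a * t**b by linear least squares on (ln t, ln value); returns (a, b)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(v) for v in values]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_mean - b * x_mean)
    return a, b

# Synthetic samples up to t = 400 (NOT the paper's data).
times = [1.0 + k for k in range(400)]
heights = [0.3 * t ** 0.5 for t in times]
slopes = [2.0 * t ** 0.25 for t in times]
print("height fit:", fit_power_law(times, heights))
print("slope fit: ", fit_power_law(times, slopes))
# The characteristic length h/m then scales with exponent 1/2 - 1/4 = 1/4.
print("length fit:", fit_power_law(times, [h / m for h, m in zip(heights, slopes)]))
```

Fitting in log–log coordinates weights relative rather than absolute deviations, which is one reason the fitted exponents can differ slightly from those of a direct nonlinear fit.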

5 Concluding Remarks

In this paper we have presented a second order accurate, linear energy stable numerical scheme for a thin film model without slope selection, with a mixed finite element approximation in space. The unconditional unique solvability and unconditional long time energy stability have been justified at a theoretical level. We also obtain an \(O(h^{q}+\tau ^2)\)-order convergence analysis in the \(\ell ^\infty (0,T; H^2)\) norm, when \(\tau \) is sufficiently small and \(h<1\). In addition, using a regular rectangular mesh, we can improve the spatial convergence to \((q+1)\)th order under the current Neumann boundary condition assumptions. Furthermore, the numerical experiments showed that the proposed second-order scheme is able to produce accurate long time numerical results with a reasonable computational cost.