1 Introduction

Since the pioneering work of Xu [27, 28], two-grid methods have received a lot of attention and successfully applied to approximate various (linear and) nonlinear problems (see, for example, [2, 9, 14, 15, 19, 20, 22, 29]). The two-grid scheme is based on passing information between finite element equations defined on two grids of different mesh sizes. In the first step, a nonlinear problem itself is solved in a coarse space, i.e., finite dimensional space with coarse grids. In the second step, the nonlinear problem is linearized locally at the solution obtained in the coarse space. Then, the linearized problem is solved in a fine space. For better performance, this process can be iterated on a sequence of linearized problems with increasing dimensions.

In this paper, we propose multi-level algorithms for a class of nonlinear problems using a two-grid idea. For error analysis, we use the abstract framework of Brezzi, Rappaz, and Raviart (BRR) [4] and the approach proposed by Caloz and Rappaz [7]. The framework was designed for approximation of branches of nonsingular solutions for a class of nonlinear problem. As an application of our algorithm, we consider the pseudostress-velocity formulation of the stationary Navier–Stokes equations (NSE). It should be noted that our algorithm can be applied to other problems such as semilinear elliptic problems (cf. [20]).

The BRR theory applied to the velocity-pressure formulation of the NSE can be found in [13]. The theory was applied to study mixed approximation of scalar elliptic problems with gradient nonlinearities in [18]. The pseudostress-velocity mixed formulation (see [1, 5, 8, 11]) of stationary NSE was presented in [6]. There, it is shown that for sufficiently small mesh size, the discrete problem has a branch of nonsingular solutions in a neighborhood of the solution of the continuous problem and provides an error bound \({\mathcal O}(h^{k+1-\frac{d}{6}})\) in the \(L^3(\Omega )^{d\times d}\times L^3(\Omega )^d\) norms, where the Raviart–Thomas index \(k\ge 0\) and d is the spatial dimension. Recently, in [21], we obtained optimal a priori error estimates \({{\mathcal {O}}}(h^{k+1})\) in the \(L^3(\Omega )^{d\times d}\times L^6(\Omega )^d\) norms, thus improving the result [6]. This is made possible by applying Petrov–Galerkin type BRR theory [7] rather than the standard BRR theory [13].

In [20], we presented a unified error analysis of two-grid algorithm for a class of nonlinear problems by recasting the two-grid method of Xu [27, 28] in the framework of BRR. Moreover, the error bounds \({{\mathcal {O}}}(h^{k+1}+H^{2k+2})\), \({{\mathcal {O}}}(h^{k+1}+H^{3k+3})\) and \({{\mathcal {O}}}(h^{k}+H^{5k})\) measured in several norms of interest are obtained. In this paper, we further develop this approach to efficiently compute numerical solutions by multilevel mesh refinement strategy. The two-grid algorithm is first applied, and in the subsequent process mesh refinement is exploited to deliver a desired accuracy and convergence; as the mesh is being refined, the solution on a given mesh is exploited as an accurate starting iterate for solutions on the next mesh level. For details, see Multi-level Algorithm in Sect. 2.

Optimal fine grid accuracy of two-grid/multi-level algorithms can be achieved via the proper scaling of relevant meshes under regular mesh refinements. An important aspect of the proposed algorithms is that for computational efficiency regular mesh refinement can be used in conjunction with Newton-type methods for system solution in contrast to the usual Newton’s method on a fixed mesh.

The remainder of this paper is organized as follows. In the next section, we introduce approximation of branches of nonsingular solutions based on the BRR theory. Before we introduce our algorithm, we investigate the standard Newton’s method for nonlinear algebraic problem. Quadratic convergence of Newton’s method is proved with the constant appearing in the estimate being independent of the mesh size h, while convergence in the existing result [10, 12, 24] deteriorates as h approaches 0. We develop two-grid/multi-level algorithms and analyze convergence of the algorithms. Section 3 is devoted to applying the algorithms to stationary NSE and analyzing the a priori error estimates in the \(L^3(\Omega )^{d\times d}\times L^6(\Omega )^d\) norm.

In Sect. 4, several numerical results are given to show the performance of the algorithm and to validate the theory developed in this paper. A simple semilinear elliptic problem is first considered and then the NSE. Concluding remarks are given in the last section.

2 Two-Grid/Multi-level Algorithms

In this section, we start with the abstract approximation theory of Brezzi, Rappaz, and Raviart for nonlinear problems developed in [7, 13]. Then, we present two-grid/multi-level algorithms to solve resulting nonlinear algebraic equations.

Let X, \({\mathscr {X}}\) and Y be real Banach spaces with the norms \(\Vert \cdot \Vert _X\), \(\Vert \cdot \Vert _{{\mathscr {X}}}\) and \(\Vert \cdot \Vert _Y\), respectively and assume that \(X\hookrightarrow {\mathscr {X}}\), continuously imbedding. Let \(\Lambda \subset {\mathbb {R}}\) be a compact interval and denote \({\mathscr {L}}(X;Y)\) by the set of all linear and continuous operators from X into Y. For given \({\mathcal {C}}^2\) map \(G:\Lambda \times X\rightarrow Y\), we consider the following nonlinear problem: Find \((\nu ,\phi )\in \Lambda \times X\) such that

$$\begin{aligned} F(\nu ,\phi ):=\phi +SG(\nu ,\phi )=0, \end{aligned}$$
(1)

where \(S\in {\mathscr {L}}(Y;{\mathscr {X}})\) is a linear operator independent of \(\nu \).

The set \(\{(\nu ,\phi (\nu )):\nu \in \Lambda \}\) is called a branch of solutions of (1) if \(F(\nu ,\phi (\nu ))=0\) and the map \(\nu \rightarrow \phi (\nu )\) is a continuous from \(\Lambda \) into X. If the Fréchet derivative \(D_\phi F(\nu ,\phi (\nu ))\) of F with respect to \(\phi \) is an isomorphism from X onto \({\mathscr {X}}\) for all \(\nu \in \Lambda \), then the branch of solutions \(\{(\nu ,\phi (\nu )):\nu \in \Lambda \}\) is called nonsingular.

To approximate the solution of the problem (1), we introduce an operator \(S_h\in {\mathscr {L}}(Y;{\mathscr {X}})\) intended to approximate the linear operator S, where \(h>0\) is a real parameter which will tend to zero. The approximation of nonlinear problem (1) is to find \(\phi ^h\in X\) such that

$$\begin{aligned} F_h(\nu ,\phi ^h):=\phi ^h+S_hG(\nu ,\phi ^h)=0. \end{aligned}$$
(2)

We assume that the following hypotheses.

  1. (H1)

    G is a \({\mathcal {C}}^2\)-mapping from \(\Lambda \times X\) into Y, and the second-order Fréchet derivatives of G are bounded on all bounded subsets of \(\Lambda \times X\).

  2. (H2)

    There exists another Banach space \(Z\hookrightarrow Y\), continuously imbedding, such that

    $$\begin{aligned}D_\phi G(\nu ,\phi )\in {\mathscr {L}}(X;Z)\quad \forall \nu \in \Lambda ,\quad \forall \phi \in X.\end{aligned}$$
  3. (H3)

    For all \(y\in Y\), \(\displaystyle \lim _{h\rightarrow 0} \Vert (S-S_h)y\Vert _{{\mathscr {X}}}=0.\)

  4. (H4)

    \(\displaystyle \lim _{h\rightarrow 0}\Vert S-S_h\Vert _{Z;{\mathscr {X}}}=0.\)

Under the hypotheses, existence of a unique branch of nonsingular solutions of problem (2), its a priori error estimate and its convergence property are stated below. We skip the proof since it is very similar to that of Theorem IV.3.3 elaborated in [13]. We introduce the following quantities:

$$\begin{aligned} \mu (\nu )&:=\Vert D_{\phi }F(\nu ,\phi (\nu ))^{-1}\Vert _{{\mathscr {X}};X}, \quad {\bar{\mu }}:=\displaystyle \sup _{\nu \in \Lambda }\mu (\nu ),\\ L(\psi _*(\nu ),\xi )&:=\sup _{\nu \in \Lambda , \psi \in B(\psi _*(\nu ),\xi )}\Vert D^2_\phi G(\nu ,\psi )\Vert _{X^2;Y}, \end{aligned}$$

where \(B(\psi _*(\nu ),\xi ):=\{\psi \in X; \Vert \psi _*(\nu )-\psi \Vert _X\le \xi \}\).

Theorem 2.1

Assume that (H1)–(H4) hold and that \(\{(\nu ,\phi (\nu )):\nu \in \Lambda \}\) is a branch of nonsingular solutions of (1). Then there exist \({\bar{\xi }}>0\) and \({\bar{h}}({\bar{\xi }})>0\) such that for all \(h\le {\bar{h}}({\bar{\xi }})\), there is a \({\mathcal {C}}^2\)-function \(\nu \in \Lambda \rightarrow \phi ^h(\nu )\in X\) satisfying that \(\{(\nu ,\phi ^h(\nu )):\nu \in \Lambda \}\) is a branch of nonsingular solutions of (2), \(\phi ^h(\nu )\) is the only solution of (2) in the ball \(B(\phi (\nu ),{\bar{\xi }})\) for all \(\nu \in \Lambda \), and for \(K_1:=4\displaystyle \mu (\nu )\) and a constant \(C_S>0\), the following estimates hold:

$$\begin{aligned} \Vert \phi (\nu )-\phi ^h(\nu )\Vert _X&\le K_1\Vert F_h(\nu ,\phi (\nu ))\Vert _{{\mathscr {X}}}= K_1\Vert (S-S_h)G(\nu ,\phi (\nu ))\Vert _{{\mathscr {X}}}, \quad \forall \nu \in \Lambda , \end{aligned}$$
(3)
$$\begin{aligned} \Vert D_{\phi }F_h(\nu ,\phi ^h(\nu ))^{-1}\Vert _{{\mathscr {X}};X}&\le 2\Vert D_{\phi }F_h(\nu ,\phi (\nu ))^{-1}\Vert _{{\mathscr {X}};X}\le 4\Vert D_{\phi }F(\nu ,\phi (\nu ))^{-1}\Vert _{{\mathscr {X}};X}\le K_1,\end{aligned}$$
(4)
$$\begin{aligned} \Vert S_h\Vert _{Z;{\mathscr {X}}}&\le C_S. \end{aligned}$$
(5)

Note that since \(\Lambda \) is compact, the function \(\mu (\nu )\) is bounded above on \(\Lambda \). Thus if needed, we can take \(K_1\) as the constant independent of not only h but also \(\nu \), i.e., \(K_1:=4{\bar{\mu }}.\)

Lemma 2.2

Assume that (H1)–(H4) hold and that \(D_{\phi }F(\nu ,{\tilde{\phi }})\) for \({\tilde{\phi }}\in X\) is an isomorphism from X onto \({\mathscr {X}}\). Then there exists \(\xi _*=\xi _*({\tilde{\phi }})>0\) such that if \(\psi \in B({\tilde{\phi }},\xi )\cap X\) for \(0<\xi \le \xi _*\), then \(D_{\phi }F(\nu ,\psi )\) is an isomorphism from X onto \({\mathscr {X}}\) and there exists \(h_*=h_*({\tilde{\phi }})>0\) such that if \(0<h\le h_*\), then \(D_{\phi }F_h(\nu ,{\tilde{\phi }})\) is an isomorphism from X onto \({\mathscr {X}}\) with the bound, respectively

$$\begin{aligned} \Vert D_{\phi }F(\nu ,\psi )^{-1}\Vert _{{\mathscr {X}};X}&\le 2 \Vert D_{\phi }F(\nu ,{\tilde{\phi }})^{-1}\Vert _{{\mathscr {X}};X}, \end{aligned}$$
(6)
$$\begin{aligned} \Vert D_{\phi }F_h(\nu ,{\tilde{\phi }})^{-1}\Vert _{{\mathscr {X}};X}&\le 2 \Vert D_{\phi }F(\nu ,{\tilde{\phi }})^{-1}\Vert _{{\mathscr {X}};X}. \end{aligned}$$
(7)

Proof

Note that

$$\begin{aligned} D_{\phi }F(\nu ,{\tilde{\phi }})-D_{\phi }F(\nu ,\psi )=S(D_{\phi }G(\nu ,{\tilde{\phi }})-D_{\phi }G(\nu ,\psi )). \end{aligned}$$

So,

$$\begin{aligned}&\Vert D_{\phi }F(\nu ,{\tilde{\phi }})^{-1}[D_{\phi }F(\nu ,{\tilde{\phi }})-D_{\phi }F(\nu ,\psi )]\Vert _{X;X}\\&\quad \le \Vert D_{\phi }F(\nu ,{\tilde{\phi }})^{-1}\Vert _{{\mathscr {X}};X}\Vert S\Vert _{Y;{\mathscr {X}}}\Vert D_{\phi }G(\nu ,{\tilde{\phi }})-D_{\phi }G(\nu ,\psi )\Vert _{X;Y}. \end{aligned}$$

If \(\psi \in B({\tilde{\phi }},\xi )\cap X\), then by the mean-value theorem and (H1) we have

$$\begin{aligned} \Vert D_{\phi }G(\nu ,{\tilde{\phi }})-D_{\phi }G(\nu ,\psi )\Vert _{X;Y} \le L({\tilde{\phi }},\xi )\Vert {\tilde{\phi }}-\psi \Vert _X \le \xi L({\tilde{\phi }},\xi ). \end{aligned}$$

Since the function \(\xi \rightarrow L({\tilde{\phi }},\xi )\) is monotonically increasing, we can take sufficiently small \(\xi _*=\xi _*({\tilde{\phi }})>0\) (e.g., \(\xi _* L({\tilde{\phi }},\xi _*)\le 1/(2\Vert D_{\phi }F(\nu ,{\tilde{\phi }})^{-1}\Vert _{{\mathscr {X}};X}\Vert S\Vert _{Y;{\mathscr {X}}})\)) such that for \(\xi \) with \(0<\xi \le \xi _*\)

$$\begin{aligned} \Vert D_{\phi }F(\nu ,{\tilde{\phi }})^{-1}&[D_{\phi }F(\nu ,{\tilde{\phi }})-D_{\phi }F(\nu ,\psi )]\Vert _{X;X}\le 1/2 \end{aligned}$$

As a consequence,

$$\begin{aligned} D_{\phi }F(\nu ,\psi )= D_{\phi }F(\nu ,{\tilde{\phi }})(I-D_{\phi }F(\nu ,{\tilde{\phi }})^{-1}[D_{\phi }F(\nu ,{\tilde{\phi }})-D_{\phi }F(\nu ,\psi )]) \end{aligned}$$

is an isomorphism from X onto \({\mathscr {X}}\) and

$$\begin{aligned} \Vert D_{\phi }F(\nu ,\psi )^{-1}\Vert _{{\mathscr {X}};X}&\le 2 \Vert D_\phi F(\nu ,{\tilde{\phi }})^{-1}\Vert _{{\mathscr {X}};X}. \end{aligned}$$

Next, we are going to show (7). Together with the identity

$$\begin{aligned} D_{\phi }F(\nu ,{\tilde{\phi }})-D_{\phi }F_h(\nu ,{\tilde{\phi }})=(S-S_h)D_{\phi }G(\nu ,{\tilde{\phi }}), \end{aligned}$$

it follows from (H2) and (H4) that there exists \(h_*=h_*({\tilde{\phi }})>0\) such that for \(0<h\le h_*\),

$$\begin{aligned} \Vert D_{\phi }F(\nu ,{\tilde{\phi }})^{-1}&[D_{\phi }F(\nu ,{\tilde{\phi }})-D_{\phi }F_h(\nu ,{\tilde{\phi }})]\Vert _{X;X} \le 1/2. \end{aligned}$$

Considering

$$\begin{aligned} D_{\phi }F_h(\nu ,{\tilde{\phi }})=D_{\phi }F(\nu ,{\tilde{\phi }})(I-D_{\phi }F(\nu ,{\tilde{\phi }})^{-1}[D_{\phi }F(\nu ,{\tilde{\phi }})-D_{\phi }F_h(\nu ,{\tilde{\phi }})]), \end{aligned}$$

we see that \(D_{\phi }F_h(\nu ,{\tilde{\phi }})\) is an isomorphism from X onto \({\mathscr {X}}\) and

$$\begin{aligned} \Vert D_{\phi }F_h(\nu ,{\tilde{\phi }})^{-1}\Vert _{{\mathscr {X}};X}\le 2 \Vert D_{\phi }F(\nu ,{\tilde{\phi }})^{-1}\Vert _{{\mathscr {X}};X}. \end{aligned}$$

\(\square \)

Remark 2.3

When F is replaced by \(F_h\) for h small enough in Lemma 2.2, the estimate (6) also holds. That is, if \(D_{\phi }F_h(\nu ,{\tilde{\phi }})\) for \({\tilde{\phi }}\in X\) is an isomorphism from X onto \({\mathscr {X}}\), then there exists \(\xi _*=\xi _*({\tilde{\phi }})>0\) such that for \(\psi \in B({\tilde{\phi }},\xi )\cap X\) with \(0<\xi \le \xi _*\), \(D_{\phi }F_h(\nu ,\psi )\) is an isomorphism from X onto \({\mathscr {X}}\) with the bound

$$\begin{aligned} \Vert D_{\phi }F_h(\nu ,\psi )^{-1}\Vert _{{\mathscr {X}};X}&\le 2\Vert D_{\phi }F_h(\nu ,{\tilde{\phi }})^{-1}\Vert _{{\mathscr {X}};X}. \end{aligned}$$
(8)

Before we introduce two-grid algorithm, we consider the standard Newton’s method for nonlinear algebraic problem (2): Choose an initial guess \(\phi ^{h}_{0}\in X\) and determine the sequence \((\phi ^{h}_{i})\subset X\) by

$$\begin{aligned} \phi ^{h}_{i}=\phi ^{h}_{i-1}-[D_{\phi }F_h(\nu ,\phi ^{h}_{i-1})]^{-1}F_h(\nu ,\phi ^{h}_{i-1}),~~i=1, 2, \cdots . \end{aligned}$$
(9)

We investigate quadratic convergence of Newton’s method in the following theorem. Note that the constant \({\bar{K}}\) appearing in the estimate is independent of h, while convergence in the existing result [10, 12, 24] deteriorates with the mesh size h as h approaches 0.

Theorem 2.4

Assume that (H1)–(H4) hold. Let \({\bar{\xi }}\) and \({\bar{h}}\) be the constants given in Theorem 2.1 and for \(0<h\le {\bar{h}}\), let \(\phi ^h(\nu )\) be the only nonsingular solution of (2) in the ball \(B(\phi (\nu ),{\bar{\xi }})\). If \((\phi ^{h}_{i})\) is the sequence obtained from Newton’s method (9), then there exists \(\xi >0\) with \(\xi \le {\bar{\xi }}\) such that for each initial guess \(\phi ^{h}_{0}\) in \(B(\phi ^h(\nu ),\xi )\), Newton’s method determines a unique sequence \((\phi ^{h}_{i})\subset B(\phi ^h(\nu ),\xi )\) that converges to the solution \(\phi ^h(\nu )\). Furthermore, the convergence is quadratic: For a positive constant \({\bar{K}}\) independent of h, we have

$$\begin{aligned} \Vert \phi ^h(\nu )-\phi ^{h}_{i}\Vert _X\le {\bar{K}}\Vert \phi ^h(\nu )-\phi ^{h}_{i-1}\Vert ^2_X. \end{aligned}$$

Proof

For simplicity, \(\nu \) is dropped from the functions \(\phi ^h(\nu ),~\phi (\nu )\).

It follows from Remark 2.3 that there exists \(\xi _*=\xi _*(\phi ^h)\) such that for \(0<\xi \le \xi _*\), \(D_{\phi }F_h(\nu ,\psi )\) is an isomorphism from X onto \({\mathscr {X}}\) for all \(\psi \in B(\phi ^h,\xi )\) with the bound

$$\begin{aligned} \Vert D_{\phi }F_h(\nu ,\psi )^{-1}\Vert _{{\mathscr {X}};X}\le 2\Vert D_{\phi }F_h(\nu ,\phi ^{h})^{-1}\Vert _{{\mathscr {X}};X}\le 8{\bar{\mu }}. \end{aligned}$$
(10)

Note that the second inequality of the estimate (10) is derived from (4).

Now we prove that if \(\phi ^h_0\in B(\phi ^h,\xi )\) for a constant \(\xi \) with \(0<\xi \le \xi _*\) given above, then a sequence \((\phi ^{h}_{i})\) defined by (9) is in \(B(\phi ^h,\xi )\) and it converges to \(\phi ^h\). We proceed by induction on i: Suppose that \(\phi ^{h}_{i-1}\in B(\phi ^h,\xi )\), then \(D_{\phi }F_h(\nu ,\phi ^{h}_{i-1})^{-1}\) exists and Newton’s method (9) yields

$$\begin{aligned} \phi ^{h}_{i}-\phi ^h&=\phi ^{h}_{i-1}-\phi ^h+D_{\phi }F_h(\nu ,\phi ^{h}_{i-1})^{-1}(F_h(\nu ,\phi ^h)-F_h(\nu ,\phi ^{h}_{i-1}))\\&=D_{\phi }F_h(\nu ,\phi ^{h}_{i-1})^{-1}\left[ F_h(\nu ,\phi ^h)-F_h(\nu ,\phi ^{h}_{i-1})-D_{\phi }F_h(\nu ,\phi ^{h}_{i-1})(\phi ^h-\phi ^{h}_{i-1})\right] . \end{aligned}$$

The integral form of the mean value theorem

$$\begin{aligned} F_h(\nu ,\phi ^h)-F_h(\nu ,\phi ^{h}_{i-1})=\int _0^1 D_{\phi }F_h(\nu ,\phi ^{h}_{i-1}+t(\phi ^h-\phi ^{h}_{i-1}))(\phi ^h-\phi ^{h}_{i-1})\,dt \end{aligned}$$

leads to

$$\begin{aligned} \phi ^{h}_{i}-\phi ^h= & {} D_{\phi }F_h(\nu ,\phi ^{h}_{i-1})^{-1}\int _0^1\left[ D_{\phi }F_h(\nu ,\phi ^{h}_{i-1}+t(\phi ^h-\phi ^{h}_{i-1}))-D_{\phi }F_h(\nu ,\phi ^{h}_{i-1})\right] \\&\quad (\phi ^h-\phi ^{h}_{i-1})\,dt\\= & {} D_{\phi }F_h(\nu ,\phi ^{h}_{i-1})^{-1}\int _0^1S_h\left[ D_{\phi }G(\nu ,\phi ^{h}_{i-1}+t(\phi ^h-\phi ^{h}_{i-1}))-D_{\phi }G(\nu ,\phi ^{h}_{i-1})\right] \\&\quad (\phi ^h-\phi ^{h}_{i-1})\,dt. \end{aligned}$$

We take \(\xi :=\min \{{\bar{\xi }}, \xi _*, 1/(2{\bar{K}})\}\) for \({\bar{K}}:=8{\bar{\mu }}C_SL(\phi ,2{\bar{\xi }})\). Then it follows from (10), (5), the mean-value theorem and (H1) that

$$\begin{aligned} \Vert \phi ^h-\phi ^{h}_{i}\Vert _X&\le 8{\bar{\mu }}C_SL(\phi ^h,\xi )\Vert \phi ^h-\phi ^{h}_{i-1}\Vert _X^2. \end{aligned}$$

Since \(L(\phi ^h,\xi )\le L(\phi ,2{\bar{\xi }})\) by the inclusion \(B(\phi ^h,\xi )\subset B(\phi ,2{\bar{\xi }})\), we have

$$\begin{aligned} \Vert \phi ^h-\phi ^{h}_{i}\Vert _X&\le \frac{1}{2}\Vert \phi ^h-\phi ^{h}_{i-1}\Vert _X\le \frac{1}{2}\xi . \end{aligned}$$

Hence \(\phi ^{h}_{i}\) belongs to \(B(\phi ^h,\xi )\), the sequence \((\phi ^{h}_{i})\) converges to the solution \(\phi ^h\) of (2), and the convergence is quadratic. \(\square \)

Now, we introduce numerical algorithms to practically solve nonlinear problem (2). Assume that for each value of the parameter \(h>0\), the range of the operator \(S_h\) is a finite-dimensional subspace \(X_h\) of X.

Algorithm 1: Xu’s two-grid algorithm

      Step 1: Solve nonlinear system on coarse mesh

         Find \(\phi ^H(\nu )\in X_H\) such that \(F_H(\nu ,\phi ^H(\nu ))=0\).

      Step 2: Update on fine mesh with one Newton iteration

         Find \(\phi _h(\nu )\in X_h\) such that \(D_{\phi }F_h(\nu ,\phi ^H(\nu ))(\phi _h(\nu )-\phi ^H(\nu ))=-F_h(\nu ,\phi ^H(\nu ))\).

We will take \(H\le {\bar{h}}\) given in Theorem 2.1 as mesh size of coarse grid and h as mesh size of fine grid in applications. Nested relations, \(X_{H}\subset X_h\subset X\) are assumed to guarantee that \(\phi ^{H}(\nu ) \in X_h\). The invertibility of \(D_{\phi }F_h(\nu ,\phi ^H(\nu ))\) for \(h<H\le h^*\) small enough in Step 2 of Algorithm 1 comes from the assumption that \(D_{\phi }F(\nu ,\phi (\nu ))\) is an isomorphism from X onto \({\mathscr {X}}\).

The following theorem generalizes the theory covered in the paper [20], where \({\mathscr {X}} = X \) was considered. As it is well-known, Newton’s method shows quadratic convergence behaviors. Indeed, we see that our Algorithm 1 has quadratic convergence in the following sense:

Theorem 2.5

Assume that (H1)–(H4) hold. Let \(\phi ^h(\nu )\) and \(\phi ^H(\nu )\) be nonsingular solutions of (2) on \(X_h\) and \(X_H\), respectively. And let \(\phi _h(\nu )\) be the solution obtained from Step 2 of Algorithm 1. Then there exist \(\xi >0\) with \(\xi \le {\bar{\xi }}\) given in Theorem 2.1 and \(h^*>0\) such that for \(h<H\le h^*\), \(\phi ^h(\nu )\) and \(\phi ^H(\nu )\) belong to the ball \(B(\phi (\nu ),\xi /2)\) and \(\phi _h(\nu )\) belongs to the ball \(B(\phi ^h(\nu ),\xi /2)\). Moreover, for the positive constant \(K_2:=4{\bar{\mu }}C_SL({\bar{\xi }})\) independent of mesh sizes h, H and \(\nu \), we have

$$\begin{aligned} \Vert \phi ^h(\nu )-\phi _h(\nu )\Vert _X\le K_2 \Vert \phi ^h(\nu )-\phi ^H(\nu )\Vert _X^2, \end{aligned}$$
(11)

and for \(K_3:=\max \{K_1,2K_1^2K_2\}\), we have

$$\begin{aligned}&\Vert \phi (\nu )-\phi _h(\nu )\Vert _X\le K_3\big (\Vert (S-S_h)G(\nu ,\phi (\nu ))\Vert _{{\mathscr {X}}}+\Vert (S-S_h)G(\nu ,\phi (\nu ))\Vert _{{\mathscr {X}}}^2 \nonumber \\&\quad +\Vert (S-S_H)G(\nu ,\phi (\nu ))\Vert _{{\mathscr {X}}}^2\big ). \end{aligned}$$
(12)

Proof

For simplicity, \(\nu \) is dropped from the functions \(\phi (\nu ),~\phi _h(\nu )\) and \(\phi ^H(\nu )\). It follows from Lemma 2.2 that there exist two values \(\xi _*=\xi _*(\phi )\) and \(h_*=h_*(\phi )\) such that for all \(\xi \) and h with \(0<\xi \le \xi _*\) and \(0<h\le h_*\), respectively, \(D_{\phi }F_h(\nu ,\phi ^H)\) is an isomorphism from X onto \({\mathscr {X}}\) for \(\phi ^H\in B(\phi ,\xi )\) with the bound

$$\begin{aligned} \Vert D_{\phi }F_h(\nu ,\phi ^H)^{-1}\Vert _{{\mathscr {X}};X}\le 2\Vert D_{\phi }F_h(\nu ,\phi )^{-1}\Vert _{{\mathscr {X}};X} \le 4\Vert D_{\phi }F(\nu ,\phi )^{-1}\Vert _{{\mathscr {X}};X}\le 4{\bar{\mu }}. \end{aligned}$$
(13)

Also, Theorem 2.1 asserts that for a given \(\xi \) with \(0<\xi \le {\bar{\xi }}\) there exists \(h'=h'(\xi /2)~(\le {\bar{h}})\) such that for \(h<H\le h'\), \(\phi ^h\in B(\phi ,\xi /2)\cap X_h\) and \(\phi ^H\in B(\phi ,\xi /2)\cap X_H\).

From Algorithm 1 and the integral form of the mean value theorem,

$$\begin{aligned} \phi _h-\phi ^h= & {} \phi ^H-\phi ^h+D_{\phi }F_h(\nu ,\phi ^H)^{-1}(F_h(\nu ,\phi ^h)-F_h(\nu ,\phi ^H))\\= & {} D_{\phi }F_h(\nu ,\phi ^H)^{-1}\left[ F_h(\nu ,\phi ^h)-F_h(\nu ,\phi ^H)-D_{\phi }F_h(\nu ,\phi ^H)(\phi ^h-\phi ^H)\right] \\= & {} D_{\phi }F_h(\nu ,\phi ^H)^{-1}\int _0^1\left[ D_{\phi }F_h(\nu ,\phi ^H+t(\phi ^h-\phi ^H))\right. \\&\quad \left. -D_{\phi }F_h(\nu ,\phi ^H)\right] (\phi ^h-\phi ^H)\,dt\\= & {} D_{\phi }F_h(\nu ,\phi ^H)^{-1}\int _0^1S_h\left[ D_{\phi }G(\nu ,\phi ^H+t(\phi ^h-\phi ^H))\right. \\&\quad \left. -D_{\phi }G(\nu ,\phi ^H)\right] (\phi ^h-\phi ^H)\,dt. \end{aligned}$$

We choose \(\xi :=\min \{{\bar{\xi }}, \xi _*, 1/(2K_2)\}\) for \(K_2=4{\bar{\mu }}C_SL(\phi , {\bar{\xi }})\). Then for \(h<H\le h^*:=\min \{h_*, h'(\xi /2)\}\), we have \(\phi ^h\in B(\phi ,\xi /2)\cap X_h\) and \(\phi ^H\in B(\phi ,\xi /2)\cap X_H\), and from (13), (5), the mean-value theorem and (H1), we get

$$\begin{aligned} \Vert \phi ^h-\phi _h\Vert _X&\le K_2\Vert \phi ^h-\phi ^H\Vert _X^2\le \frac{1}{2}\Vert \phi ^h-\phi ^H\Vert _X\le \frac{\xi }{2}. \end{aligned}$$

Hence \(\phi _h\) belongs to \(B(\phi ^h,\xi /2)\).

Finally, for the proof of the estimate (12), we use the triangle inequality

$$\begin{aligned} \Vert \phi -\phi _h\Vert _X\le \Vert \phi -\phi ^h\Vert _X+\Vert \phi ^h-\phi _h\Vert _X. \end{aligned}$$

Then, the result follows from Theorem 2.1 and (11). \(\square \)

Remark 2.6

Theorem 2.5 implies that convergence results in terms of the fine scale h is retained under proper relation between fine grid h and coarse grid H. Existing results [15] show that the relation is typically given by \(h={{{\mathcal {O}}}}(H^\alpha )\) for some \(\alpha \) with \(1<\alpha \le 2\). The theorem indicates the best choice \(\alpha =2\), i.e., \(h={{{\mathcal {O}}}}(H^2)\) is guaranteed as long as the discrete solution operator \(S_h\) is convergent with some positive convergence rate, that is, there is some positive \(\beta \) satisfying

$$\begin{aligned} \Vert (S-S_h)G(\nu ,\phi (\nu ))\Vert _{{\mathscr {X}}}={\mathcal {O}}(h^\beta ). \end{aligned}$$

Now, we introduce multi-level version of the two-grid algorithm. For \(j\ge 0\), let \(h_j\le {\bar{h}}\) given in Theorem 2.1 be mesh sizes, \(X_j:=X_{h_j}\) finite element spaces such that \(X_j\subset X\), and \(F_j(\nu ,\phi ):=\phi +S_jG(\nu ,\phi )\) where an operator \(S_j:=S_{h_j}\) is intended to approximate the linear operator S. Nested relations, \(X_{j-1}\subset X_j\) (\(j\ge 1\)), are assumed to guarantee that \(\phi _{j-1}(\nu ) \in X_j\).

Algorithm 2: Multi-level Algorithm

      Step 1: Solve nonlinear system on initial mesh

         Find \(\phi _0(\nu )\in X_0\) such that \(F_0(\nu ,\phi _0(\nu ))=0\).

      Step 2: Update on each mesh level j with one Newton iteration

         For \(j = 1, 2, \cdots \), find \(\phi _j(\nu )\in X_j\) such that \(D_{\phi }F_j(\nu ,\phi _{j-1}(\nu ))(\phi _j(\nu )-\phi _{j-1}(\nu ))=-F_j(\nu ,\phi _{j-1}(\nu )).\)

Theorem 2.7

Assume that (H1)–(H4) hold. Let \(\phi ^{h_j}(\nu )\) be nonsingular solutions of (2) on \(X_j\). And let \(\phi _j(\nu )\) be solutions obtained from Step 2 of Algorithm 2. Then there exist \(\xi >0\) with \(\xi \le {\bar{\xi }}\) given in Theorem 2.1 and \(h^*>0\) such that for \(h_j\le h^*\), \(\phi ^{h_j}(\nu )\in B(\phi (\nu ),{\xi }/2)\) and \(\phi _j(\nu )\) belongs to the ball \(B(\phi ^{h_j}(\nu ),\xi /2)\) for \(j\ge 1\). Moreover, for the positive constant \(K_2=4{\bar{\mu }}C_SL({\bar{\xi }})\) independent of mesh sizes \(h_j\) and \(\nu \), we have the following quadratic relation

$$\begin{aligned} \Vert \phi ^{h_j}(\nu )-\phi _j(\nu )\Vert _X\le K_2 \Vert \phi ^{h_j}(\nu )-\phi _{j-1}(\nu )\Vert _X^2, \end{aligned}$$
(14)

and for \(K_4:=\max \{K_1,2K_1^2K_2, 2K_2\}\), we have an a priori estimate

$$\begin{aligned}&\Vert \phi (\nu )-\phi _j(\nu )\Vert _X\le K_4\big (\Vert (S-S_j)G(\nu ,\phi (\nu ))\Vert _{{\mathscr {X}}}+\Vert (S-S_j)G(\nu ,\phi (\nu ))\Vert _{{\mathscr {X}}}^2\nonumber \\&\quad +\Vert \phi (\nu )-\phi _{j-1}(\nu )\Vert _X^2\big ). \end{aligned}$$
(15)

Proof

For simplicity, \(\nu \) is dropped from the functions \(\phi (\nu ),\) etc. It follows from Lemma 2.2 and Remark 2.3 that there exist two values \(\xi _*=\xi _*(\phi )\) and \(h_*=h_*(\phi )\) such that for all \(\xi \) and \(h_j\) with \(0<\xi \le \xi _*\) and \(0<h_j\le h_*\), respectively, \(D_\phi F_j(\nu ,\psi )\) is an isomorphism from X onto \({\mathscr {X}}\) for all \(\psi \in B(\phi ,\xi )\) with the bound

$$\begin{aligned} \Vert D_\phi F_j(\nu ,\psi )^{-1}\Vert _{{\mathscr {X}};X}\le 2\Vert D_{\phi }F_j(\nu ,\phi )^{-1}\Vert _{{\mathscr {X}};X} \le 4\Vert D_{\phi }F(\nu ,\phi )^{-1}\Vert _{{\mathscr {X}};X}\le 4{\bar{\mu }}. \end{aligned}$$
(16)

Also, Theorem 2.1 asserts that for a given \(\xi \) with \(0<\xi \le {\bar{\xi }}\) there exists \(h'=h'(\xi /2)~(\le {\bar{h}})\) such that for \(h_j\le h'\), the solution \(\phi ^{h_j}\) belongs to \(B(\phi ,\xi /2)\cap X_j\) for \(j=0,1,2,\cdots \).

Now we prove that for a constant \(\xi \) with \(0<\xi \le \xi _*\) given above, the solution \(\phi _j\,(j\ge 1)\) obtained from Step 2 of Algorithm 2 belongs to \(B(\phi ^{h_j},\xi /2).\) We proceed by induction on j: For \(j=1\) it holds by Theorem 2.5. Suppose that \(\phi _{j-1}\in B(\phi ^{h_{j-1}},\xi /2)\), then since \(B(\phi ^{h_{j-1}},\xi /2)\subset B(\phi ,\xi )\), \(D_{\phi }F_j(\nu ,\phi _{j-1})^{-1}\) exists. Algorithm 2 and the integral form of the mean value theorem yield

$$\begin{aligned} \phi _{j}-\phi ^{h_j}= & {} \phi _{j-1}-\phi ^{h_j}+D_{\phi }F_{j}(\nu ,\phi _{j-1})^{-1}[F_{j}(\nu ,\phi ^{h_j})-F_{j}(\nu ,\phi _{j-1})]\\= & {} D_{\phi }F_{j}(\nu ,\phi _{j-1})^{-1}\left[ F_{j}(\nu ,\phi ^{h_j})-F_{j}(\nu ,\phi _{j-1})-D_{\phi }F_{j}(\nu ,\phi _{j-1})(\phi ^{h_j}-\phi _{j-1})\right] \\= & {} D_{\phi }F_{j}(\nu ,\phi _{j-1})^{-1}\int _0^1\left[ D_{\phi }F_{j}(\nu ,\phi _{j-1}+t(\phi ^{h_j}-\phi _{j-1}))\right. \\&\quad \left. -D_{\phi }F_{j}(\nu ,\phi _{j-1})\right] (\phi ^{h_j}-\phi _{j-1})\,dt\\= & {} D_{\phi }F_{j}(\nu ,\phi _{j-1})^{-1}\int _0^1S_{j}\left[ D_{\phi }G(\nu ,\phi _{j-1}+t(\phi ^{h_j}-\phi _{j-1}))\right. \\&\quad \left. -D_{\phi }G(\nu ,\phi _{j-1})\right] (\phi ^{h_j}-\phi _{j-1})\,dt. \end{aligned}$$

We take \(\xi :=\min \{{\bar{\xi }}, \xi _*, 2/(9K_2)\}\) for \(K_2=4{\bar{\mu }}C_SL(\phi ,{\bar{\xi }})\). Then for \(h_j\le h_{j-1}\le h^*=\min \{h_*,h'(\xi /2)\}\), we have \(\phi ^{h_j}\in B(\phi ,{\xi }/2)\cap X_j\), \(\phi ^{h_{j-1}}\in B(\phi ,{\xi }/2)\cap X_{j-1}\) and from the estimates (16) and (5), the mean-value theorem and (H1), we get

$$\begin{aligned} \Vert \phi ^{h_j}-\phi _{j}\Vert _X&\le K_2 \Vert \phi ^{h_j}-\phi _{j-1}\Vert _X^2. \end{aligned}$$
(17)

Since \(\phi _{j-1}\in B(\phi ^{h_{j-1}},\xi /2)\) and \(\phi ^{h_j}, \phi ^{h_{j-1}}\in B(\phi ,{\xi }/2)\), we have

$$\begin{aligned} \Vert \phi ^{h_j}-\phi _{j-1}\Vert _X\le \frac{3}{2}\xi , \end{aligned}$$

which yields with (17) that

$$\begin{aligned} \Vert \phi ^{h_j}-\phi _{j}\Vert _X\le \frac{1}{3}\Vert \phi ^{h_j}-\phi _{j-1}\Vert _X\le \frac{1}{2}\xi . \end{aligned}$$

Hence, \(\phi _j\) is in the ball \(B(\phi ^{h_j},\xi /2)\).

Finally, the estimate (15) follows from (3) together with (14) and the triangle inequality

$$\begin{aligned} \Vert \phi -\phi _{j}\Vert _X\le \Vert \phi -\phi ^{h_j}\Vert _X+\Vert \phi ^{h_j}-\phi _{j}\Vert _X. \end{aligned}$$

\(\square \)

Remark 2.8

Theorem 2.7 implies that optimal mesh relation between two adjacent levels is given by \(h_j={{{\mathcal {O}}}}(h_{j-1}^2)\) as in the previous remark.

Remark 2.9

All the theories mentioned in this section hold true under the Lipschitz-continuity of \(D_\phi G\) instead of (H1); there exists a function \(\xi \rightarrow L(\xi ):\mathbb R_+\rightarrow {\mathbb {R}}_+\), locally bounded, such that for all \(\psi \in {B(\phi (\nu ),\xi )}\), the ball with center \(\phi (\nu )\) and radius \(\xi \), and for all \(\nu \in \Lambda \)

$$\begin{aligned} \Vert D_\phi G(\nu ,\phi (\nu ))-D_\phi G(\nu ,\psi )\Vert _{X;Y}\le L(\xi )\Vert \phi (\nu )-\psi \Vert _X. \end{aligned}$$

3 Application to the Navier–Stokes Equations

In this section, we apply two-grid/multi-level algorithms to stationary NSE. First, we introduce some notations and function spaces. Let \({\mathrm{div}}\) and \(\nabla \) denote the divergence and gradient operators, respectively. For \(\varvec{v}=(v_1,\cdots ,v_d)\in {\mathbb {R}}^d\) and \(\varvec{\tau }=(\tau _{ij})_{d\times d}\in {\mathbb {R}}^{d\times d}\), let \(\varvec{\tau }_i=(\tau _{i1},\cdots ,\tau _{id})\) denote its ith row for \(i=1,\cdots ,d\) and define

$$\begin{aligned} \begin{aligned} {\mathbf{div}}\varvec{\tau }:&=({\mathrm{div}}\varvec{\tau }_1,\cdots ,{\mathrm{div}}\varvec{\tau }_d),&\quad \varvec{v}\cdot \varvec{\tau }:&=(\varvec{v}\cdot \varvec{\tau }_1,\cdots ,\varvec{v}\cdot \varvec{\tau }_d)\\ \nabla \varvec{v}:&=(\nabla v_1,\cdots ,\nabla v_d)^t=\left( \frac{\partial v_i}{\partial x_j}\right) _{d\times d},&\quad \varvec{\delta }:&=d\times d~\text{ unit } \text{ matrix }. \end{aligned} \end{aligned}$$

Given data \(\varvec{f}\) and \(\varvec{g}\), the stationary and incompressible NSE for the unknown velocity \(\varvec{u}\) and the pressure p reads

$$\begin{aligned} -\nu \triangle \varvec{u}+\varvec{u}\cdot \nabla \varvec{u}+\nabla p&=\varvec{f} \quad \text{ in }~\Omega , \end{aligned}$$
(18)
$$\begin{aligned} {\mathrm{div}}\varvec{u}&=0\quad \text{ in }~\Omega , \end{aligned}$$
(19)
$$\begin{aligned} \varvec{u}&=\varvec{g}\quad \text{ on }~\partial \Omega . \end{aligned}$$
(20)

The compatibility conditions read

$$\begin{aligned} \int _{\partial \Omega }\varvec{g} \cdot \varvec{n}\,ds=0\quad \text{ and } \quad \int _{\Omega }p\,d{\varvec{x}}=0, \end{aligned}$$

where \(\varvec{n}\) is the outward unit normal vector to the boundary.

Let \({\mathcal {A}}:{\mathbb {R}}^{d\times d}\rightarrow {\mathbb {R}}^{d\times d}\) be the deviatoric operator

$$\begin{aligned}{\mathcal {A}}\varvec{\tau }:=\varvec{\tau }-\frac{1}{d}(\text{ tr }\varvec{\tau })\varvec{\delta }.\end{aligned}$$

The pseudostress tensor \(\varvec{\sigma }:=\nu \nabla \varvec{u}-p\varvec{\delta }\) allows the pseudostress-velocity formulation of the NSE (18)–(19),

$$\begin{aligned} {\mathcal {A}}\left( \frac{\varvec{\sigma }}{\nu }\right) -\nabla \varvec{u}&=\varvec{0}\quad \text{ in }~\Omega , \end{aligned}$$
(21)
$$\begin{aligned} {\mathbf{div}}\varvec{\sigma }-\varvec{u}\cdot {\mathcal {A}}\left( \frac{\varvec{\sigma }}{\nu }\right)&=-\varvec{f} \quad \text{ in }~\Omega . \end{aligned}$$
(22)

The Eq. (21) is obtained from

$$\begin{aligned}\text{ tr }(\nabla \varvec{u})=\text{ div }\varvec{u}=0\quad \text{ and }\quad \text{ tr }\varvec{\sigma }=-dp.\end{aligned}$$

The compatibility condition \(\int _{\Omega }p\,d{\varvec{x}}=0\) implies

$$\begin{aligned}\int _{\Omega }\text{ tr }\varvec{\sigma }\,d{\varvec{x}}=0.\end{aligned}$$

Scaling pseudostress \(\varvec{\sigma }\) and data \(\varvec{f}\) by \(\varvec{\sigma }/\nu \rightarrow \varvec{\sigma }\) and \(\varvec{f}/\nu \rightarrow \varvec{f}\), (21)–(22) may be rewritten as

$$\begin{aligned} {\mathcal {A}}\varvec{\sigma }-\nabla \varvec{u}&=\varvec{0} \quad \text{ in }~\Omega , \end{aligned}$$
(23)
$$\begin{aligned} {\mathbf{div}}\varvec{\sigma }-\frac{1}{\nu }\varvec{u}\cdot {\mathcal {A}}\varvec{\sigma }&=-\varvec{f}\quad \text{ in }~\Omega . \end{aligned}$$
(24)

Assume that \(\Omega \) is a bounded, open, connected subset of \({\mathbb {R}}^d\) (\(d=2\) or 3) with the boundary \(\partial \Omega \) of class \(C^2\) or convex polygon/polyhedra. We use the standard notations and definitions for the Sobolev spaces \(W^{r,p}(\Omega )\) and \(W^{r,p}(\partial \Omega )\) for \(r\ge 0\) and \(p\in [1,\infty ]\). The standard associated norms are denoted by \(\Vert \cdot \Vert _{r,p}:=\Vert \cdot \Vert _{r,p,\Omega } \) and \(\Vert \cdot \Vert _{r,p,\partial \Omega }\). For \(r=0\), \(W^{r,p}(\Omega )\) coincides with \(L^p(\Omega )\). Moreover, the space \(W^{r,2}(\Omega )\) will be written in the form \(H^r(\Omega )\). Let

$$\begin{aligned} H({\mathrm{div}};\Omega )^d&=\{{\varvec{\tau }}\in L^2(\Omega )^{d\times d} :{\mathbf{div}}{\varvec{\tau }}\in L^2(\Omega )^d \},\\ W^{0,3}({\mathrm{div}};\Omega )^d&=\{{\varvec{\tau }}\in W^{0,3}(\Omega )^{d\times d} :{\mathbf{div}}{\varvec{\tau }}\in W^{0,3}(\Omega )^d \},\\ L_0^2(\Omega )&=\left\{ q\in L^2(\Omega ) \,\Big | \int _{\Omega } q\,d{\varvec{x}}=0\right\} . \end{aligned}$$

We define spaces

$$\begin{aligned} {\widehat{H}}({\mathrm{div}};\Omega )^d&=\{\varvec{\tau }\in H({\mathrm{div}};\Omega )^{d}\,|\,\int _{\Omega }\text{ tr }\varvec{\tau }\,d\varvec{x}=0 \},\\ {\widehat{W}}^{0,3}({\mathrm{div}};\Omega )^d&=\{\varvec{\tau }\in W^{0,3}({\mathrm{div}};\Omega )^d\,|\, \int _{\Omega }\text{ tr }\varvec{\tau }\,d\varvec{x}=0\},\\ {{\widehat{H}}^{r}(\Omega )^{d\times d}}&=\{\varvec{\tau }\in H^{r}(\Omega )^{d\times d}\,|\, \int _{\Omega }\text{ tr }\varvec{\tau }\,d\varvec{x}=0\}. \end{aligned}$$

Then, the mixed variational problem of the pseudostress-velocity formulation (23)–(24) is to find \((\varvec{\sigma },\varvec{u})\in {\widehat{W}}^{0,3}({\mathrm{div}};\Omega )^d\times L^2(\Omega )^d\) such that

$$\begin{aligned} ({\mathcal {A}}\varvec{\sigma },\varvec{\tau })+({\mathbf{div}}\varvec{\tau },\varvec{u})&=g_*(\varvec{\tau }) \quad \forall \varvec{\tau }\in {\widehat{H}}({\mathrm{div}};\Omega )^d, \end{aligned}$$
(25)
$$\begin{aligned} ({\mathbf{div}}\varvec{\sigma },\varvec{v})-\frac{1}{\nu }(\varvec{u}\cdot {\mathcal {A}}\varvec{\sigma },\varvec{v})&=f_*(\varvec{v})\quad \forall \varvec{v}\in L^2(\Omega )^d, \end{aligned}$$
(26)

where \((\cdot , \cdot )\) denotes the \(L^2(\Omega )^{d\times d}\) or \(L^2(\Omega )^{d}\) inner product, \(g_*(\varvec{\tau })=\int _{\partial \Omega }(\varvec{n}\cdot \varvec{\tau })\cdot \varvec{g}\, d{s}\) and \(f_*(\varvec{v})=-\int _{\Omega }\varvec{f}\cdot \varvec{v}\,d\varvec{x}\). Note that if we use the trial space \({\widehat{W}}^{0,3}({\mathrm{div}};\Omega )^d\times L^2(\Omega )^d\), then we can guarantee the nonlinear term, \(\varvec{u}\cdot {\mathcal {A}}\varvec{\sigma }\), to be in \(L^2(\Omega )^d\) from the imbedding theorem (see Remark 2.1 in [6]). The well-posedness and uniqueness of system (25)–(26) is established by the following well known theorem (see [6]).

Theorem 3.1

Let \(\varvec{f}\) be in \(L^2(\Omega )^d\). For \(\varvec{g}\in H^{3/2}(\partial \Omega )^d\), system (25)–(26) has solutions, \(({\varvec{\sigma }},{\varvec{u}})\), belonging to \(H^{1}(\Omega )^{d\times d}\times H^{2}(\Omega )^d\). Moreover, if \(\nu > \nu _0(\Omega ;\varvec{f},\varvec{g})\) (see [6] or [13] for specific value of \(\nu _0\)), then the solution \(({\varvec{\sigma }},{\varvec{u}})\) is unique.

3.1 A Priori Error Analysis for the NSE

The a priori error for the problem (25)–(26) was analyzed in [21]. The following details are only given for the reader’s convenience. We let

$$\begin{aligned} G(\nu ,\phi ):=\left( -\varvec{g},-\frac{1}{\nu }\varvec{u}\cdot {\mathcal {A}}\varvec{\sigma }+\varvec{f}\right) \end{aligned}$$

for \(\phi :=(\varvec{\sigma },\varvec{u})\). We set

$$\begin{aligned} Y:=H^{3/2}(\partial \Omega )^d\times L^2(\Omega )^d. \end{aligned}$$

Note that \(\varvec{f}\in L^2(\Omega )^d\) and \(\varvec{g}\in H^{3/2}(\partial \Omega )^d\) guarantee that the exact solution \((\varvec{\sigma },\varvec{u})\) of the problem (25)–(26) is in \(H^1(\Omega )^{d\times d}\times H^2(\Omega )^d\) (see [6]). Let \(\Lambda \subset (0,\infty )\) be a compact interval. We consider \(Z=Y\). Then condition (H4) implies condition (H3). Note the Fréchet derivative of G at \((\nu ,\phi )=(\nu ,(\varvec{\sigma },\varvec{u}))\) is

$$\begin{aligned} D_{\phi }G(\nu ,\phi )(\psi )=\left( \varvec{0},-\frac{1}{\nu }(\varvec{u}\cdot {\mathcal {A}}\varvec{\tau }+\varvec{v}\cdot {\mathcal {A}}\varvec{\sigma } )\right) \end{aligned}$$

for any \(\psi =(\varvec{\tau },\varvec{v})\). From the definition, G belongs to \({\mathcal {C}}^2\)-class clearly. So, the hypothesis (H1) holds. To guarantee that \(D_{\phi }G(\nu ,\phi )(\psi )\in Z\) for all \(\psi \), we let

$$\begin{aligned} X:=L^3(\Omega )^{d\times d}\times L^6(\Omega )^d. \end{aligned}$$

In fact, note that for any \(\psi =(\varvec{\tau },\varvec{v})\in X\),

$$\begin{aligned} \Vert \varvec{u}\cdot {\mathcal {A}}\varvec{\tau }+\varvec{v}\cdot {\mathcal {A}}\varvec{\sigma } \Vert _{0,2}&\le \Vert \varvec{u}\cdot {\mathcal {A}}\varvec{\tau }\Vert _{0,2}+\Vert \varvec{v}\cdot {\mathcal {A}}\varvec{\sigma } \Vert _{0,2}\\&\le \Vert \varvec{u}\Vert _{0,6}\Vert \varvec{\tau }\Vert _{0,3}+\Vert \varvec{v}\Vert _{0,6}\Vert \varvec{\sigma }\Vert _{0,3}\\&\le (\Vert \varvec{\sigma }\Vert _{0,3}^2+\Vert \varvec{u}\Vert _{0,6}^2)^{1/2}(\Vert \varvec{\tau }\Vert _{0,3}^2+\Vert \varvec{v}\Vert _{0,6}^2)^{1/2}. \end{aligned}$$

Here we used the triangle inequality and the Hölder inequality.

So, the hypothesis (H2) holds.

We let

$$\begin{aligned} {\mathscr {X}}:=L^2(\Omega )^{d\times d}\times L^2(\Omega )^d \end{aligned}$$

and introduce the linear solution operator

$$\begin{aligned} S: Y&\rightarrow {\mathscr {X}} \\ (\varvec{g}^*,\varvec{f}^*)&\longrightarrow S(\varvec{g}^*,\varvec{f}^*)=(\varvec{\sigma },\varvec{u}) \end{aligned}$$

defined by the solution of

$$\begin{aligned} ({\mathcal {A}}\varvec{\sigma },\varvec{\tau })+({\mathbf{div}}\varvec{\tau }, \varvec{u})&=\int _{\partial \Omega } (\varvec{n}\cdot \varvec{\tau })\cdot \varvec{g}^*\, ds \quad \forall \varvec{\tau }\in {\widehat{H}}({\mathrm{div}};\Omega )^d, \end{aligned}$$
(27)
$$\begin{aligned} ({\mathbf{div}}\varvec{\sigma },\varvec{v})&=-\int _{\Omega }\varvec{f}^*\cdot \varvec{v}\,d\varvec{x}\quad \forall \varvec{v}\in L^2(\Omega )^d. \end{aligned}$$
(28)

The following result is well-known (see Lemma 5.1 in [6]).

Lemma 3.2

For any \((\varvec{g}^*,\varvec{f}^*)\in Y\), Stokes equations (27)–(28) have a unique solution \((\varvec{\sigma },\varvec{u})=S(\varvec{g}^*,\varvec{f}^*)\) which is in \(H^1(\Omega )^{d\times d}\times H^2(\Omega )^d \hookrightarrow {\mathscr {X}}\).

Clearly, the pair \((\nu ,\phi )\in \Lambda \times X\) is a solution of

$$\begin{aligned} F(\nu ,\phi ):=\phi +SG(\nu ,\phi )=0 \end{aligned}$$
(29)

if and only if the function \(\phi =(\varvec{\sigma },\varvec{u})\) is a solution of (25)–(26).

Theorem 3.3

For any \((\varvec{g},\varvec{f})\in Y\), there is a \(\nu _0(\Omega ;\varvec{f},\varvec{g})\) such that if \(\nu >\nu _0(\Omega ;\varvec{f},\varvec{g})\), where \(\nu _0\) is the fixed value stated in Theorem 3.1, then the nonlinear problem (29) has a branch of nonsingular solutions.

Proof

See [21]. \(\square \)

We now consider mixed finite element approximations. Let \(\{{\mathcal {T}}_h\}\) be a family of shape-regular triangulations of \({\overline{\Omega }}\) by triangles (\(d=2\)) and tetrahedra (\(d=3\)), respectively, of diameter \(h_T\), and let \(h:=\max _{T\in {\mathcal {T}}_h}h_T\) denote the mesh size. Associated with \({\mathcal {T}}_h\), we define a finite subspace of X by

$$\begin{aligned} X_h:={\widehat{RT}}_k^d({\mathcal {T}}_h)\times P_k^d({\mathcal {T}}_h), \end{aligned}$$

where \(P_k^d:=\Pi _{i=1}^dP_k\) (i.e., product space of polynomials of degree k) and

$$\begin{aligned} \widehat{RT_k^d}({\mathcal {T}}_h):=\left\{ \varvec{\tau }\in RT_k^d({\mathcal {T}}_h) \,\big |\, \int _{\Omega } \text{ tr }\varvec{\tau }\,d\varvec{x}=0 \right\} \end{aligned}$$

for \(RT_k^d({\mathcal {T}}_h):=\Pi _{i=1}^dRT_k({\mathcal {T}}_h)\) (i.e., product space of Raviart–Thomas space of order k).

For short notation on generic constants C independent of the local or global mesh-sizes, for any expressions A and B,

$$\begin{aligned} A\lesssim B\quad \mathrm{abbreviates}\quad A\le CB. \end{aligned}$$

We introduce an interpolation operator over the space W, where \(W:={\widehat{H}}^1(\Omega )^{d\times d}\) or \(W:={\widehat{H}}(\text{ div };\Omega )^d\cap L^s(\Omega )^{d\times d}\) for some \(s>2\). Let \({\tilde{\Pi }}_h\) denote Raviart–Thomas interpolation operator (see [3]) associated with the degrees of freedom onto \(\widehat{RT_k^d}({\mathcal {T}}_h)+{ span}\{\varvec{\delta }\}\). We define \(\Pi _h:W\rightarrow \widehat{RT_k^d}({\mathcal {T}}_h)\) by

$$\begin{aligned} \Pi _h{\varvec{\tau }}={\tilde{\Pi }}_h{\varvec{\tau }}-\frac{\int _\Omega (\mathrm{tr}{\tilde{\Pi }}_h{\varvec{\tau }})d{\varvec{x}}}{d|\Omega |}\varvec{\delta }, {\quad \forall {\varvec{\tau }}\in W,} \end{aligned}$$
(30)

where \(|\Omega |=\int _\Omega d{\varvec{x}}\). We notice that \(\int _\Omega (\mathrm{tr}\Pi _h{\varvec{\tau }}) d{\varvec{x}}=0\). Let \({\varvec{P}}_h\) be the \(L^2\)-projection onto \(P_k^d({\mathcal {T}}_h)\) with the well known approximation property:

$$\begin{aligned} \Vert {\varvec{P}}_h{\varvec{v}}-{\varvec{v}}\Vert _{0,p,T}\lesssim h_T^r|{\varvec{v}}|_{r,p,T},\quad 0\le r\le k+1,\quad \forall {\varvec{v}}\in W^{r,p}(T)^d. \end{aligned}$$
(31)

Then the following lemma holds (see [5] for details).

Lemma 3.4

The commutative property \({\mathbf{div}}\Pi _h={\varvec{P}}_h{\mathbf{div}}\) holds. Furthermore,

we have

$$\begin{aligned} \Vert {\varvec{\tau }}-\Pi _h{\varvec{\tau }}\Vert _{0,2}&\lesssim h^r|{\varvec{\tau }}|_{r,2},\quad \mathrm{for}\quad 1\le r\le k+1, \end{aligned}$$
(32)
$$\begin{aligned} \Vert {\mathbf{div}}{\varvec{\tau }}-{\mathbf{div}}(\Pi _h{\varvec{\tau }})\Vert _{0,2}&\lesssim h^r |{\mathbf{div}}{\varvec{\tau }}|_{r,2},\quad \mathrm{for}\quad 0\le r\le k+1. \end{aligned}$$
(33)

The mixed finite element method for approximation of the solution of (25)–(26) is defined by finding \((\varvec{\sigma }^h,\varvec{u}^h)\in X_h\) such that

$$\begin{aligned} ({\mathcal {A}}\varvec{\sigma }^h,\varvec{\tau })+(\varvec{u}^h,{\mathbf{div}}\varvec{\tau })&=g_*(\varvec{\tau }) \quad \forall \varvec{\tau }\in {\widehat{RT}}_k^d({\mathcal {T}}_h), \end{aligned}$$
(34)
$$\begin{aligned} ({\mathbf{div}}\varvec{\sigma }^h,\varvec{v})-(\frac{1}{\nu }\varvec{u}^h\cdot {\mathcal {A}}\varvec{\sigma }^h,\varvec{v})&=f_*(\varvec{v})\quad \forall \varvec{v}\in P_k^d({\mathcal {T}}_h). \end{aligned}$$
(35)

Next, we define a discrete solution operator \(S_h:Y\rightarrow X_h\) by

$$\begin{aligned} S_h(\varvec{g}^*,\varvec{f}^*)=(\varvec{\sigma }_h,\varvec{u}_h)\in X_h, \end{aligned}$$

where \((\varvec{\sigma }_h,\varvec{u}_h)\in X_h\) is the solution of the discrete counterpart of (27)–(28):

$$\begin{aligned} ({\mathcal {A}}\varvec{\sigma }_h,\varvec{\tau })+(\varvec{u}_h,{\mathbf{div}}\varvec{\tau })&=\int _{\partial \Omega } (\varvec{n}\cdot \varvec{\tau })\cdot \varvec{g}^*\, ds \quad \forall \varvec{\tau }\in {\widehat{RT}}_k^d({\mathcal {T}}_h), \end{aligned}$$
(36)
$$\begin{aligned} ({\mathbf{div}}\varvec{\sigma }_h,\varvec{v})&=-\int _{\Omega }\varvec{f}^*\cdot \varvec{v}\,d\varvec{x}\quad \forall \varvec{v}\in P_k^d({\mathcal {T}}_h). \end{aligned}$$
(37)

Now, the discrete nonlinear problem of (29) is to find \(\phi ^h=(\varvec{\sigma }^h,\varvec{u}^h)\) in \(X_h\) such that

$$\begin{aligned} F_h(\nu ,\phi ^h):=\phi ^h+S_hG(\nu ,\phi ^h)=0. \end{aligned}$$
(38)

Note that \(\phi ^h=(\varvec{\sigma }^h,\varvec{u}^h)\) is the solution of (38) if and only if \(\phi ^h\in X_h\) satisfies (34)–(35).

Theorem 3.5

For any \((\varvec{g}^*,\varvec{f}^*)\in Y\), let \((\varvec{\sigma },\varvec{u})=S(\varvec{g}^*,\varvec{f}^*)\) and \((\varvec{\sigma }_h,\varvec{u}_h)=S_h(\varvec{g}^*,\varvec{f}^*)\) be the solution of (27)–(28) and (36)–(37), respectively. Then

$$\begin{aligned} \lim _{h\rightarrow 0}\Vert S-S_h\Vert _{Z;{\mathscr {X}}}=0. \end{aligned}$$

Moreover, assume that \((\varvec{\sigma },\varvec{u})\in H^r(\Omega )^{d\times d}\times H^r(\Omega )^d\) for \(1\le r\le k+1\); then we have

$$\begin{aligned} \Vert S(\varvec{g}^*,\varvec{f}^*)-S_h(\varvec{g}^*,\varvec{f}^*)\Vert _{{\mathscr {X}}}&\lesssim h^{r}(\Vert \varvec{\sigma }\Vert _{r,2}+\Vert \varvec{u}\Vert _{r,2})\\&\lesssim h^{r}\left( \Vert \varvec{f}^*\Vert _{r-2,2}+\Vert \varvec{g}^*\Vert _{r-\frac{1}{2},2,\partial \Omega }\right) . \end{aligned}$$

Proof

See [21]. \(\square \)

Theorem 3.5 implies that the hypotheses (H3) and (H4) hold. We get the following theorem.

Theorem 3.6

Let \((\varvec{g},\varvec{f})\in Y\). Assume that \(\nu >\nu _0(\Omega ;\varvec{f},\varvec{g})\), where \(\nu _0\) is the fixed value stated in Theorem 3.3, and that \((\nu ,\phi (\nu ))=(\nu ,(\varvec{\sigma }(\nu ),\varvec{u}(\nu )))\) is a branch of nonsingular solutions of (29). Then for h sufficiently small, there exists a neighborhood \({\mathcal {O}}\) of the origin in X and a unique \({\mathcal {C}}^2\)-function \(\nu \rightarrow \phi ^h(\nu )\in X_h\) such that \(\{ (\nu ,\phi ^h(\nu )) : \nu \in \Lambda \}\) is a branch of nonsingular solutions of (38) and that \(\phi (\nu )-\phi ^h(\nu )\in {\mathcal {O}}\).

Moreover, assume that \((\varvec{\sigma }(\nu ),\varvec{u}(\nu ))\in H^r(\Omega )^{d\times d}\times H^r(\Omega )^d\) for \(1\le r\le k+1\). Then we have

$$\begin{aligned}&\Vert \phi (\nu )-\phi ^h(\nu )\Vert _X:=\Vert \varvec{\sigma }(\nu )-\varvec{\sigma }^h(\nu )\Vert _{0,3}\nonumber \\&\quad +\Vert \varvec{u}(\nu )-\varvec{u}^h(\nu )\Vert _{0,6}\lesssim h^{r}(\Vert \varvec{\sigma }(\nu )\Vert _{r,2}+\Vert \varvec{u}(\nu )\Vert _{r,2}), \end{aligned}$$
(39)

where \(\phi ^h(\nu )=(\varvec{\sigma }^h(\nu ),\varvec{u}^h(\nu ))\).

Proof

See [21]. \(\square \)

Now, we apply two-grid/multi-level methods to NSE based on pseudostress-velocity formulation. For simplicity, we omit \(\nu \) from the functions \(\phi (\nu ), ~\phi _h(\nu )\) and \(\phi ^H(\nu )\). Note that the Fréchet derivative of F at \((\nu ,\phi )=(\nu ,(\varvec{\sigma },\varvec{u}))\) is

$$\begin{aligned}D_{\phi }F(\nu ,\phi )(\psi )=(I+SD_{\phi }G(\nu ,\phi ))(\psi )=\psi +S\left( \varvec{0},-\frac{1}{\nu }(\varvec{u}\cdot {\mathcal {A}}\varvec{\tau }+\varvec{v}\cdot {\mathcal {A}}\varvec{\sigma } )\right) \end{aligned}$$

for any \(\psi =(\varvec{\tau },\varvec{v})\in X\). First, we consider two-grid algorithm for NSE.

Algorithm 1: Two-grid Algorithm for NSE

Step 1 (Nonlinear Solve). Find \({\phi ^H\in X_H}\) such that \({\phi ^H+S_HG(\nu ,\phi ^H)=0}\).

Step 2 (Linear Solve). Find \({\phi _h\in X_h}\) such that \({(I+S_hD_{\phi }G(\nu ,\phi ^H))(\phi _h-\phi ^H)=}\)\({-(\phi ^H+S_hG(\nu ,\phi ^H))}\).

In Step 1, we solve the following nonlinear problem on coarse grid: Find \(\phi ^H=(\varvec{\sigma }^H,\varvec{u}^H)\in X_H\) such that

$$\begin{aligned} ({\mathcal {A}}\varvec{\sigma }^H,\varvec{\tau })+({\mathbf{div}}\varvec{\tau },\varvec{u}^H)&=g_*(\varvec{\tau }) \quad \forall \varvec{\tau }\in {\widehat{RT}}_k^d({\mathcal {T}}_H), \end{aligned}$$
(40)
$$\begin{aligned} ({\mathbf{div}}\varvec{\sigma }^H,\varvec{v})-\frac{1}{\nu }(\varvec{u}^H\cdot {\mathcal {A}}\varvec{\sigma }^H,\varvec{v})&=f_*(\varvec{v})\quad \forall \varvec{v}\in P_k^d({\mathcal {T}}_H). \end{aligned}$$
(41)

We recall \(g_*(\varvec{\tau })=\int _{\partial \Omega }(\varvec{n}\cdot \varvec{\tau })\cdot \varvec{g}\, d\varvec{s}\) and \(f_*(\varvec{v})=-\int _{\Omega }\varvec{f}\cdot \varvec{v}\,d\varvec{x}\).

Using the fact that

$$\begin{aligned} (I+&S_hD_{\phi }G(\nu ,\phi ^H))(\phi _h-\phi ^H)=-(\phi ^H+S_hG(\nu ,\phi ^H))\\ \Leftrightarrow ~&\phi _h=S_h(D_{\phi }G(\nu ,\phi ^H)(\phi ^H)-G(\nu ,\phi ^H)-D_{\phi }G(\nu ,\phi ^H)(\phi _h)), \end{aligned}$$

we arrive at the linearized system in Step 2: Find \(\phi _h=(\varvec{\sigma }_h,\varvec{u}_h)\in X_h\) such that

$$\begin{aligned} ({\mathcal {A}}\varvec{\sigma }_h,\varvec{\tau })+({\mathbf{div}}\varvec{\tau },\varvec{u}_h)&=g_*(\varvec{\tau }) \quad \forall \varvec{\tau }\in {\widehat{RT}}_k^d({\mathcal {T}}_h), \end{aligned}$$
(42)
$$\begin{aligned} ({\mathbf{div}}\varvec{\sigma }_h,\varvec{v})-\frac{1}{\nu }\left( \varvec{u}^H\cdot {\mathcal {A}}\varvec{\sigma }_h+ \varvec{u}_h\cdot {\mathcal {A}}\varvec{\sigma }^H,\varvec{v}\right)&=-\frac{1}{\nu }(\varvec{u}^H\cdot {\mathcal {A}}\varvec{\sigma }^H,\varvec{v}) +f_*(\varvec{v})\quad \forall \varvec{v}\in P_k^d({\mathcal {T}}_h). \end{aligned}$$
(43)

Note that the standard mixed finite element approximation for convection-dominated diffusion problems may produce solutions with spurious oscillations. When the viscosity \(\nu \) is small, we can use a upstream weighting scheme for the mixed finite element method (see e.g. [17, 25]).

Using the results of Theorems 2.5 and 3.6, we can easily obtain the following convergence theorem for Algorithm 1 for the NSE.

Theorem 3.7

Let \((\varvec{g},\varvec{f})\in Y\). Assume that \(\nu >\nu _0(\Omega ;\varvec{f},\varvec{g})\), where \(\nu _0\) is the fixed value stated in Theorem 3.3 and that \((\nu ,\phi (\nu ))=(\nu ,(\varvec{\sigma }(\nu ),\varvec{u}(\nu )))\) is a branch of nonsingular solutions of (29). Let \(\phi ^H(\nu )=(\varvec{\sigma }^H(\nu ),\varvec{u}^H(\nu ))\) be the solution of (40)–(41) and \(\phi _h(\nu )=(\varvec{\sigma }_h(\nu ),\varvec{u}_h(\nu ))\) the solution of (42)–(43). If \((\varvec{\sigma }(\nu ),\varvec{u}(\nu ))\in H^r(\Omega )^{d\times d}\times H^r(\Omega )^d\) for \(1\le r\le k+1\), then we have

$$\begin{aligned} \Vert \varvec{\sigma }(\nu )-\varvec{\sigma }_h(\nu )\Vert _{0,3}+\Vert \varvec{u}(\nu )-\varvec{u}_h(\nu )\Vert _{0,6}\lesssim (H^{2r}+h^{r})(\Vert \varvec{\sigma }(\nu )\Vert _{r,2}+\Vert \varvec{u}(\nu )\Vert _{r,2}). \end{aligned}$$
(44)

This convergence theorem indicates that optimal fine scale convergence is retained under the proper relation between meshes, i.e., \(h={{\mathcal {O}}(H^2)}\).

Next, we consider multi-level algorithm. Let \({{{\mathcal {T}}}}_0={{{\mathcal {T}}}}_H\) and \({{{\mathcal {T}}}}_1={{{\mathcal {T}}}}_h\). We generate a shape-regular triangulation \({{{\mathcal {T}}}}_{j+1}\) from \(\mathcal{T}_{j}~(j\ge 1)\). By replacing H and h in (40)–(41) and (42)–(43) with \(h_{j-1}\) and \(h_j\), respectively, we arrive at Algorithm 2 applied to the NSE. For convenience, we omit \(\nu \) from the functions \(\phi _j(\nu )\).

Algorithm 2: Multi-level Algorithm for NSE

Step 1 (Nonlinear Solve). Find \({\phi _0=(\varvec{\sigma }_0,\varvec{u}_0)\in X_0}\) where \({X_0:={\widehat{RT}}_k^d({\mathcal {T}}_{0})\times P_k^d({\mathcal {T}}_{0})}\) such that

$$\begin{aligned} ({\mathcal {A}}\varvec{\sigma }_0,\varvec{\tau })+({\mathbf{div}}\varvec{\tau },\varvec{u}_0)&=g_*(\varvec{\tau }) \quad \forall \varvec{\tau }\in {\widehat{RT}}_k^d({\mathcal {T}}_{0}), \end{aligned}$$
(45)
$$\begin{aligned} ({\mathbf{div}}\varvec{\sigma }_0,\varvec{v})-\frac{1}{\nu }(\varvec{u}_0\cdot {\mathcal {A}}\varvec{\sigma }_0,\varvec{v})&=f_*(\varvec{v})\quad \forall \varvec{v}\in P_k^d({\mathcal {T}}_{0}), \end{aligned}$$
(46)

Step 2 (Linear Solve). For \(j\ge 1\), find \(\phi _j=(\varvec{\sigma }_j,\varvec{u}_j)\in X_j:={\widehat{RT}}_k^d({\mathcal {T}}_{j})\times P_k^d({\mathcal {T}}_{j})\) such that

$$\begin{aligned}&({\mathcal {A}}\varvec{\sigma }_j,\varvec{\tau })+({\mathbf{div}}\varvec{\tau },\varvec{u}_j)=g_*(\varvec{\tau }) \quad \forall \varvec{\tau }\in {\widehat{RT}}_k^d({\mathcal {T}}_{j}), \end{aligned}$$
(47)
$$\begin{aligned}&({\mathbf{div}}\varvec{\sigma }_j,\varvec{v})-\frac{1}{\nu }\left( \varvec{u}_{j-1}\cdot {\mathcal {A}}\varvec{\sigma }_j+ \varvec{u}_j\cdot {\mathcal {A}}\varvec{\sigma }_{j-1},\varvec{v}\right) \nonumber \\&\quad =-\frac{1}{\nu }(\varvec{u}_{j-1}\cdot {\mathcal {A}}\varvec{\sigma }_{j-1},\varvec{v})+f_*(\varvec{v})\quad \forall \varvec{v}\in P_k^d({\mathcal {T}}_{j}). \end{aligned}$$
(48)

Remark 3.8

It should be noted that the above algorithm can be applied even if \(\mathcal{T}_{j}\) is not a uniform refinement of \({{{\mathcal {T}}}}_{j-1}\). If \(\mathcal{T}_{j}\) with diameter \(h_{j}\) is uniformly refined triangulations from the previous triangulations \({{{\mathcal {T}}}}_{j-1}\) with diameter \(h_{j-1} \), from the results of Theorems 2.7 and 3.6, the following theorem for Algorithm 2 can be easily proved.

Theorem 3.9

Let \((\varvec{g},\varvec{f})\in Y\). Assume that \(\nu >\nu _0(\Omega ;\varvec{f},\varvec{g})\), where \(\nu _0\) is the fixed value stated in Theorem 3.3 and that \((\nu ,\phi (\nu ))=(\nu ,(\varvec{\sigma }(\nu ),\varvec{u}(\nu )))\) is a branch of nonsingular solutions of (29). Let \(\phi _0(\nu )=(\varvec{\sigma }_0(\nu ),\varvec{u}_0(\nu ))\) be the solution of (45)–(46) and \(\phi _j(\nu )=(\varvec{\sigma }_j(\nu ),\varvec{u}_j(\nu )),~(j\ge 1)\) solutions of (47)–(48). If \((\varvec{\sigma }(\nu ),\varvec{u}(\nu ))\in H^r(\Omega )^{d\times d}\times H^r(\Omega )^d\) for \(1\le r\le k+1\), then we have

$$\begin{aligned} \Vert \varvec{\sigma }(\nu )-\varvec{\sigma }_j(\nu )\Vert _{0,3}+\Vert \varvec{u}(\nu )-\varvec{u}_j(\nu )\Vert _{0,6}\lesssim (h_{j-1}^{2r}+h_j^{r})(\Vert \varvec{\sigma }(\nu )\Vert _{r,2}+\Vert \varvec{u}(\nu )\Vert _{r,2}). \end{aligned}$$
(49)

4 Numerical Experiments

In this section, we perform various numerical experiments to validate the theory of the unified framework for multi-level algorithm presented in Sect. 2. In particular, we will numerically check that the multi-level algorithm has the quadratic convergence in the sense of (14): For C independent of \(h_j\)

$$\begin{aligned}\Vert \phi ^{h_j}(\nu )-\phi _j(\nu )\Vert _{X}\le C\Vert \phi ^{h_j}(\nu )-\phi _{j-1}(\nu )\Vert _{X}^2. \end{aligned}$$

For this, we first consider simple semilinear elliptic problems: For \(a=2\) or 3,

$$\begin{aligned} \begin{aligned} -\triangle \phi +\phi ^a&=f\quad \mathrm{in}~~\Omega ,\\ \phi&=0\quad \mathrm{on}~~\partial \Omega , \end{aligned} \end{aligned}$$
(50)

whose weak formulation is given as: Find \(\phi \in X:=H^1_0(\Omega )\) such that

$$\begin{aligned} (\nabla \phi , \nabla \psi ) +(\phi ^a, \psi ) =0,\quad \forall \psi \in X. \end{aligned}$$

Let \(X_j\subset H^1_0(\Omega )\) be a conforming finite element space of piecewise polynomials of degree \(k\ge 1\) defined on the triangulation \({{{\mathcal {T}}}}_j\). By inspecting [20] the reader can verify the fact that this semilinear problem fits into multi-level algorithm and satisfies the theories introduced in Sect. 2. There, it is proved that the results of Theorem 2.7 hold for \(L^2\)-norm as well as \(H^1\)-norm. Moreover, we see that if the exact solution \(\phi \) is in \(W^2_p(\Omega )\) with \(p>2\) for \(d=2\) and \(p=\frac{12}{5}\) for \(d=3\), then the following result holds:

$$\begin{aligned} \Vert \phi ^{h_j}(\nu )-\phi _j(\nu )\Vert _{1,2}\le C\Vert \phi ^{h_j}(\nu )-\phi _{j-1}(\nu )\Vert _{0,p}^2 ~\left( \le C h_{j-1}^4\right) , \end{aligned}$$
(51)

which shows improvement \({{\mathcal {O}}}(h^{4}_{j-1})\) compared with the result \({{\mathcal {O}}}(h^{2}_{j-1})\) via (14).

Example 4.1

We consider the semilinear problem (50) with \(\Omega =(0,1)\times (0,1)\) and the function f determined from the exact solution

$$\begin{aligned} \phi (x,y)=\sin (\pi x)\sin (\pi y). \end{aligned}$$

We present the various numerical results for \(L^2\)-norm and \(H^1\)-norm in Tables 1 and 2. We recall that \(\phi _j\in X_j\) are solutions obtained by Algorithm 2 and \(\phi ^{h_j}\in X_j\) are nonsingular solutions of (2) computed by a nonlinear solver (NLS). The NLS directly find the approximate solution of the given nonlinear problems by using Newton’s algorithm. Here, we let \(X_j\) be a conforming finite element space of piecewise polynomials of degree \(k=1\). Table 1 displays various errors, and order of convergence (CO) for relevant errors, and run-times for the Algorithm 2 (multi-RT) and run-times for the NLS (NLS-RT). From Table 1 we see that multi-RT for achieving the comparable accuracy of approximate solutions is shorter than NLS-RT, which shows that the multi-level algorithm outperforms NLS. Also, the computed order of convergence for the multi-level algorithm listed in Table 1 supports the theory developed in this paper. Table 2 shows the result when \(p=4\) as a typical outcome of our other (undisplayed) numerical experiments: The ratios \(\displaystyle \frac{\Vert \phi ^{h_j}-\phi _{j}\Vert _{0,2}}{\Vert \phi ^{h_j}-\phi _{j-1}\Vert _{0,2}^2}\) and \(\displaystyle \frac{\Vert \phi ^{h_j}-\phi _{j}\Vert _{1,2}}{\Vert \phi ^{h_j}-\phi _{j-1}\Vert _{0,p}^2}\) are almost constant and thus validate that the multi-level algorithm has the quadratic convergence in the sense of (14) and (51), respectively.

Table 1 Convergence order (CO) and run-times (multi-RT & NLS-RT) for Example 4.1
Table 2 Quadratic convergence in the sense of (14) and (51) for Example 4.1

Next, we consider the Navier–Stokes equations (18)–(20) with the exact solution \(\phi =({\varvec{\sigma }},{\varvec{u}})\) and the finite element spaces \(X_j:={\widehat{RT}}_0^2({\mathcal {T}}_j)\times P_0^2({\mathcal {T}}_j)\).

Example 4.2

We consider NSE (18)–(20) with \(\Omega =(0,1)\times (0,1)\) and \(\nu =1\). The exact solutions \({\varvec{u}}=(u_1, u_2)\) and p

$$\begin{aligned} u_1(x,y)&=2\pi \cos (\pi y)\sin (\pi x)^2\sin (\pi y),\\ u_2(x,y)&=-2\pi \cos (\pi x)\sin (\pi x)\sin (\pi y)^2,\\ p(x,y)&=\cos (\pi x)\cos (\pi y) \end{aligned}$$

specify data \({\varvec{f}}\) and \(\varvec{g}\).

Table 3 Convergence order, quadratic convergence in the sense of (14), and run-times for Example 4.2

Table 3, as in Tables 1 and 2, shows that the multi-level algorithm has the optimal order of convergence, the quadratic convergence in the sense of (14) and shorter run-time than NLS.

Example 4.3

(A solution with a boundary layer). We consider NSE (18)–(20) with \(\Omega =(0,1)\times (0,1)\) and \(\nu =1\). The exact solutions \({\varvec{u}}=(u_1, u_2)\) and p

$$\begin{aligned} u_1(x,y)&= \frac{1}{53793}x^2y(x - 1)^2(y - 1)(4y - 10y + 10y^2 - 2)e^{10(x + y)},\\ u_2(x,y)&= -\frac{1}{53793}xy^2(x - 1)(y - 1)^2(4x - 10x +10x^2 - 2)e^{10(x + y)},\\ p(x,y)&=\cos (\pi x)\cos (\pi y). \end{aligned}$$

specify data \({\varvec{f}}\) and \(\varvec{g}\).

Table 4 Convergence order, quadratic convergence in the sense of (14), and run-times for Example 4.3

In Table 4, the errors \(\Vert \phi -\phi _j\Vert _X\) and \(\Vert \phi -\phi ^{h_j}\Vert _X\) for the problem with a boundary layer solution of Example 4.3 up to low levels (\(j\le 2\)) exhibit non-optimal convergence behavior since meshes in low levels are in the pre-asymptotic range. We see that the errors have optimal order of convergence by carrying out more uniform refinement (\(j\ge 3\)). However, for \(j\ge 3\) the finite element space \(X_j\) may require too many degrees of freedom, in which case the adaptivity can be incorporated to multi-level algorithms for efficient computation. Indeed, this idea is currently under investigation, which will appear elsewhere. Moreover, since the ratio \(\frac{\Vert \phi ^{h_j}-\phi _{j}\Vert _{X}}{\Vert \phi ^{h_j}-\phi _{j-1}\Vert _{X}^2}\) has almost constant value, we can say that the multi-level algorithm applied to the problem with layer also has quadratic convergence in the sense of (14).

5 Conclusion

We have developed two-grid/multi-level algorithms for a class of nonlinear problems. An important aspect of the proposed algorithms is the use of mesh refinement in conjunction with Newton-type methods for system solution in contrast to the usual Newton’s method on a fixed mesh. The pseudostress-velocity formulation of the stationary, incompressible Navier–Stokes equations as well as the primal formulation of semilinear elliptic problem are considered as applications. Other problems such as scalar elliptic problems with gradient nonlinearities (cf. [16, 18, 23, 26]) can be treated in a similar way. It is numerically validated that the multi-level algorithm has the quadratic convergence in the sense of (14).

In a future study, analysis of upwinding strategy will be incorporated to efficiently treat convection dominated features. It should be noted that our multi-level algorithm can be used on a sequence of adaptively refined meshes and a posteriori error estimators can be constructed for a class of nonlinear equations. The multi-level algorithm with adaptivity can be useful for nonlinear problems with singularity or layer.