1 Introduction

In this paper, let \(\Omega \subset R^d(d=2\ \text{ or }\ 3)\) be a convex polygonal/polyhedral domain (see Gunzburger et al. 1991, 2004). We consider the following steady incompressible MHD problem:

$$\begin{aligned} \left\{ \begin{array}{ll} -R_e^{-1}\Delta \mathbf {u}+\mathbf {u}\cdot \nabla \mathbf {u}+\nabla p - S_c \text{ curl } \mathbf {B}\times \mathbf {B}=\mathbf {f} &{}\quad \text{ in }\ \Omega ,\\ S_c R_m^{-1}\text{ curl }(\text{ curl } \mathbf {B})-S_c \text{ curl }(\mathbf {u}\times \mathbf {B})=\mathbf {g} &{}\quad \text{ in }\ \Omega ,\\ \nabla \cdot \mathbf {u}=0 &{}\quad \text{ in }\ \Omega ,\\ \nabla \cdot \mathbf {B}=0 &{}\quad \text{ in }\ \Omega ,\\ \end{array}\right. \end{aligned}$$
(1.1)

subject to the boundary conditions

$$\begin{aligned} \left\{ \begin{array}{ll} \mathbf {u}=0 &{}\quad \text{ on }\ \partial \Omega ,\\ \mathbf {B}\cdot \mathbf {n}=0 &{}\quad \text{ on }\ \partial \Omega ,\\ \mathbf {n}\times \text{ curl } \mathbf {B}=0 &{}\quad \text{ on }\ \partial \Omega ,\\ \end{array}\right. \end{aligned}$$
(1.2)

where \(\mathbf {u}\) is the velocity field, \(\mathbf {B}\) denotes the magnetic field, and \(\mathbf {f}\) and \(\mathbf {g}\) are the source terms. Here \(\mathbf {n}\) is the unit outward normal vector on \(\partial \Omega \), p is the hydrodynamic pressure, and \(R_e\), \(R_m\) and \(S_c\) are the hydrodynamic Reynolds number, the magnetic Reynolds number and the coupling number, respectively.

The steady incompressible MHD problem describes the interaction between a viscous, incompressible, electrically conducting fluid and an external magnetic field. Namely, it is a coupled system composed of the Navier–Stokes equations of fluid dynamics and Maxwell’s equations, coupled through the Lorentz force and Ohm’s law. We refer to Hughes and Young (1966) and Moreau et al. (1990) for comprehensive accounts of the physical background of the MHD problem. Several papers have been devoted to the design and analysis of numerical schemes for the MHD problem. For example, we refer to Gunzburger et al. (1991, 2004) for the existence and uniqueness of solutions, Discacciati (2008) for the numerical approximation of the steady MHD problem, Hasler et al. (2004) and Schötzau (2004) for the mixed finite element method (FEM), and Dong et al. (2014) and Tao and Zhang (2015) for iterative methods.

The first main difficulty in solving the MHD problem is the presence of the nonlinear terms \(\mathbf {u}\cdot \nabla \mathbf {u}, \text{ curl } \mathbf {B}\times \mathbf {B}\) and \(\text{ curl }(\mathbf {u}\times \mathbf {B})\). The two-level method, pioneered by Marion and Xu (1995) and Xu (1996), is an efficient numerical scheme for treating such nonlinear terms. Its main idea is to first compute an initial approximation by solving the nonlinear problem on a coarse mesh, and then to solve a linear problem on a fine mesh, linearized about the coarse mesh solution. This strategy decreases the computational cost, and the two-level method has therefore been widely studied in recent years. For example, we refer to Girault and Lions (2001), He (2003, 2004) and Zhang and Yang (2014) for the Navier–Stokes equations, Zhang (2013) for the nonlinear parabolic problem, and Zhang et al. (2015a, b) for the natural convection problem. The other main difficulty is that the velocity and the pressure are coupled. The penalty method is one way to overcome this difficulty, and many researchers have studied it for different problems. For example, we refer to Dai (2007) for the pure Neumann problem, and An and Shi (2015), Gunzburger (1989), He (2005) and Shen (1995) for incompressible flow. The above-mentioned literature indicates that the combination of the two-level method and the penalty method is quite efficient for solving nonlinear systems. In particular, the numerical results of An and Shi (2015) and Qiu et al. (2014) show that the two-level iterative FEM saves considerable CPU time compared with the one-level iterative FEM while retaining the same convergence order.
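The two-level idea can be illustrated on a much simpler model than the MHD system. The following minimal Python sketch is an illustration only and is not the scheme analyzed in this paper: it replaces problem (1.1) by the hypothetical 1D model problem \(-u''+u^3=f\) on (0, 1) with homogeneous Dirichlet data and exact solution \(u=\sin (\pi x)\), uses finite differences instead of finite elements, and takes one Newton linearization about the interpolated coarse-grid solution as the fine-grid linear problem. All function names and mesh sizes are assumptions made for this example.

```python
import numpy as np

def residual_and_jacobian(u, x, h):
    """Residual and Jacobian of the finite-difference model -u'' + u^3 = f."""
    n = len(u)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    # Manufactured right-hand side for the exact solution u = sin(pi x).
    f = np.pi**2 * np.sin(np.pi * x) + np.sin(np.pi * x) ** 3
    return A @ u + u**3 - f, A + np.diag(3.0 * u**2)

def newton(u, x, h, steps):
    """Apply a fixed number of Newton steps starting from u."""
    for _ in range(steps):
        r, J = residual_and_jacobian(u, x, h)
        u = u - np.linalg.solve(J, r)
    return u

def two_level(n_coarse=8, n_fine=64):
    """Return the max-norm errors of the coarse solve and of the two-level solve."""
    H, h = 1.0 / n_coarse, 1.0 / n_fine
    x_H = np.linspace(0.0, 1.0, n_coarse + 1)[1:-1]   # interior coarse nodes
    x_h = np.linspace(0.0, 1.0, n_fine + 1)[1:-1]     # interior fine nodes
    # Step 1: solve the nonlinear problem on the coarse grid.
    u_H = newton(np.zeros_like(x_H), x_H, H, steps=20)
    # Step 2: interpolate to the fine grid (boundary values are zero) ...
    u_0 = np.interp(x_h, np.concatenate(([0.0], x_H, [1.0])),
                    np.concatenate(([0.0], u_H, [0.0])))
    # ... and solve ONE linear problem there, linearized about the coarse solution.
    u_h = newton(u_0, x_h, h, steps=1)
    err_coarse = np.max(np.abs(u_H - np.sin(np.pi * x_H)))
    err_two_level = np.max(np.abs(u_h - np.sin(np.pi * x_h)))
    return err_coarse, err_two_level
```

Calling `two_level()` returns the coarse-grid error and the two-level error; only one linear system is solved on the fine grid, which is where the saving in computational cost comes from.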

In this paper we consider the one-level and two-level iterative penalty FEMs for solving problem (1.1). The penalty parameter \(\varepsilon \ (0<\varepsilon \ll 1)\) is a real number. For any positive integer k, the number of iterations, the error estimates of the one-level iterative penalty FEM solution \(((\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k),p_{\varepsilon \mu }^k)\) are

$$\begin{aligned}&\displaystyle |\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1+\Vert p-p_{\varepsilon \mu }^k\Vert _0\le C(\mu +\varepsilon ^{k+1}),\\&\displaystyle \Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _0\le C(\mu ^2+\mu \varepsilon +\varepsilon ^{k+1}), \end{aligned}$$

and the error estimate of two-level iterative penalty FEM solution \(((\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h),p_\varepsilon ^h)\) is

$$\begin{aligned} |\Vert (\mathbf {u-u}_\varepsilon ^h,\mathbf {B-B}_\varepsilon ^h)|\Vert _1+\Vert p-p_\varepsilon ^h\Vert _0\le C (h+H^2+\varepsilon H+ \varepsilon ^{k+1}). \end{aligned}$$

Thus, if we choose \(\varepsilon =\mathcal {O}(H)=\mathcal {O}(h^{1/2})\), the one-level and two-level iterative penalty FEMs have the same convergence order as the standard Galerkin FEM (see Dong et al. 2014). Moreover, the numerical tests show that the two-level iterative penalty FEM saves a large amount of computational time compared with the one-level iterative penalty FEM of the same order.
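The effect of the scaling \(\varepsilon =\mathcal {O}(H)=\mathcal {O}(h^{1/2})\) on the two-level bound \(h+H^2+\varepsilon H+\varepsilon ^{k+1}\) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical and all hidden constants are assumed to equal 1, i.e., \(H=h^{1/2}\) and \(\varepsilon =H\).

```python
def two_level_bound_terms(h, k=1):
    """Terms of h + H^2 + eps*H + eps^(k+1) for H = h^(1/2) and eps = H."""
    H = h ** 0.5      # coarse mesh size (constant assumed to be 1)
    eps = H           # penalty parameter (constant assumed to be 1)
    return {"h": h, "H^2": H**2, "eps*H": eps * H, "eps^(k+1)": eps ** (k + 1)}

for h in (1 / 16, 1 / 64, 1 / 256):
    print(h, two_level_bound_terms(h, k=1))
```

With these choices every term is at most h when \(k\ge 1\) and \(h<1\), so the whole bound is \(\mathcal {O}(h)\).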

The paper is organized as follows: some notations and basic results of problem (1.1) are recalled in Sect. 2; stability and convergence of iterative penalty FEM are presented in Sect. 3; the stability and convergence of two-level iterative penalty FEM are analyzed in Sect. 4; and some numerical experiments are provided to validate the established theoretical analysis in Sect. 5. Finally, some conclusions are given in the last section.

2 Preliminaries

To derive the variational formulation of the steady incompressible MHD problem, we use the standard Sobolev space \(H^j(\Omega )=W^{j,2}(\Omega )\) for any nonnegative integer j with norm \(\Vert v\Vert _j=(\sum _{|\gamma |=0}^j \Vert D^\gamma v\Vert _0^2)^{\frac{1}{2}}\), and the vector-valued space \(\mathbf {H}^j(\Omega )=(H^j(\Omega ))^d\) with the corresponding norm \(\Vert \mathbf {v}\Vert _j=(\sum _{i=1}^d \Vert v_i\Vert _j^2)^{\frac{1}{2}}\) (see Adams 1975; Girault and Raviart 1986 for more details). Furthermore, we introduce the following spaces:

$$\begin{aligned} \mathbf {X}= & {} \mathbf {H}_0^1(\Omega )=\{\mathbf {v}\in \mathbf {H}^1(\Omega ):\mathbf {v}|_{\partial \Omega }=0 \}, \quad M=L_0^2(\Omega )=\left\{ q\in L^2(\Omega ):\int _{\Omega } q \mathrm{d}\mathbf {x}=0\right\} ,\\ \mathbf {W}= & {} \{\mathbf {w}\in \mathbf {H}^1(\Omega ):\mathbf {w}\cdot \mathbf {n}|_{\partial \Omega }=0\},\quad \mathbf {V}=\{\mathbf {v}\in \mathbf {X}:\nabla \cdot \mathbf {v}=0\ \ \text{ in }\ \Omega \},\\ \mathbf {V}_\mathbf {n}= & {} \{\mathbf {w}\in \mathbf {W}:\nabla \cdot \mathbf {w}=0\ \ \text{ in }\ \Omega \}. \end{aligned}$$

The norms \(\Vert \nabla \mathbf {w}\Vert _0\) and \(\Vert \mathbf {w}\Vert _{\mathbf {H}_0^1(\Omega )}\) are equivalent on \(\mathbf {X}\). We denote by \(\mathbf {W}_{0\mathbf {n}}=\mathbf {X\times W}\) the product space equipped with the usual graph norm \(\mathbf {\Vert (w,\Phi )\Vert }_1\) for all \(\mathbf {(w,\Phi )\in W}_{0\mathbf {n}}\), where \(\mathbf {\Vert (w,\Phi )\Vert }_i=(\Vert \mathbf {w}\Vert _i^2+\Vert \mathbf {\Phi }\Vert _i^2)^{1/2} (i=0,1,2)\). The dual space of \(\mathbf {H}_0^1(\Omega )\) is denoted by \(\mathbf {H}^{-1}(\Omega )\) and is equipped with the norm \(\Vert \cdot \Vert _{-1}\). In addition, the following two formulas

$$\begin{aligned} (\mathbf {(a\times b)\times c) \cdot d=(a\times b)\cdot (c\times d)=-(a\times b)\cdot (d\times c)}, \end{aligned}$$

and

$$\begin{aligned} \int _{\Omega }(\nabla \times \mathbf {\Phi })\cdot \mathbf {\Psi } d \mathbf {x} = -\int _{\partial \Omega }\mathbf {(\Phi \times n)\cdot \Psi }d\mathbf {s}+\int _{\Omega }\mathbf {\Phi \cdot (\nabla \times \Psi )}\mathrm{d}\mathbf {x}, \end{aligned}$$

imply that

$$\begin{aligned} (\text{ curl }\mathbf {(w\times \Phi ),\Psi })_\Omega= & {} -\langle \mathbf {(w\times \Phi )\times n,\Psi } \rangle |_{\partial \Omega }+(\mathbf {w \times \Phi } ,\text{ curl }\mathbf {\Psi })_\Omega \\= & {} (\mathbf {w\times \Phi },\text{ curl }\mathbf {\Psi })_\Omega =-(\text{ curl }\mathbf {\Psi \times \Phi ,w})_\Omega ,\quad \forall \mathbf {w\in X,\Phi ,\Psi \in W}, \end{aligned}$$

where \((\cdot ,\cdot )_\Omega \) stands for \(L^2\) inner product on the domain \(\Omega \). Define the trilinear term as follows:

$$\begin{aligned} a_1(\mathbf {u,w,v})= & {} \Bigg (\mathbf {u\cdot \nabla w}+\frac{1}{2}\mathbf {(\nabla \cdot u)w,v\Bigg )}_\Omega \nonumber \\= & {} \frac{1}{2}\mathbf {(u\cdot \nabla w,v)}_\Omega -\frac{1}{2}\mathbf {(u\cdot \nabla v, w)}_\Omega ,\quad \forall \mathbf {u,w,v\in X}. \end{aligned}$$
(2.1)
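The second equality in (2.1) rests on an integration by parts. As a brief sketch for smooth \(\mathbf {u},\mathbf {w},\mathbf {v}\) vanishing on \(\partial \Omega \) (the general case in \(\mathbf {X}\) would follow by a density argument), one has

$$\begin{aligned} (\mathbf {u}\cdot \nabla \mathbf {w},\mathbf {v})_\Omega =-(\mathbf {u}\cdot \nabla \mathbf {v},\mathbf {w})_\Omega -((\nabla \cdot \mathbf {u})\mathbf {w},\mathbf {v})_\Omega , \end{aligned}$$

and therefore

$$\begin{aligned} \Big (\mathbf {u}\cdot \nabla \mathbf {w}+\frac{1}{2}(\nabla \cdot \mathbf {u})\mathbf {w},\mathbf {v}\Big )_\Omega &= \frac{1}{2}(\mathbf {u}\cdot \nabla \mathbf {w},\mathbf {v})_\Omega +\frac{1}{2}\big [(\mathbf {u}\cdot \nabla \mathbf {w},\mathbf {v})_\Omega +((\nabla \cdot \mathbf {u})\mathbf {w},\mathbf {v})_\Omega \big ]\\ &= \frac{1}{2}(\mathbf {u}\cdot \nabla \mathbf {w},\mathbf {v})_\Omega -\frac{1}{2}(\mathbf {u}\cdot \nabla \mathbf {v},\mathbf {w})_\Omega . \end{aligned}$$

In particular, interchanging \(\mathbf {w}\) and \(\mathbf {v}\) changes the sign of \(a_1\), so that \(a_1(\mathbf {u,v,v})=0\).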

With the above notation, for \(\mathbf {f}\in \mathbf {H}^{-1}(\Omega ), \mathbf {g}\in L^2(\Omega )^d\), the weak variational formulation of the steady incompressible MHD problem (1.1) reads: find \((\mathbf {(u,B)},p)\in \mathbf {W}_{0\mathbf {n}}\times M\) such that

$$\begin{aligned}&A_0\mathbf {((u,B),(v,\Psi ))}+A_1\mathbf {((u,B),(u,B),(v,\Psi ))}-d_0(\mathbf {(v,\Psi )},p)+d_0(\mathbf {(u,B)},q)\nonumber \\&\quad =\mathbf {\langle F,(v,\Psi )\rangle },\quad \forall (\mathbf {(v,\Psi )},q)\in \mathbf {W}_{0\mathbf {n}}\times M, \end{aligned}$$
(2.2)

where

$$\begin{aligned} A_0\mathbf {((u,B),(v,\Psi ))}= & {} R_e^{-1} a_0\mathbf {(u,v)}+S_c R_m^{-1} b_0\mathbf {(B,\Psi )},\\ A_1\mathbf {((w,\Phi ),(u,B),(v,\Psi ))}= & {} a_1\mathbf {(w,u,v)}-c\mathbf {(B,\Phi ,v)}+c\mathbf {(\Psi ,\Phi ,u)},\\ d_0(\mathbf {(v,\Psi )},q)= & {} (\nabla \cdot \mathbf {v},q)_\Omega ,\quad a_0\mathbf {(u,v)}=\mathbf {(\nabla u,\nabla v)}_\Omega ,\\ b_0\mathbf {(B,\Psi )}= & {} \mathbf {(\nabla \times B,\nabla \times \Psi )}_\Omega +\mathbf {(\nabla \cdot B,\nabla \cdot \Psi )}_\Omega ,\\ c\mathbf {(B,\Phi ,v)}= & {} S_c(\text{ curl }\mathbf {B\times \Phi ,v})_\Omega ,\quad \mathbf {\langle F,(v,\Psi )\rangle }=\mathbf {\langle f,v\rangle }_\Omega +\mathbf {(g,\Psi )}_\Omega . \end{aligned}$$

Furthermore, we define

$$\begin{aligned} \Vert \mathbf {F}\Vert _{-1}=\sup _{\mathbf {(0,0)\ne (v,\Psi )\in W}_{0\mathbf {n}}}\frac{\mathbf {\langle F,(v,\Psi )\rangle }}{\mathbf {\Vert (v,\Psi )\Vert }_1}. \end{aligned}$$

The following properties of the trilinear form \(a_1(\cdot ,\cdot ,\cdot )\), together with some Sobolev inequalities, are useful for establishing the existence and uniqueness of a solution to problem (2.2) and for deriving the corresponding convergence results (Adams 1975; Girault and Raviart 1986):

$$\begin{aligned}&\displaystyle a_1(\mathbf {u,v,w})=-a_1(\mathbf {u,w,v}),\quad \forall \mathbf {u,v,w\in X},\end{aligned}$$
(2.3)
$$\begin{aligned}&\displaystyle |a_1(\mathbf {u,v,w})|\le C_0^2\mathbf {\Vert \nabla u\Vert }_0\mathbf {\Vert \nabla v\Vert }_0\mathbf {\Vert \nabla w\Vert }_0,\quad \forall \mathbf {w,u,v\in X},\end{aligned}$$
(2.4)
$$\begin{aligned}&\displaystyle |a_1(\mathbf {u,v,w})|\le \frac{N}{2}\Vert \mathbf {u}\Vert _0(\Vert \nabla \mathbf {v}\Vert _0\Vert \mathbf {w}\Vert _{\mathbf {L}^\infty } +\Vert \mathbf {v}\Vert _{\mathbf {L}^6}\Vert \nabla \mathbf {w}\Vert _{\mathbf {L}^3}),\nonumber \\&\displaystyle \quad \forall \mathbf {u\in L}^2(\Omega ),\mathbf {v\in X,w\in L}^\infty (\Omega )\cap \mathbf {X},\end{aligned}$$
(2.5)
$$\begin{aligned}&\displaystyle |a_1(\mathbf {u,v,w})|\le \frac{N}{2}(\Vert \mathbf {u}\Vert _{\mathbf {L}^\infty }\Vert \nabla \mathbf {v}\Vert _0 +\Vert \nabla \mathbf {u}\Vert _{\mathbf {L}^3}\Vert \mathbf {v}\Vert _{\mathbf {L}^6})\Vert \mathbf {w}\Vert _0,\nonumber \\&\displaystyle \quad \forall \mathbf {u\in L}^\infty (\Omega )\cap \mathbf {X},\mathbf {v\in X},\mathbf {w\in L}^2(\Omega ),\end{aligned}$$
(2.6)
$$\begin{aligned}&\displaystyle \Vert \mathbf {v}\Vert _0\le \gamma _0\Vert \nabla \mathbf {v}\Vert _0,\quad \Vert \mathbf {v}\Vert _{\mathbf {L}^3}\le C \Vert \mathbf {v}\Vert _0^{\frac{1}{2}}\Vert \nabla \mathbf {v}\Vert _0^{\frac{1}{2}},\quad \Vert \mathbf {v}\Vert _{\mathbf {L}^6}\le C\Vert \nabla \mathbf {v}\Vert _0,\quad \forall \mathbf {v\in X},\end{aligned}$$
(2.7)
$$\begin{aligned}&\displaystyle \Vert \mathbf {v}\Vert _{\mathbf {L}^\infty }\le C\Vert \mathbf {v}\Vert _1^{\frac{1}{2}}\Vert \mathbf {v}\Vert _2^{\frac{1}{2}},\quad \forall \mathbf {v\in H}^2(\Omega ), \end{aligned}$$
(2.8)

where \(N>0\) is a constant, \(\gamma _0\) (only dependent on \(\Omega \)) is a positive constant and \(C_0\) (only dependent on \(\Omega \)) is an embedding constant of \(\mathbf {H}^1(\Omega )\hookrightarrow \mathbf {L}^4(\Omega )\) (see Adams 1975) (\(\hookrightarrow \) denotes the continuous embedding), namely

$$\begin{aligned} \Vert \mathbf {w}\Vert _{\mathbf {L}^4}\le C_0\Vert \nabla \mathbf {w}\Vert _0, \quad \forall \mathbf {w\in X}. \end{aligned}$$
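As a short sketch of how the \(\mathbf {L}^4\) embedding leads to (2.4), Hölder's inequality with exponents (4, 2, 4) gives, up to the convention chosen for the norms of vector- and matrix-valued functions,

$$\begin{aligned} |(\mathbf {u}\cdot \nabla \mathbf {v},\mathbf {w})_\Omega |\le \Vert \mathbf {u}\Vert _{\mathbf {L}^4}\Vert \nabla \mathbf {v}\Vert _0\Vert \mathbf {w}\Vert _{\mathbf {L}^4}\le C_0^2\Vert \nabla \mathbf {u}\Vert _0\Vert \nabla \mathbf {v}\Vert _0\Vert \nabla \mathbf {w}\Vert _0, \end{aligned}$$

and the same bound holds with \(\mathbf {v}\) and \(\mathbf {w}\) interchanged. Inserting both bounds into the second expression in (2.1) yields

$$\begin{aligned} |a_1(\mathbf {u,v,w})|\le \frac{1}{2}|(\mathbf {u}\cdot \nabla \mathbf {v},\mathbf {w})_\Omega |+\frac{1}{2}|(\mathbf {u}\cdot \nabla \mathbf {w},\mathbf {v})_\Omega |\le C_0^2\Vert \nabla \mathbf {u}\Vert _0\Vert \nabla \mathbf {v}\Vert _0\Vert \nabla \mathbf {w}\Vert _0. \end{aligned}$$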

The trilinear form \(A_1(\cdot ,\cdot ,\cdot )\) is skew-symmetric with respect to its last two arguments, and it satisfies

$$\begin{aligned} A_1\mathbf {((w,\Phi ),(u,B),(u,B))}=0,\ \ \ \forall \mathbf {(w,\Phi ),(u,B)\in W}_{0\mathbf {n}}. \end{aligned}$$
(2.9)
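Identity (2.9) can be verified directly from the definition of \(A_1\). Taking \(\mathbf {(v,\Psi )=(u,B)}\), the two Lorentz-type terms cancel and the convective term vanishes by the second expression in (2.1):

$$\begin{aligned} A_1\mathbf {((w,\Phi ),(u,B),(u,B))}&= a_1\mathbf {(w,u,u)}-c\mathbf {(B,\Phi ,u)}+c\mathbf {(B,\Phi ,u)}\\ &= \frac{1}{2}\mathbf {(w\cdot \nabla u,u)}_\Omega -\frac{1}{2}\mathbf {(w\cdot \nabla u,u)}_\Omega =0. \end{aligned}$$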

To obtain the well-posedness of problem (2.2), we recall the continuity and coercivity of \(A_0(\cdot ,\cdot )\) and the continuity of \(A_1(\cdot ,\cdot ,\cdot )\) (see Gunzburger et al. 1991): for all \(\mathbf {(w,\Phi ),(u,B),(v,\Psi )\in W}_{0\mathbf {n}}\), there hold

$$\begin{aligned} A_0\mathbf {((u,B),(v,\Psi ))}&\le \max \{R_e^{-1},(2+d)S_c R_m^{-1}\}\Vert \mathbf {(u,B)}\Vert _1\Vert \mathbf {(v,\Psi )}\Vert _1,\end{aligned}$$
(2.10)
$$\begin{aligned} A_0\mathbf {((u,B),(u,B))}&\ge \min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert \mathbf {(u,B)}\Vert _1^2,\end{aligned}$$
(2.11)
$$\begin{aligned} A_1\mathbf {((w,\Phi ),(u,B),(v,\Psi ))}&\le \sqrt{2}C_0^2 \max \{1,\sqrt{2}S_c\}\Vert \mathbf {(w,\Phi )}\Vert _1\Vert \mathbf {(u,B)}\Vert _1\Vert \mathbf {(v,\Psi )}\Vert _1, \end{aligned}$$
(2.12)

where \(C_1\) (only dependent on \(\Omega \)) is the constant from the following inequality:

$$\begin{aligned} \Vert \nabla \times \mathbf {\Psi } \Vert _0^2+\Vert \nabla \cdot \mathbf {\Psi } \Vert _0^2 \ge C_1\Vert \mathbf {\Psi } \Vert _1^2,\quad \forall \mathbf {\Psi \in W}, \end{aligned}$$

and the constants \(\sqrt{2}\) and d come from the following two inequalities:

$$\begin{aligned} \Vert \text{ curl }\mathbf {v}\Vert _0\le \sqrt{2}\Vert \nabla \mathbf {v}\Vert _0,\quad \Vert \nabla \cdot \mathbf {v}\Vert _0\le \sqrt{d}\Vert \nabla \mathbf {v}\Vert _0. \end{aligned}$$

Here d is the dimension of the considered domain \(\Omega \).
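The two inequalities above already hold pointwise for the gradient matrix, and integrating over \(\Omega \) gives the \(L^2\) versions. The following Python sketch checks the pointwise statements on random \(3\times 3\) matrices; it is a numerical sanity check under the assumption \(d=3\), not a proof, and the function names are hypothetical.

```python
import numpy as np

def curl_from_gradient(G):
    """Curl of a 3D vector field from its gradient matrix G[i, j] = d v_i / d x_j."""
    return np.array([G[2, 1] - G[1, 2], G[0, 2] - G[2, 0], G[1, 0] - G[0, 1]])

def check_pointwise_bounds(num_samples=1000, seed=0):
    """Check |curl v|^2 <= 2 |grad v|^2 and (div v)^2 <= d |grad v|^2 for d = 3."""
    d = 3
    rng = np.random.default_rng(seed)
    for _ in range(num_samples):
        G = rng.standard_normal((d, d))
        frob2 = np.sum(G**2)                      # squared Frobenius norm of grad v
        curl2 = np.sum(curl_from_gradient(G) ** 2)
        div2 = np.trace(G) ** 2                   # div v is the trace of grad v
        if curl2 > 2.0 * frob2 + 1e-12 or div2 > d * frob2 + 1e-12:
            return False
    return True
```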

Thanks to (2.3)–(2.8), the following properties of \(A_1(\cdot ,\cdot ,\cdot )\) hold (see Lemma 1 of Dong et al. 2014):

$$\begin{aligned}&|A_1\mathbf {((w,\Phi ),(u,B),(v,\Psi ))}|\le C\sqrt{2}C_0^2 \max \{1,\sqrt{2}S_c\}\Vert \mathbf {(w,\Phi )}\Vert _0\Vert \mathbf {(u,B)}\Vert _2\Vert \mathbf {(v,\Psi )}\Vert _1,\nonumber \\&\quad \forall (\mathbf {w,\Phi })\in \mathbf {L}^2(\Omega )\times \mathbf {L}^2(\Omega ),\quad (\mathbf {u,B})\in \mathbf {H}^2(\Omega )\times \mathbf {H}^2(\Omega ),(\mathbf {v,\Psi })\in \mathbf {W}_{0\mathbf {n}}, \end{aligned}$$
(2.13)
$$\begin{aligned}&|A_1\mathbf {((w,\Phi ),(u,B),(v,\Psi ))}|\le C\sqrt{2}C_0^2 \max \{1,\sqrt{2}S_c\}\Vert \mathbf {(w,\Phi )}\Vert _2\Vert \mathbf {(u,B)}\Vert _1\Vert \mathbf {(v,\Psi )}\Vert _0,\nonumber \\&\quad \forall (\mathbf {w,\Phi })\in \mathbf {H}^2(\Omega )\times \mathbf {H}^2(\Omega ), (\mathbf {u,B})\in \mathbf {W}_{0\mathbf {n}},(\mathbf {v,\Psi })\in \mathbf {L}^2(\Omega )\times \mathbf {L}^2(\Omega ). \end{aligned}$$
(2.14)

Throughout this paper, the letter \(C>0\) denotes a generic constant that may take different values at different places and is independent of the mesh size \(\mu \) and the penalty parameter \(\varepsilon \).

The bilinear form \(d_0(\cdot ,\cdot )\) is continuous on \(\mathbf {W}_{0\mathbf {n}}\times M\), and it satisfies (see Gunzburger et al. 1991):

$$\begin{aligned} \sup _{\mathbf {(v,\Psi )\in W}_{0\mathbf {n}}}\frac{|d_0(\mathbf {(v,\Psi )},q)|}{\Vert \mathbf {(v,\Psi )}\Vert _1}\ge \beta _0\Vert q\Vert _0,\quad \forall q\in M. \end{aligned}$$

Moreover, for all \(\mathbf {w\in H}^i(\Omega )\cap \mathbf {X},\ \mathbf {\Phi }\in \mathbf {H}^i(\Omega )\cap \mathbf {W}\ (i=0,1,2),\) we set

$$\begin{aligned} |\Vert (\mathbf {w,\Phi })|\Vert _i=\min \{R_e^{-1},S_c C_1 R_m^{-1}\}(\Vert \mathbf {w}\Vert _i^2+\Vert \mathbf {\Phi }\Vert _i^2)^{\frac{1}{2}}. \end{aligned}$$

We end this section by recalling the following important conclusions.

Theorem 2.1

(See Theorems 1 and 2 of Dong et al. 2014) Suppose that \(R_e, R_m, S_c\), and \(C_1\) satisfy

$$\begin{aligned} 0<\sigma =\frac{\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}\Vert \mathbf {F}\Vert _{-1}}{(\min \{R_e^{-1},S_c C_1 R_m^{-1}\})^2}<1, \end{aligned}$$
(2.15)

then problem (2.2) admits a unique solution \(((\mathbf {u,B}),p)\in \mathbf {W}_{0\mathbf {n}}\times M\). Moreover,

$$\begin{aligned} |\Vert \mathbf {(u,B)}|\Vert _1\le \Vert \mathbf {F}\Vert _{-1}. \end{aligned}$$
(2.16)
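Condition (2.15) is easy to evaluate once the constants are known. The sketch below computes \(\sigma \) from its definition; the function name is hypothetical, and the numerical values used for \(C_0\), \(C_1\) and \(\Vert \mathbf {F}\Vert _{-1}\) are placeholders chosen for illustration, since these quantities depend on the domain and the data.

```python
import math

def sigma(Re, Rm, Sc, C0, C1, F_norm):
    """Evaluate sigma from (2.15); F_norm stands for the dual norm of F."""
    nu = min(1.0 / Re, Sc * C1 / Rm)   # min{R_e^{-1}, S_c C_1 R_m^{-1}}
    return math.sqrt(2.0) * C0**2 * max(1.0, math.sqrt(2.0) * Sc) * F_norm / nu**2

# Placeholder constants (assumptions, not values from the paper).
print(sigma(Re=1.0, Rm=1.0, Sc=1.0, C0=1.0, C1=1.0, F_norm=0.25))   # below 1
print(sigma(Re=10.0, Rm=1.0, Sc=1.0, C0=1.0, C1=1.0, F_norm=0.25))  # above 1
```

Because \(\sigma \) scales like the inverse square of \(\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\), the uniqueness condition becomes restrictive quickly as the Reynolds numbers grow.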

Theorem 2.2

(See Theorem 1 of Zhang et al. 2014) Let \(\Omega \) be a convex polygon/polyhedron and let \(0<\sigma <1\). If \(\mathbf {f,g\in L}^2(\Omega )\), then the solution \((\mathbf {(u,B)},p)\) of problem (2.2) satisfies

$$\begin{aligned} |\Vert \mathbf {(u,B)}|\Vert _2+\Vert p\Vert _1\le C\Vert \mathbf {F}\Vert _0. \end{aligned}$$
(2.17)

3 The stability and convergence of iterative penalty finite element method

3.1 Finite element spaces

Let \(\{\tau _\mu \}\) be a family of shape-regular partitions of \(\Omega \) into triangles (\(d=2\)) or tetrahedra (\(d=3\)) with mesh size \(\mu \). The real parameter \(\mu >0\) takes the value h or \(H\ (h\ll H)\) and tends to 0. The fine grid partition \(\tau _h\) is generated from the coarse grid \(\tau _H\) by mesh refinement. Based on the partitions \(\tau _h\) and \(\tau _H\), we construct the conforming finite element spaces \((\mathbf {X}_h,M_h,\mathbf {W}_h)\) and \((\mathbf {X}_H,M_H,\mathbf {W}_H)\subset (\mathbf {X}_h,M_h,\mathbf {W}_h)\). Denote \(\mathbf {W}_{0\mathbf {n}}^\mu =\mathbf {X}_\mu \times \mathbf {W}_\mu \) and assume that the finite element spaces \(\mathbf {X}_\mu ,\mathbf {W}_\mu \) and \(M_\mu \) satisfy the following assumptions.

Assumption A1 There exist a mapping \(r_\mu \in \mathcal {L}(\mathbf {H}^2(\Omega )\cap \mathbf {V,X}_\mu )\) which satisfies

$$\begin{aligned} (\nabla \cdot (\mathbf {v}-r_\mu \mathbf {v}),q)=0,\quad \Vert \nabla (\mathbf {v}-r_\mu \mathbf {v})\Vert _0\le C\mu \Vert \mathbf {v}\Vert _2, \quad \forall \mathbf {v\in H}^2(\Omega )\cap \mathbf {V}, \forall q\in M_\mu , \end{aligned}$$

and an \(L^2\)-orthogonal projection operator \(\rho _\mu :M\rightarrow M_\mu \) which satisfies

$$\begin{aligned} \Vert q-\rho _\mu q\Vert _0\le C\mu \Vert q\Vert _1, \quad \forall q\in H^1(\Omega )\cap M, \end{aligned}$$

and a mapping \(R_\mu \in \mathcal {L}(\mathbf {H}^2(\Omega )\cap \mathbf {V}_\mathbf {n},\mathbf {W}_\mu )\) which satisfies

$$\begin{aligned} (\nabla \times R_\mu \mathbf {\Phi },\nabla \times \mathbf {\Psi })+(\nabla \cdot R_\mu \mathbf {\Phi },\nabla \cdot \mathbf {\Psi })= & {} (\nabla \times \mathbf {\Phi },\nabla \times \mathbf {\Psi })+(\nabla \cdot \mathbf {\Phi },\nabla \cdot \mathbf {\Psi })\\= & {} (\nabla \times \mathbf {\Phi },\nabla \times \mathbf {\Psi }),\quad \forall \mathbf {\Psi }\in \mathbf {W}_\mu ,\\ \Vert \mathbf {\Phi }-R_\mu \mathbf {\Phi }\Vert _0+\mu \Vert \mathbf {\Phi }-R_\mu \mathbf {\Phi }\Vert _1\le & {} C\mu ^2\Vert \mathbf {\Phi }\Vert _2,\quad \forall \mathbf {\Phi }\in \mathbf {H}^2(\Omega )\cap \mathbf {V}_\mathbf {n}. \end{aligned}$$

Assumption A2 Assume that the bilinear form \(d_0(\cdot ,\cdot )\) satisfies the discrete inf-sup condition, namely, there exists a positive constant \(\beta _0\) such that:

$$\begin{aligned} \sup _{\mathbf {(v,\Psi )\in W}_{0\mathbf {n}}^\mu }\frac{|d_0(\mathbf {(v,\Psi )},q)|}{\Vert \mathbf {(v,\Psi )}\Vert _1}\ge \beta _0\Vert q\Vert _0,\quad \forall q\in M_\mu . \end{aligned}$$

There are many finite element spaces satisfying Assumptions A1 and A2 on a convex polygonal or polyhedral domain \(\Omega \). In this paper we choose stable finite element spaces that have traditionally been used for the Navier–Stokes equations to approximate the velocity and pressure. Specifically, the mini-element is chosen for the velocity and pressure, with the finite element spaces defined as follows:

$$\begin{aligned} \mathbf {X}_\mu =(P_{1,\mu }^b)^d\cap \mathbf {X},\quad M_\mu =\{q_\mu \in C^0(\Omega ):q_\mu |_K\in P_1(K),\quad \forall K\in \tau _\mu \}, \end{aligned}$$

where

$$\begin{aligned} P_{1,\mu }^b=\{v_\mu \in C^0(\Omega ):v_\mu |_K\in P_1(K)\oplus \text{ span }\{\hat{b}\},\quad \forall K\in \tau _\mu \}, \end{aligned}$$

\(P_1(K)\) is the space of polynomials of degree at most 1 on K, and \(\hat{b}\) is a bubble function. The choice of the magnetic field approximation space \(\mathbf {W}_\mu \) is not restricted by an inf-sup condition. For the sake of convenience, we choose the same finite element space for the magnetic field as for the velocity field, i.e., we use \(\mathbf {W}_\mu =(P_{1,\mu }^b)^d \cap \mathbf {W}\) to approximate the magnetic field.
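The text does not give a formula for the bubble function \(\hat{b}\). A common choice for the mini-element in two dimensions, assumed here purely for illustration, is the cubic bubble \(\hat{b}=27\lambda _1\lambda _2\lambda _3\) built from the barycentric coordinates of the triangle. The sketch below evaluates it on the reference triangle with vertices (0, 0), (1, 0), (0, 1); the function names are hypothetical.

```python
def barycentric(x, y):
    """Barycentric coordinates on the reference triangle (0,0), (1,0), (0,1)."""
    return 1.0 - x - y, x, y

def bubble(x, y):
    """Cubic bubble 27 * l1 * l2 * l3 (an assumed, commonly used choice)."""
    l1, l2, l3 = barycentric(x, y)
    return 27.0 * l1 * l2 * l3

print(bubble(1.0 / 3.0, 1.0 / 3.0))  # value at the barycenter
print(bubble(0.5, 0.0))              # value on an edge
```

The bubble vanishes on all three edges, so adding it to \(P_1(K)\) enriches the velocity space inside each element without affecting continuity across element boundaries.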

Now we define the discrete form of the divergence-free space \(\mathbf {V}\) as:

$$\begin{aligned} \mathbf {V}_\mu =\{\mathbf {v}\in \mathbf {X}_\mu :d_0((\mathbf {v,\Psi }),q)=0, \forall q\in M_\mu , \forall \mathbf {\Psi \in W}_\mu \}. \end{aligned}$$

Introduce two \(L^2\)-orthogonal projectors \(P_\mu :\mathbf {L}^2(\Omega )\rightarrow \mathbf {V}_\mu \) and \(R_{0\mu }:\mathbf {L}^2(\Omega )\rightarrow \mathbf {W}_\mu \). Define the discrete Stokes operator \(A_{1\mu }=-P_\mu \Delta _\mu \), where \(\Delta _\mu \) is defined by (see Sermange and Temam 1983)

$$\begin{aligned} -(\Delta _\mu \mathbf {u}_\mu ,\mathbf {v}_\mu )=(\nabla \mathbf {u}_\mu ,\nabla \mathbf {v}_\mu ),\quad \forall \mathbf {u}_\mu , \mathbf {v}_\mu \in \mathbf {X}_\mu , \end{aligned}$$

and its corresponding discrete norm is \(\Vert \mathbf {v}_\mu \Vert _{j,\mu }=\Vert A_{1\mu }^{\frac{j}{2}}\mathbf {v}_\mu \Vert _0\) with the order \(j\in R\), in which

$$\begin{aligned} \Vert \mathbf {v}_\mu \Vert _{1,\mu }=\Vert \nabla \mathbf {v}_\mu \Vert _0,\quad \Vert \mathbf {v}_\mu \Vert _{2,\mu }=\Vert A_{1\mu }\mathbf {v}_\mu \Vert _0,\quad \forall \mathbf {v}_\mu \in \mathbf {V}_\mu . \end{aligned}$$

Similarly, define the discrete operator \(A_{2\mu }\mathbf {B}_\mu =R_{0\mu }(\nabla _\mu \times \nabla \times \mathbf {B}_{\mu }+\nabla _\mu \nabla \cdot \mathbf {B}_{\mu })\in \mathbf {W}_\mu \) as follows (see He 2015; Sermange and Temam 1983)

$$\begin{aligned} (A_{2\mu }\mathbf {B}_\mu ,\mathbf {\Psi })=(A_{2\mu }^{\frac{1}{2}}\mathbf {B}_\mu ,A_{2\mu }^{\frac{1}{2}}\mathbf {\Psi })= (\nabla \times \mathbf {B}_\mu ,\nabla \times \mathbf {\Psi }) +(\nabla \cdot \mathbf {B}_\mu ,\nabla \cdot \mathbf {\Psi }),\quad \forall \mathbf {B}_\mu ,\mathbf {\Psi }\in \mathbf {W}_\mu , \end{aligned}$$

and its corresponding discrete norm is \(\Vert \mathbf {B}_\mu \Vert _{j,\mu }=\Vert A_{2\mu }^{\frac{j}{2}}\mathbf {B}_\mu \Vert _0\) with the order \(j\in R\), in which

$$\begin{aligned} \Vert \mathbf {B}_\mu \Vert _{1,\mu }^2= & {} \Vert A_{2\mu }^{\frac{1}{2}}\mathbf {B}_\mu \Vert _0^2=\Vert \nabla \times \mathbf {B}_\mu \Vert _0^2+\Vert \nabla \cdot \mathbf {B}_\mu \Vert _0^2,\\ \Vert \mathbf {B}_\mu \Vert _{2,\mu }= & {} \Vert \nabla _\mu \times \nabla \times \mathbf {B}_\mu +\nabla _\mu \nabla \cdot \mathbf {B}_\mu \Vert _0. \end{aligned}$$

Moreover, we also introduce some discrete estimates as follows (see Adams 1975; He 2003, 2015)

$$\begin{aligned} \Vert \nabla \mathbf {v}_\mu \Vert _{\mathbf {L}^3}+\Vert \mathbf {v}_\mu \Vert _{\mathbf {L}^\infty }\le C \Vert \nabla \mathbf {v}_\mu \Vert _0^{\frac{1}{2}}\Vert A_{1\mu }\mathbf {v}_\mu \Vert _0^{\frac{1}{2}}, \quad \Vert \nabla \mathbf {v}_\mu \Vert _{\mathbf {L}^6}\le C\Vert A_{1\mu }\mathbf {v}_\mu \Vert _0,\quad \forall \mathbf {v}_\mu \in \mathbf {V}_\mu . \end{aligned}$$

The Galerkin FEM for problem (2.2) reads as: find \(((\mathbf {u}_\mu ,\mathbf {B}_\mu ),p_\mu )\in \mathbf {W}_{0\mathbf {n}}^\mu \times M_\mu \) such that

$$\begin{aligned}&A_0((\mathbf {u}_\mu ,\mathbf {B}_\mu ),(\mathbf {v},\mathbf {\Psi }))+A_1((\mathbf {u}_\mu ,\mathbf {B}_\mu ),(\mathbf {u}_\mu ,\mathbf {B}_\mu ),(\mathbf {v},\mathbf {\Psi })) -d_0((\mathbf {v},\mathbf {\Psi }),p_\mu )\nonumber \\&\quad +\,d_0((\mathbf {u}_\mu ,\mathbf {B}_\mu ),q)=\mathbf {\langle F,(v,\Psi )\rangle },\quad \forall (\mathbf {(v,\Psi )},q)\in \mathbf {W}_{0\mathbf {n}}^\mu \times M_\mu . \end{aligned}$$
(3.1)

Using an argument similar to that of Theorem 2.1, we obtain the following results (see Theorems 3 and 4 of Dong et al. 2014).

Theorem 3.1

Under condition (2.15) and Assumption A1, the discrete problem (3.1) admits a unique solution \(((\mathbf {u}_\mu ,\mathbf {B}_\mu ),p_\mu )\in \mathbf {W}_{0\mathbf {n}}^\mu \times M_\mu \), which satisfies

$$\begin{aligned} |\Vert (\mathbf {u}_\mu ,\mathbf {B}_\mu )|\Vert _1 \le \Vert \mathbf {F}\Vert _{-1}. \end{aligned}$$
(3.2)

Theorem 3.2

Under Assumptions A1 and A2 and condition (2.15), the solution of problem (3.1) satisfies

$$\begin{aligned} |\Vert (A_{1\mu }\mathbf {u}_\mu ,A_{2\mu }\mathbf {B}_\mu )|\Vert _0\le C\Vert \mathbf {F}\Vert _0. \end{aligned}$$
(3.3)

Furthermore, there holds

$$\begin{aligned} \Vert (\mathbf {u-u}_\mu ,\mathbf {B-B}_\mu )\Vert _0+\mu (|\Vert (\mathbf {u-u}_\mu ,\mathbf {B-B}_\mu )|\Vert _1+\Vert p-p_\mu \Vert _0)\le C \mu ^2\Vert \mathbf {F}\Vert _0. \end{aligned}$$
(3.4)
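Estimates such as (3.4) are usually checked in practice by computing observed convergence orders from errors on a sequence of meshes. The helper below is a minimal sketch with a hypothetical function name; the error values in the usage example are synthetic numbers of the form \(3\mu ^2\), generated only to exercise the formula, and are not results from this paper.

```python
import math

def observed_orders(mesh_sizes, errors):
    """Observed order log(e_i / e_{i+1}) / log(mu_i / mu_{i+1}) between consecutive meshes."""
    return [
        math.log(errors[i] / errors[i + 1]) / math.log(mesh_sizes[i] / mesh_sizes[i + 1])
        for i in range(len(errors) - 1)
    ]

mesh_sizes = [1 / 4, 1 / 8, 1 / 16]
synthetic_errors = [3.0 * mu**2 for mu in mesh_sizes]   # synthetic O(mu^2) data
print(observed_orders(mesh_sizes, synthetic_errors))
```

For the \(L^2\) error in (3.4) one would expect observed orders close to 2, and close to 1 for the energy-norm and pressure errors.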

3.2 Penalty finite element method

The penalty FEM for problem (2.2) is as follows: find \(((\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu }),p_{\varepsilon \mu })\in \mathbf {W}_{0\mathbf {n}}^\mu \times M_\mu \) such that for all \((\mathbf {(v,\Psi )},q)\in \mathbf {W}_{0\mathbf {n}}^\mu \times M_\mu \)

$$\begin{aligned}&A_0((\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu }),(\mathbf {v},\mathbf {\Psi })) +A_1((\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu }),(\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu }),(\mathbf {v},\mathbf {\Psi })) -d_0((\mathbf {v},\mathbf {\Psi }),p_{\varepsilon \mu })\nonumber \\&\quad +\,d_0((\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu }),q)+\varepsilon (p_{\varepsilon \mu },q)=\mathbf {\langle F,(v,\Psi )\rangle }, \end{aligned}$$
(3.5)

where \(0<\varepsilon \ll 1\) is a penalty parameter. This is the standard penalty FEM for problem (2.2). Now we present the stability and convergence of the standard penalty FEM.
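The effect of the penalty term \(\varepsilon (p_{\varepsilon \mu },q)\) can be seen on a small linear algebra analogue of (3.5). The sketch below is an illustration under strong simplifying assumptions: the nonlinear term \(A_1\) is dropped, a hypothetical symmetric positive definite matrix `A` stands in for \(A_0\), a hypothetical full-rank matrix `B` stands in for \(d_0\), and the \(L^2\) inner product of pressures is replaced by the Euclidean one. It solves \(Au-B^Tp=f\), \(Bu+\varepsilon p=0\) and compares with the constrained problem \(\varepsilon =0\).

```python
import numpy as np

def model_matrices(n=5):
    """Hypothetical stand-ins: A for A_0 (SPD), B for d_0 (full row rank)."""
    A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    B = np.zeros((2, n))
    B[0, 0] = 1.0
    B[1, 2] = 1.0
    f = np.arange(1.0, n + 1.0)
    return A, B, f

def solve_saddle(A, B, f, eps=0.0):
    """Solve A u - B^T p = f, B u + eps p = 0 (eps = 0 is the constrained problem)."""
    m, n = B.shape
    K = np.block([[A, -B.T], [B, eps * np.eye(m)]])
    sol = np.linalg.solve(K, np.concatenate([f, np.zeros(m)]))
    return sol[:n], sol[n:]

def penalty_error(eps):
    """Euclidean distance between the penalized and the constrained solutions."""
    A, B, f = model_matrices()
    u, p = solve_saddle(A, B, f)
    u_eps, p_eps = solve_saddle(A, B, f, eps)
    return np.sqrt(np.sum((u - u_eps) ** 2) + np.sum((p - p_eps) ** 2))

for eps in (1e-2, 1e-3, 1e-4):
    print(eps, penalty_error(eps))
```

In this linear model the error decreases roughly in proportion to \(\varepsilon \), which is the behavior that a first-order term in \(\varepsilon \) in the error estimate of the standard penalty method reflects.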

Theorem 3.3

Under condition (2.15) and Assumption A1, the discrete problem (3.5) admits a unique solution \(((\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu }),p_{\varepsilon \mu })\in \mathbf {W}_{0\mathbf {n}}^\mu \times M_\mu \), which satisfies

$$\begin{aligned} |\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })|\Vert _1^2 +2\varepsilon \min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert p_{\varepsilon \mu }\Vert _0^2\le \Vert \mathbf {F}\Vert _{-1}^2. \end{aligned}$$
(3.6)

Furthermore, we have

$$\begin{aligned} \Vert p_{\varepsilon \mu }\Vert _0\le C\Vert \mathbf {F}\Vert _{-1}. \end{aligned}$$

Proof

Choosing \((\mathbf {v},\mathbf {\Psi })=(\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\) and \(q=p_{\varepsilon \mu }\) in (3.5) and using (2.11) and (2.9), we get

$$\begin{aligned}&\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\Vert _1^2+\varepsilon \Vert p_{\varepsilon \mu }\Vert _0^2\le \Vert \mathbf {F}\Vert _{-1}\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\Vert _1\\&\quad \le \frac{1}{2}\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\Vert _1^2+\frac{1}{2}(\min \{R_e^{-1},S_c C_1 R_m^{-1}\})^{-1}\Vert \mathbf {F}\Vert _{-1}^2, \end{aligned}$$

thus

$$\begin{aligned} \min \{R_e^{-1},&S_c C_1 R_m^{-1}\}\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\Vert _1^2+2\varepsilon \Vert p_{\varepsilon \mu }\Vert _0^2 \le (\min \{R_e^{-1},S_c C_1 R_m^{-1}\})^{-1}\Vert \mathbf {F}\Vert _{-1}^2.\nonumber \\ \end{aligned}$$
(3.7)
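Estimate (3.6) follows from (3.7) and the definition of the weighted norm \(|\Vert \cdot |\Vert _1\). Writing \(\nu =\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\), so that \(|\Vert (\mathbf {w,\Phi })|\Vert _1=\nu \Vert (\mathbf {w,\Phi })\Vert _1\), and multiplying (3.7) by \(\nu \) gives

$$\begin{aligned} |\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })|\Vert _1^2+2\varepsilon \nu \Vert p_{\varepsilon \mu }\Vert _0^2 =\nu ^2\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\Vert _1^2+2\varepsilon \nu \Vert p_{\varepsilon \mu }\Vert _0^2\le \Vert \mathbf {F}\Vert _{-1}^2. \end{aligned}$$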

On the other hand, taking \(q=0\) in (3.5) and applying (2.10), (2.12) and the discrete inf-sup condition of Assumption A2, we obtain

$$\begin{aligned} \beta _0\Vert p_{\varepsilon \mu }\Vert _0\le & {} \max \{R_e^{-1},(2+d)S_c R_m^{-1}\}\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\Vert _1\\&+\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\Vert _1^2+\Vert \mathbf {F}\Vert _{-1}. \end{aligned}$$

With the help of (3.7), we have

$$\begin{aligned} \beta _0\Vert p_{\varepsilon \mu }\Vert _0\le & {} \left[ \frac{\max \{R_e^{-1},(2+d)S_c R_m^{-1}\}}{\min \{R_e^{-1},S_c C_1 R_m^{-1}\}} +\frac{\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}\Vert \mathbf {F}\Vert _{-1}}{(\min \{R_e^{-1},S_c C_1 R_m^{-1}\})^2}+1\right] \Vert \mathbf {F}\Vert _{-1}\\\le & {} \left[ \frac{\max \{R_e^{-1},(2+d)S_c R_m^{-1}\}}{\min \{R_e^{-1},S_c C_1 R_m^{-1}\}}+\sigma +1\right] \Vert \mathbf {F}\Vert _{-1} \le C \Vert \mathbf {F}\Vert _{-1}, \end{aligned}$$

which completes the proof.

Theorem 3.4

Let \(\Omega \) be a convex polygonal/polyhedral domain. Under Assumptions A1 and A2 and condition (2.15), the solution of problem (3.5) satisfies

$$\begin{aligned} |\Vert (\mathbf {u-u}_{\varepsilon \mu },\mathbf {B-B}_{\varepsilon \mu })|\Vert _1+\Vert p-p_{\varepsilon \mu }\Vert _0\le C(\mu +\varepsilon ). \end{aligned}$$

Proof

Subtracting (3.5) from (2.2), we obtain the following error equation

$$\begin{aligned}&A_0((\mathbf {u-u}_{\varepsilon \mu },\mathbf {B-B}_{\varepsilon \mu }),(\mathbf {v},\mathbf {\Psi })) +A_1((\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu }),(\mathbf {u-u}_{\varepsilon \mu },\mathbf {B-B}_{\varepsilon \mu }),(\mathbf {v},\mathbf {\Psi }))\nonumber \\&\quad +\,A_1((\mathbf {u-u}_{\varepsilon \mu },\mathbf {B-B}_{\varepsilon \mu }),(\mathbf {u},\mathbf {B}),(\mathbf {v},\mathbf {\Psi })) -d_0((\mathbf {v},\mathbf {\Psi }),p-p_{\varepsilon \mu })\nonumber \\&\quad +\,d_0((\mathbf {u-u}_{\varepsilon \mu },\mathbf {B-B}_{\varepsilon \mu }),q)-\varepsilon (p_{\varepsilon \mu },q)=0. \end{aligned}$$
(3.8)
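The two \(A_1\) terms in (3.8) come from the trilinearity of \(A_1\). With the shorthand \(\mathbf {U}=(\mathbf {u},\mathbf {B})\), \(\mathbf {U}_{\varepsilon \mu }=(\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\) and \(\mathbf {V}=(\mathbf {v},\mathbf {\Psi })\), introduced here only for brevity, adding and subtracting \(A_1(\mathbf {U}_{\varepsilon \mu },\mathbf {U},\mathbf {V})\) gives

$$\begin{aligned} A_1(\mathbf {U},\mathbf {U},\mathbf {V})-A_1(\mathbf {U}_{\varepsilon \mu },\mathbf {U}_{\varepsilon \mu },\mathbf {V}) =A_1(\mathbf {U}-\mathbf {U}_{\varepsilon \mu },\mathbf {U},\mathbf {V})+A_1(\mathbf {U}_{\varepsilon \mu },\mathbf {U}-\mathbf {U}_{\varepsilon \mu },\mathbf {V}). \end{aligned}$$

The penalty term enters with a minus sign because it appears only in the discrete problem (3.5) and not in (2.2).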

Taking \((\mathbf {v},\mathbf {\Psi })=(r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })\) and \(q=\rho _\mu p-p_{\varepsilon \mu }\) in (3.8), using (2.9) we have

$$\begin{aligned}&A_0((r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu }),(r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })) +\varepsilon (\rho _\mu p-p_{\varepsilon \mu },\rho _\mu p-p_{\varepsilon \mu })\nonumber \\&\quad \quad +\,A_1((r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu }),(\mathbf {u},\mathbf {B}),(r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu }))\nonumber \\&\quad =A_0((r_\mu \mathbf {u-u},R_\mu \mathbf {B-B}),(r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu }))+\varepsilon (\rho _\mu p,\rho _\mu p-p_{\varepsilon \mu })\nonumber \\&\quad \quad +\,A_1((r_\mu \mathbf {u-u},R_\mu \mathbf {B-B}),(\mathbf {u},\mathbf {B}),(r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu }))\nonumber \\&\quad \quad +\,A_1((\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu }),(r_\mu \mathbf {u-u},R_\mu \mathbf {B-B}), (r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu }))\nonumber \\&\quad \quad +\,d_0((r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu }),p-p_{\varepsilon \mu }) -d_0((\mathbf {u-u}_{\varepsilon \mu },\mathbf {B-B}_{\varepsilon \mu }),\rho _\mu p-p_{\varepsilon \mu }).\nonumber \\ \end{aligned}$$
(3.9)

By Assumption A1, we get

$$\begin{aligned} A_0((r_\mu \mathbf {u-u},R_\mu \mathbf {B-B}),(r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })) =R_e^{-1}a_0(r_\mu \mathbf {u-u},r_\mu \mathbf {u-u}_{\varepsilon \mu }), \end{aligned}$$
(3.10)

and

$$\begin{aligned}&d_0((r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu }),p-p_{\varepsilon \mu }) -d_0((\mathbf {u-u}_{\varepsilon \mu },\mathbf {B-B}_{\varepsilon \mu }),\rho _\mu p-p_{\varepsilon \mu })\nonumber \\&\quad =d_0((r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu }),p-\rho _\mu p). \end{aligned}$$
(3.11)

Using (2.11) and (2.12), we obtain

$$\begin{aligned}&(\min \{R_e^{-1},S_c C_1 R_m^{-1}\}-\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}\Vert (\mathbf {u},\mathbf {B})\Vert _1) \Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })\Vert _1^2\nonumber \\&\quad \quad +\,\varepsilon \Vert \rho _\mu p-p_{\varepsilon \mu }\Vert _0^2\nonumber \\&\quad \le (\{R_e^{-1}+\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}(\Vert (\mathbf {u},\mathbf {B})\Vert _1+\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\Vert _1)\} \Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1\nonumber \\&\quad \quad +\,\sqrt{d}\Vert p-\rho _\mu p\Vert _0 )\nonumber \\&\quad \quad \times \, \Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })\Vert _1+\varepsilon \Vert \rho _\mu p\Vert _0(\Vert p-\rho _\mu p\Vert _0+\Vert p-p_{\varepsilon \mu }\Vert _0). \end{aligned}$$
(3.12)

Choosing \(q=0\) in (3.8), applying (2.10), (2.12) and Assumption A2, one finds

$$\begin{aligned}&\beta _0\Vert p-p_{\varepsilon \mu }\Vert _0\le ((\max \{R_e^{-1},S_c(2+d) R_m^{-1}\}+\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}(\Vert (\mathbf {u},\mathbf {B})\Vert _1\nonumber \\&\quad \quad +\,\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\Vert _1))\Vert (\mathbf {u-u}_{\varepsilon \mu },\mathbf {B-B}_{\varepsilon \mu })\Vert _1\nonumber \\&\quad \le (\max \{R_e^{-1},S_c(2+d) R_m^{-1}\}+\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}(\Vert (\mathbf {u},\mathbf {B})\Vert _1\nonumber \\&\quad \quad +\,\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\Vert _1)) (\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })\Vert _1+\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1).\nonumber \\ \end{aligned}$$
(3.13)

Substituting (3.13) into (3.12), with the conditions of Theorem 2.1 and (3.6), we obtain

$$\begin{aligned}&(1-\sigma )\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })\Vert _1^2 +\varepsilon \Vert \rho _\mu p-p_{\varepsilon \mu }\Vert _0^2\nonumber \\&\quad \le ((R_e^{-1}+2\sigma \min \{R_e^{-1},S_c C_1 R_m^{-1}\})\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1+\sqrt{d}\Vert p-\rho _\mu p\Vert _0 \nonumber \\&\quad \quad +\,\varepsilon \Vert p\Vert _0\beta _0^{-1}(\max \{R_e^{-1},S_c(2+d) R_m^{-1}\}\nonumber \\&\qquad +2\sigma \min \{R_e^{-1},S_c C_1 R_m^{-1}\}) ) \Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })\Vert _1\nonumber \\&\quad \quad +\,\varepsilon \Vert p\Vert _0(\Vert p-\rho _\mu p\Vert _0+\beta _0^{-1}(\max \{R_e^{-1},S_c(2+d) R_m^{-1}\}\nonumber \\&\quad \quad +\,2\sigma \min \{R_e^{-1},S_c C_1 R_m^{-1}\}) \Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1)\nonumber \\&\quad \le \frac{1}{2}(1-\sigma )\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })\Vert _1^2\nonumber \\&\quad \quad +\,C(\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1+\Vert p-\rho _\mu p\Vert _0+\varepsilon )^2+C\varepsilon (\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1\nonumber \\&\qquad +\,\Vert p-\rho _\mu p\Vert _0)\nonumber \\&\quad \le \frac{1}{2}(1-\sigma )\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })\Vert _1^2\nonumber \\&\quad \quad +\,C(\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1+\Vert p-\rho _\mu p\Vert _0+\varepsilon )^2. \end{aligned}$$
(3.14)

By virtue of Assumption A1, we have

$$\begin{aligned}&|\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })|\Vert _1\nonumber \\&\quad \le C(\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1 +\Vert p-\rho _\mu p\Vert _0+\varepsilon )\le C(\mu +\varepsilon ). \end{aligned}$$
(3.15)

Applying the triangle inequality, we obtain

$$\begin{aligned}&|\Vert (\mathbf {u-u}_{\varepsilon \mu },\mathbf {B-B}_{\varepsilon \mu })|\Vert _1\nonumber \\&\quad \le |\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu },R_\mu \mathbf {B-B}_{\varepsilon \mu })|\Vert _1 +|\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})|\Vert _1\le C(\mu +\varepsilon ).\nonumber \\ \end{aligned}$$
(3.16)

Combining (3.13) with (3.16), the error \(\Vert p-p_{\varepsilon \mu }\Vert _0\) can be bounded by

$$\begin{aligned} \Vert p-p_{\varepsilon \mu }\Vert _0\le C|\Vert (\mathbf {u-u}_{\varepsilon \mu },\mathbf {B-B}_{\varepsilon \mu })|\Vert _1\le C(\mu +\varepsilon ). \end{aligned}$$

The proof of Theorem 3.4 is completed.

Next, we consider the relationship between \(((\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu }),p_{\varepsilon \mu })\) and \(((\mathbf {u}_\mu ,\mathbf {B}_\mu ),p_\mu )\) as \(\varepsilon \rightarrow 0\).

Lemma 3.5

Let \(\Omega \) be a convex polygonal/polyhedral domain. Under the Assumptions A1, A2 and (2.15), the solution \(((\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu }),p_{\varepsilon \mu })\) of problem (3.5) converges to the solution \(((\mathbf {u}_\mu ,\mathbf {B}_\mu ),p_\mu )\) of problem (3.1) as \(\varepsilon \rightarrow 0\).

Proof

Subtracting (3.5) from (3.1), we obtain the following error equation

$$\begin{aligned}&A_0((\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu }),(\mathbf {v},\mathbf {\Psi })) +A_1((\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu }),(\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu }),(\mathbf {v},\mathbf {\Psi }))\nonumber \\&\quad +\,A_1((\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu }),(\mathbf {u}_\mu ,\mathbf {B}_\mu ),(\mathbf {v},\mathbf {\Psi })) -d_0((\mathbf {v},\mathbf {\Psi }),p_\mu -p_{\varepsilon \mu })\nonumber \\&\quad +\,d_0((\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu }),q)-\varepsilon (p_{\varepsilon \mu },q)=0. \end{aligned}$$
(3.17)

Taking \((\mathbf {v},\mathbf {\Psi })=(\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu })\) and \(q= p_\mu -p_{\varepsilon \mu }\) in (3.17), using (2.9) we have

$$\begin{aligned}&A_0((\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu }),(\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu }))\\&\quad +A_1((\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu }),(\mathbf {u}_\mu ,\mathbf {B}_\mu ),(\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu })) -\varepsilon (p_{\varepsilon \mu },p_\mu -p_{\varepsilon \mu })=0. \end{aligned}$$

Using (2.11) and (2.12), we obtain

$$\begin{aligned}&(\min \{R_e^{-1},S_c C_1 R_m^{-1}\}-\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}\Vert (\mathbf {u}_\mu ,\mathbf {B}_\mu )\Vert _1) \Vert (\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu })\Vert _1^2\nonumber \\&\quad \le \varepsilon \Vert p_{\varepsilon \mu }\Vert _0 \Vert p_\mu -p_{\varepsilon \mu }\Vert _0. \end{aligned}$$
(3.18)

Here, \(\min \{R_e^{-1},S_c C_1 R_m^{-1}\}-\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}\Vert (\mathbf {u}_\mu ,\mathbf {B}_\mu )\Vert _1\ge \min \{R_e^{-1},S_c C_1 R_m^{-1}\}(1-\sigma )>0\). Choosing \(q=0\) in (3.17), using (2.10), (2.12) and the Assumption A2, one finds

$$\begin{aligned} \beta _0\Vert p_\mu -p_{\varepsilon \mu }\Vert _0\le & {} ((\max \{R_e^{-1},S_c(2+d) R_m^{-1}\}+\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}(\Vert (\mathbf {u}_\mu ,\mathbf {B}_\mu )\Vert _1\nonumber \\&+\Vert (\mathbf {u}_{\varepsilon \mu },\mathbf {B}_{\varepsilon \mu })\Vert _1) )\Vert (\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu })\Vert _1. \end{aligned}$$
(3.19)

Substituting (3.18) into (3.19) and using (3.2) and (3.6), we obtain

$$\begin{aligned} \beta _0\Vert p_\mu -p_{\varepsilon \mu }\Vert _0\le C (\varepsilon \Vert p_{\varepsilon \mu }\Vert _0 \Vert p_\mu -p_{\varepsilon \mu }\Vert _0)^{1/2}, \end{aligned}$$

thus

$$\begin{aligned} \Vert p_\mu -p_{\varepsilon \mu }\Vert _0\le C \varepsilon \Vert p_{\varepsilon \mu }\Vert _0\le C \varepsilon ^{1/2}. \end{aligned}$$
(3.20)
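The passage from the preceding inequality to (3.20) can be spelled out as follows. This is only a sketch: it assumes that the stability estimate (3.6) supplies a bound of the form \(\varepsilon \Vert p_{\varepsilon \mu }\Vert _0^2\le C\), as (3.24) does for the initial iterate. Squaring the inequality and dividing by \(\Vert p_\mu -p_{\varepsilon \mu }\Vert _0\) (the case of a vanishing difference being trivial) gives

$$\begin{aligned}&\beta _0^2\Vert p_\mu -p_{\varepsilon \mu }\Vert _0^2\le C^2\varepsilon \Vert p_{\varepsilon \mu }\Vert _0\Vert p_\mu -p_{\varepsilon \mu }\Vert _0 \quad \Longrightarrow \quad \Vert p_\mu -p_{\varepsilon \mu }\Vert _0\le C^2\beta _0^{-2}\varepsilon \Vert p_{\varepsilon \mu }\Vert _0,\\&\varepsilon \Vert p_{\varepsilon \mu }\Vert _0=\varepsilon ^{1/2}(\varepsilon \Vert p_{\varepsilon \mu }\Vert _0^2)^{1/2}\le C\varepsilon ^{1/2}. \end{aligned}$$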

Then, from (3.20) we know that \(\Vert p_\mu -p_{\varepsilon \mu }\Vert _0\rightarrow 0\) as \(\varepsilon \rightarrow 0\).

Substituting (3.20) into (3.18), we obtain

$$\begin{aligned} \Vert (\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu })\Vert _1^2 \le C \varepsilon \Vert p_{\varepsilon \mu }\Vert _0 \cdot \varepsilon ^{1/2} \le C \varepsilon . \end{aligned}$$
(3.21)

From (3.21) we know that \(\Vert (\mathbf {u}_\mu -\mathbf {u}_{\varepsilon \mu },\mathbf {B}_\mu -\mathbf {B}_{\varepsilon \mu })\Vert _1\rightarrow 0\) as \(\varepsilon \rightarrow 0\). Thus the proof is finished.

3.3 Iterative penalty finite element method

The one-level iterative penalty FEM for problem (2.2) reads as:

Step 1 Find \(((\mathbf {u}_{\varepsilon \mu }^0,\mathbf {B}_{\varepsilon \mu }^0),p_{\varepsilon \mu }^0)\in \mathbf {W}_{0\mathbf {n}}^\mu \times M_\mu \) such that for all \((\mathbf {(v,\Psi )},q)\in \mathbf {W}_{0\mathbf {n}}^\mu \times M_\mu \)

$$\begin{aligned}&A_0((\mathbf {u}_{\varepsilon \mu }^0,\mathbf {B}_{\varepsilon \mu }^0),(\mathbf {v},\mathbf {\Psi })) +A_1((\mathbf {u}_{\varepsilon \mu }^0,\mathbf {B}_{\varepsilon \mu }^0),(\mathbf {u}_{\varepsilon \mu }^0,\mathbf {B}_{\varepsilon \mu }^0),(\mathbf {v},\mathbf {\Psi })) -d_0((\mathbf {v},\mathbf {\Psi }),p_{\varepsilon \mu }^0)\nonumber \\&\quad +\,d_0((\mathbf {u}_{\varepsilon \mu }^0,\mathbf {B}_{\varepsilon \mu }^0),q)+\varepsilon (p_{\varepsilon \mu }^0,q)=\mathbf {\langle F,(v,\Psi )\rangle }. \end{aligned}$$
(3.22)

Step 2 For \(k=1,2,3,\ldots \), find \(((\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k),p_{\varepsilon \mu }^k)\in \mathbf {W}_{0\mathbf {n}}^\mu \times M_\mu \) such that for all \((\mathbf {(v,\Psi )},q)\in \mathbf {W}_{0\mathbf {n}}^\mu \times M_\mu \)

$$\begin{aligned}&A_0((\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k),(\mathbf {v},\mathbf {\Psi })) +A_1((\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k),(\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k),(\mathbf {v},\mathbf {\Psi })) -d_0((\mathbf {v},\mathbf {\Psi }),p_{\varepsilon \mu }^k)\nonumber \\&\quad +\,d_0((\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k),q)+\varepsilon (p_{\varepsilon \mu }^k,q)=\mathbf {\langle F,(v,\Psi )\rangle }+\varepsilon (p_{\varepsilon \mu }^{k-1},q). \end{aligned}$$
(3.23)

From the above scheme, we can see that the initial value \(((\mathbf {u}_{\varepsilon \mu }^0,\mathbf {B}_{\varepsilon \mu }^0),p_{\varepsilon \mu }^0)\) of the one-level iterative penalty FEM is obtained in Step 1. From Theorems 3.3 and 3.4, we obtain the following conclusion.

Theorem 3.6

Under the conditions of Theorem 3.4, the solution \(((\mathbf {u}_{\varepsilon \mu }^0,\mathbf {B}_{\varepsilon \mu }^0),p_{\varepsilon \mu }^0)\) of the problem (3.22) is unique and satisfies

$$\begin{aligned} |\Vert (\mathbf {u}_{\varepsilon \mu }^0,\mathbf {B}_{\varepsilon \mu }^0)|\Vert _1^2 +2\varepsilon \min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert p_{\varepsilon \mu }^0\Vert _0^2\le \Vert \mathbf {F}\Vert _{-1}^2. \end{aligned}$$
(3.24)

Furthermore, it holds that

$$\begin{aligned} |\Vert (\mathbf {u-u}_{\varepsilon \mu }^0,\mathbf {B-B}_{\varepsilon \mu }^0)|\Vert _1+\Vert p-p_{\varepsilon \mu }^0\Vert _0\le C(\mu +\varepsilon ). \end{aligned}$$
(3.25)

Now we study the stability of the one-level iterative penalty FEM solution of (3.23).

Theorem 3.7

Under the conditions of Theorem 3.3, suppose that \(((\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k),p_{\varepsilon \mu }^k)\in \mathbf {W}_{0\mathbf {n}}^\mu \times M_\mu \) is the solution of the discrete problem (3.23). Then the solution satisfies

$$\begin{aligned} |\Vert (\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k)|\Vert _1^2 +\varepsilon \min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert p_{\varepsilon \mu }^k\Vert _0^2\le (k+1)\Vert \mathbf {F}\Vert _{-1}^2. \end{aligned}$$
(3.26)

Proof

Taking \((\mathbf {v},\mathbf {\Psi })=(\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k)\) and \(q=p_{\varepsilon \mu }^k\) in (3.23) and using (2.9), (2.11) and (2.12), we find

$$\begin{aligned}&\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k)\Vert _1^2\!+\!\varepsilon \Vert p_{\varepsilon \mu }^k\Vert _0^2\!\le \! \Vert \mathbf {F}\Vert _{-1}\Vert (\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k)\Vert _1\!+\!\varepsilon \Vert p_{\varepsilon \mu }^{k-1}\Vert _0\Vert p_{\varepsilon \mu }^k\Vert _0\\&\quad \le \frac{1}{2}\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k)\Vert _1^2+\frac{1}{2}(\min \{R_e^{-1},S_c C_1 R_m^{-1}\})^{-1}\Vert \mathbf {F}\Vert _{-1}^2\\&\quad \quad +\frac{1}{2}\varepsilon \Vert p_{\varepsilon \mu }^{k-1}\Vert _0^2+\frac{1}{2}\varepsilon \Vert p_{\varepsilon \mu }^k\Vert _0^2. \end{aligned}$$

Applying this bound recursively and using (3.24), we obtain

$$\begin{aligned}&|\Vert (\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k)|\Vert _1^2+\varepsilon \min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert p_{\varepsilon \mu }^k\Vert _0^2\\&\quad \le \Vert \mathbf {F}\Vert _{-1}^2+\varepsilon \min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert p_{\varepsilon \mu }^{k-1}\Vert _0^2\\&\quad \le k\Vert \mathbf {F}\Vert _{-1}^2+\varepsilon \min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert p_{\varepsilon \mu }^{0}\Vert _0^2\le (k+1)\Vert \mathbf {F}\Vert _{-1}^2, \end{aligned}$$

which implies (3.26). The proof of Theorem 3.7 is completed.
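To make the recursion explicit, write \(m=\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\) and \(a_k=|\Vert (\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k)|\Vert _1^2+\varepsilon m\Vert p_{\varepsilon \mu }^k\Vert _0^2\); both abbreviations are introduced here only for illustration. The sketch also assumes, as the form of (3.24) suggests, that \(|\Vert \cdot |\Vert _1=m\Vert \cdot \Vert _1\). Absorbing the terms carrying the factor \(\frac{1}{2}\) into the left-hand side of the energy inequality and multiplying by \(2m\) then gives

$$\begin{aligned} a_k\le \Vert \mathbf {F}\Vert _{-1}^2+\varepsilon m\Vert p_{\varepsilon \mu }^{k-1}\Vert _0^2\le \Vert \mathbf {F}\Vert _{-1}^2+a_{k-1}\le \cdots \le k\Vert \mathbf {F}\Vert _{-1}^2+a_0\le (k+1)\Vert \mathbf {F}\Vert _{-1}^2, \end{aligned}$$

where the last step uses \(a_0\le \Vert \mathbf {F}\Vert _{-1}^2\), which follows from (3.24).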

Next, we present the convergence of the one-level iterative penalty FEM.

Theorem 3.8

Under the conditions of Theorem 3.4, the solution of problem (3.23) satisfies

$$\begin{aligned} |\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1+\Vert p-p_{\varepsilon \mu }^k\Vert _0\le C(\mu +\varepsilon ^{k+1}). \end{aligned}$$
(3.27)

Proof

By Theorem 3.6, (3.27) holds for \(k=0\). We proceed by induction and assume that (3.27) holds for \(k-1\).

From (3.23) and (2.2), we obtain

$$\begin{aligned}&A_0((\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),(\mathbf {v},\mathbf {\Psi })) +A_1((\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k),(\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),(\mathbf {v},\mathbf {\Psi }))\nonumber \\&\quad +\,A_1((\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),(\mathbf {u},\mathbf {B}),(\mathbf {v},\mathbf {\Psi })) -d_0((\mathbf {v},\mathbf {\Psi }),p-p_{\varepsilon \mu }^k)\nonumber \\&\quad +\,d_0((\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),q)-\varepsilon (p_{\varepsilon \mu }^k,q)+\varepsilon (p_{\varepsilon \mu }^{k-1},q)=0. \end{aligned}$$
(3.28)

Taking \((\mathbf {v},\mathbf {\Psi })=(r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k)\) and \(q=\rho _\mu p-p_{\varepsilon \mu }^k\) in (3.28) and using (2.9), we get

$$\begin{aligned}&A_0((r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k),(r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k))\\&\quad \quad +A_1((r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k),(\mathbf {u},\mathbf {B}),(r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k))\\&\quad \quad +\,\varepsilon (\rho _\mu p-p_{\varepsilon \mu }^k,\rho _\mu p-p_{\varepsilon \mu }^k)\\&\quad =A_0((r_\mu \mathbf {u-u},R_\mu \mathbf {B-B}),(r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k))\\&\quad \quad +\,A_1((r_\mu \mathbf {u-u},R_\mu \mathbf {B-B}),(\mathbf {u},\mathbf {B}),(r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k))\\&\quad \quad +\,A_1((\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k),(r_\mu \mathbf {u-u},R_\mu \mathbf {B-B}), (r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k))\\&\quad \quad +\,d_0((r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k),p-p_{\varepsilon \mu }^k) -d_0((\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),\rho _\mu p-p_{\varepsilon \mu }^k)\\&\quad \quad +\,\varepsilon (\rho _\mu p-p_{\varepsilon \mu }^{k-1},\rho _\mu p-p_{\varepsilon \mu }^k). \end{aligned}$$

With the Assumption A1, we obtain

$$\begin{aligned} A_0((r_\mu \mathbf {u-u},R_\mu \mathbf {B-B}),(r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k)) =R_e^{-1}a_0(r_\mu \mathbf {u-u},r_\mu \mathbf {u-u}_{\varepsilon \mu }^k), \end{aligned}$$

and

$$\begin{aligned}&d_0((r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k),p-p_{\varepsilon \mu }^k) -d_0((\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),\rho _\mu p-p_{\varepsilon \mu }^k)\\&\quad =d_0((r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k),p-\rho _\mu p). \end{aligned}$$

Using (2.11) and (2.12), one finds

$$\begin{aligned}&(\min \{R_e^{-1},S_c C_1 R_m^{-1}\}-\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}\Vert (\mathbf {u},\mathbf {B})\Vert _1) \Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1^2\nonumber \\&\quad \quad +\,\varepsilon \Vert \rho _\mu p-p_{\varepsilon \mu }^k\Vert _0^2\nonumber \\&\quad \le (\{R_e^{-1}+\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}(\Vert (\mathbf {u},\mathbf {B})\Vert _1+\Vert (\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k)\Vert _1)\} \Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1\nonumber \\&\quad \quad +\,\sqrt{d}\Vert p-\rho _\mu p\Vert _0 )\nonumber \\&\quad \quad \times \,\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1+\varepsilon (\Vert \rho _\mu p-p\Vert _0+\Vert p-p_{\varepsilon \mu }^{k-1}\Vert _0)\nonumber \\&\quad \quad \times \,(\Vert \rho _\mu p-p\Vert _0+\Vert p-p_{\varepsilon \mu }^{k}\Vert _0). \end{aligned}$$
(3.29)

Choosing \(q=0\) in (3.28) and combining (2.10), (2.12) and Assumption A2, we get

$$\begin{aligned}&\beta _0 \Vert p-p_{\varepsilon \mu }^k\Vert _0\le ((\max \{R_e^{-1},S_c(2+d) R_m^{-1}\}+\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}(\Vert (\mathbf {u},\mathbf {B})\Vert _1\nonumber \\&\quad \quad +\Vert (\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k)\Vert _1))\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1\nonumber \\&\quad \le (\max \{R_e^{-1},S_c(2+d) R_m^{-1}\}+\sqrt{2}C_0^2\max \{1,\sqrt{2}S_c\}(\Vert (\mathbf {u},\mathbf {B})\Vert _1\nonumber \\&\quad \quad +\Vert (\mathbf {u}_{\varepsilon \mu }^k,\mathbf {B}_{\varepsilon \mu }^k)\Vert _1)) (\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1+\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1).\nonumber \\ \end{aligned}$$
(3.30)

Substituting (3.30) into (3.29) and using (2.16) and (3.26), we obtain

$$\begin{aligned}&(1-\sigma )\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1^2 +\varepsilon \Vert \rho _\mu p-p_{\varepsilon \mu }^k\Vert _0^2\\&\quad \le ((R_e^{-1}+2\sigma \min \{R_e^{-1},S_c C_1 R_m^{-1}\})\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1+\sqrt{d}\Vert p-\rho _\mu p\Vert _0\\&\quad \quad +\,\varepsilon (\Vert \rho _\mu p-p\Vert _0+\Vert p-p_{\varepsilon \mu }^{k-1}\Vert _0)\beta _0^{-1}(\max \{R_e^{-1},S_c(2+d) R_m^{-1}\}\\&\qquad +C \min \{R_e^{-1},S_c C_1 R_m^{-1}\}) )\\&\quad \quad \times \Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1+\varepsilon (\Vert \rho _\mu p-p\Vert _0+\Vert p-p_{\varepsilon \mu }^{k-1}\Vert _0)(\Vert p-\rho _\mu p\Vert _0\\&\quad \quad +\,\beta _0^{-1}(\max \{R_e^{-1},S_c(2+d) R_m^{-1}\}+C \min \{R_e^{-1},S_c C_1 R_m^{-1}\}) \Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1)\\&\quad \le \frac{1}{2}(1-\sigma )\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1^2\\&\quad \quad +\,C(\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1+\Vert p-\rho _\mu p\Vert _0+\varepsilon (\Vert \rho _\mu p-p\Vert _0+\Vert p-p_{\varepsilon \mu }^{k-1}\Vert _0))^2\\&\quad \quad +\,C\varepsilon (\Vert \rho _\mu p-p\Vert _0+\Vert p-p_{\varepsilon \mu }^{k-1}\Vert _0)(\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1+\Vert p-\rho _\mu p\Vert _0)\\&\quad \le \frac{1}{2}(1-\sigma )\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1^2\\&\quad \quad +\,C(\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1+\Vert p-\rho _\mu p\Vert _0+\varepsilon (\Vert \rho _\mu p-p\Vert _0+\Vert p-p_{\varepsilon \mu }^{k-1}\Vert _0))^2. \end{aligned}$$

Using Assumption A1, we get

$$\begin{aligned} |\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1\le & {} C(\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1 +\Vert p-\rho _\mu p\Vert _0\\&+\varepsilon (\Vert \rho _\mu p-p\Vert _0+\Vert p-p_{\varepsilon \mu }^{k-1}\Vert _0))\le C(\mu +\varepsilon ^{k+1}). \end{aligned}$$
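The last inequality is where the induction hypothesis enters. As a sketch, assume \(\varepsilon \le 1\) and that Assumption A1 yields \(\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})\Vert _1+\Vert p-\rho _\mu p\Vert _0\le C\mu \), as already used in (3.15). Together with (3.27) for \(k-1\), the penalty contribution is then bounded by

$$\begin{aligned} \varepsilon (\Vert \rho _\mu p-p\Vert _0+\Vert p-p_{\varepsilon \mu }^{k-1}\Vert _0)\le \varepsilon (C\mu +C(\mu +\varepsilon ^{k}))\le C(\mu +\varepsilon ^{k+1}). \end{aligned}$$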

From the triangle inequality, we obtain

$$\begin{aligned} |\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1\le & {} |\Vert (r_\mu \mathbf {u-u}_{\varepsilon \mu }^k,R_\mu \mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1 +|\Vert (r_\mu \mathbf {u-u},R_\mu \mathbf {B-B})|\Vert _1\nonumber \\\le & {} C(\mu +\varepsilon ^{k+1}). \end{aligned}$$
(3.31)

Combining (3.30) with (3.31), the error \(\Vert p-p_{\varepsilon \mu }^k\Vert _0\) can be bounded by

$$\begin{aligned} \Vert p-p_{\varepsilon \mu }^k\Vert _0\le C|\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1\le C(\mu +\varepsilon ^{k+1}). \end{aligned}$$
(3.32)

Thus, the proof of Theorem 3.8 is completed.
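The role of the factor \(\varepsilon ^{k+1}\) in (3.27) can be illustrated on a small linear algebra analogue of the scheme (3.22)–(3.23). The following Python sketch is not the method analysed above: it drops the nonlinear term \(A_1\), replaces the pressure mass matrix by the identity, and uses hypothetical matrices `A`, `B` and a hypothetical right-hand side `f` chosen only for illustration. It solves \(A\mathbf {u}-B^{T}p=\mathbf {f}\), \(B\mathbf {u}=0\) exactly and compares the result with the iterates of \(A\mathbf {u}^k-B^{T}p^k=\mathbf {f}\), \(B\mathbf {u}^k+\varepsilon p^k=\varepsilon p^{k-1}\).

```python
import numpy as np

# Toy linear saddle-point problem (hypothetical data, nonlinear term dropped):
#   A u - B^T p = f,   B u = 0.
A = np.diag([1.0, 2.0, 3.0, 4.0])
B = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 1.0]])
f = np.array([1.0, 2.0, 3.0, 4.0])
n, m = A.shape[0], B.shape[0]

# Reference solution of the unpenalized system.
K_exact = np.block([[A, -B.T], [B, np.zeros((m, m))]])
sol = np.linalg.solve(K_exact, np.concatenate([f, np.zeros(m)]))
u_exact, p_exact = sol[:n], sol[n:]


def iterative_penalty(A, B, f, eps, num_iters):
    """Return the pressure iterates p^0, ..., p^num_iters."""
    n, m = A.shape[0], B.shape[0]
    # Penalized block matrix; the identity stands in for the pressure mass matrix.
    K = np.block([[A, -B.T], [B, eps * np.eye(m)]])
    p_prev = np.zeros(m)  # Step 1 has no correction term on the right-hand side
    pressures = []
    for _ in range(num_iters + 1):
        rhs = np.concatenate([f, eps * p_prev])
        p_prev = np.linalg.solve(K, rhs)[n:]
        pressures.append(p_prev)
    return pressures


eps = 0.1
pressures = iterative_penalty(A, B, f, eps, num_iters=4)
errors = [np.linalg.norm(p_exact - p) for p in pressures]

# Contraction factor eps / (lambda_min(S) + eps), with S = B A^{-1} B^T.
S = B @ np.linalg.solve(A, B.T)
rho = eps / (np.linalg.eigvalsh(S).min() + eps)

for k, err in enumerate(errors):
    bound = rho ** (k + 1) * np.linalg.norm(p_exact)
    print(f"k={k}  error={err:.3e}  bound={bound:.3e}")
```

In this linear setting the pressure error satisfies \((S+\varepsilon I)(p-p^k)=\varepsilon (p-p^{k-1})\) with \(S=BA^{-1}B^{T}\), so each pass gains a factor of at most \(\varepsilon /(\lambda _{\min }(S)+\varepsilon )\). This mirrors, but does not prove, the \(\varepsilon ^{k+1}\) term in (3.27).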

Next, we present the \(\mathbf {L}^2\) error estimate \(\Vert (\mathbf {u-u}_{\varepsilon \mu }^{k},\mathbf {B-B}_{\varepsilon \mu }^{k})\Vert _0\). To this end, we consider the following dual problem: find \((\mathbf {(w,\Phi )},s)\in \mathbf {W}_{0\mathbf {n}}\times M\) such that

$$\begin{aligned}&A_0(\mathbf {(v,\Psi )},\mathbf {(w,\Phi )})+A_1(\mathbf {(u,B)},\mathbf {(v,\Psi )},\mathbf {(w,\Phi )}) +A_1(\mathbf {(v,\Psi )},\mathbf {(u,B)},\mathbf {(w,\Phi )})\nonumber \\&\quad \quad -\,d_0(\mathbf {(v,\Psi )},s)+d_0(\mathbf {(w,\Phi )},q)\nonumber \\&\quad =((\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),\mathbf {(v,\Psi )}),\quad \forall (\mathbf {(v,\Psi )},q)\in \mathbf {W}_{0\mathbf {n}}\times M.\nonumber \\ \end{aligned}$$
(3.33)

If the solution of problem (3.33) satisfies \(\mathbf {w}\in \mathbf {H}^2(\Omega )\cap \mathbf {X},\ \mathbf {\Phi }\in \mathbf {H}^2(\Omega )\cap \mathbf {W}\), then we have (see Gunzburger et al. 1991)

$$\begin{aligned} \Vert \mathbf {(w,\Phi )}\Vert _2+\Vert s\Vert _1\le C \Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _0. \end{aligned}$$
(3.34)

Theorem 3.9

Under the conditions of Theorem 3.4, the solution of problem (3.23) satisfies

$$\begin{aligned} \Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _0\le C(\mu ^2+\mu \varepsilon +\varepsilon ^{k+1}). \end{aligned}$$
(3.35)

Proof

Choosing \(\mathbf {(v,\Psi )}=(r_\mu \mathbf {w},R_\mu \mathbf {\Phi })\) and \(q=-\rho _\mu s\) in (3.28), subtracting it from (3.33) with \(\mathbf {(v,\Psi )}=(\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\) and \(q=p_{\varepsilon \mu }^k-p\), we obtain

$$\begin{aligned}&\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _0^2=A_0((\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),(\mathbf {w}-r_\mu \mathbf {w},\mathbf {\Phi }-R_\mu \mathbf {\Phi }))\\&\quad +\,A_1((\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),\mathbf {(u,B)},(\mathbf {w}-r_\mu \mathbf {w},\mathbf {\Phi }-R_\mu \mathbf {\Phi }))\\&\quad +\,A_1(\mathbf {(u,B)},(\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),(\mathbf {w}-r_\mu \mathbf {w},\mathbf {\Phi }-R_\mu \mathbf {\Phi }))\\&\quad -A_1((\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),(\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),(\mathbf {w}-r_\mu \mathbf {w},\mathbf {\Phi }-R_\mu \mathbf {\Phi }))\\&\quad +\,A_1((\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),(\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),\mathbf {(w,\Phi )})\\&\quad -d_0((\mathbf {w}-r_\mu \mathbf {w},\mathbf {\Phi }-R_\mu \mathbf {\Phi }),p-p_{\varepsilon \mu }^k) -d_0((\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k),s-\rho _\mu s)\\&\quad -\varepsilon (p_{\varepsilon \mu }^k,\rho _\mu s)+\varepsilon (p_{\varepsilon \mu }^{k-1},\rho _\mu s). \end{aligned}$$

Applying the conditions of Theorem 2.1, (2.10), (2.12), (3.6) and (3.34), one finds

$$\begin{aligned}&\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _0^2\\&\quad \le \text{ max }\{R_e^{-1},(2+d)S_c R_m^{-1}\}\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1\Vert (\mathbf {w}-r_\mu \mathbf {w},\mathbf {\Phi }-R_\mu \mathbf {\Phi })\Vert _1\\&\quad \quad +\,2\sqrt{2}C_0^2\text{ max }\{1,\sqrt{2}S_c\}\Vert \mathbf {(u,B)}\Vert _1\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1\Vert (\mathbf {w}-r_\mu \mathbf {w},\mathbf {\Phi }-R_\mu \mathbf {\Phi })\Vert _1\\&\quad \quad +\,\sqrt{2}C_0^2\text{ max }\{1,\sqrt{2}S_c\}\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1^2(\Vert (\mathbf {w}-r_\mu \mathbf {w},\mathbf {\Phi }\!-\!R_\mu \mathbf {\Phi })\Vert _1 \!+\!\Vert (\mathbf {w,\Phi })\Vert _1)\\&\quad \quad +\,\sqrt{d}\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _1\Vert s-\rho _\mu s\Vert _0+\sqrt{d}\Vert (\mathbf {w}-r_\mu \mathbf {w},\mathbf {\Phi }-R_\mu \mathbf {\Phi })\Vert _1\Vert p-p_{\varepsilon \mu }^k\Vert _0\\&\quad \le C(\mu |\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1 +|\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1^2)\Vert (\mathbf {w,\Phi })\Vert _2\\&\quad \quad +\,C\mu (|\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1\Vert s\Vert _1 +\Vert (\mathbf {w,\Phi })\Vert _2\Vert p-p_{\varepsilon \mu }^k\Vert _0)\\&\quad \quad +\,\varepsilon \Vert p-p_{\varepsilon \mu }^k\Vert _0\Vert s\Vert _0+\varepsilon \Vert p-p_{\varepsilon \mu }^{k-1}\Vert _0\Vert s\Vert _0\\&\quad \le C(\mu (|\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1 +\Vert p-p_{\varepsilon \mu }^k\Vert _0)+|\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1^2\\&\quad \quad +\,\varepsilon \Vert 
p-p_{\varepsilon \mu }^k\Vert _0+\varepsilon \Vert p-p_{\varepsilon \mu }^{k-1}\Vert _0 )\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _0. \end{aligned}$$

By Theorem 3.8, we have

$$\begin{aligned}&\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)\Vert _0\le C\left( \mu (|\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1+\Vert p-p_{\varepsilon \mu }^k\Vert _0)\right. \\&\quad \quad \left. +|\Vert (\mathbf {u-u}_{\varepsilon \mu }^k,\mathbf {B-B}_{\varepsilon \mu }^k)|\Vert _1^2+\varepsilon \Vert p-p_{\varepsilon \mu }^k\Vert _0+\varepsilon \Vert p-p_{\varepsilon \mu }^{k-1}\Vert _0\right) \\&\quad \le C(\mu (\mu +\varepsilon ^{k+1})+(\mu +\varepsilon ^{k+1})^2+\varepsilon (\mu +\varepsilon ^k))\le C(\mu ^2+\mu \varepsilon +\varepsilon ^{k+1}). \end{aligned}$$

As a consequence, the desired result is obtained.
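The final inequality is elementary. As a sketch, assuming \(\varepsilon \le 1\) so that higher powers of \(\varepsilon \) are dominated by lower ones, expanding the three groups of terms gives

$$\begin{aligned}&\mu (\mu +\varepsilon ^{k+1})+(\mu +\varepsilon ^{k+1})^2+\varepsilon (\mu +\varepsilon ^k)\\&\quad =2\mu ^2+3\mu \varepsilon ^{k+1}+\varepsilon ^{2k+2}+\mu \varepsilon +\varepsilon ^{k+1} \le 2\mu ^2+4\mu \varepsilon +2\varepsilon ^{k+1}. \end{aligned}$$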

4 Two-level iterative penalty finite element method

In this section, we consider the stability and convergence of the two-level iterative penalty FEM for the stationary incompressible MHD problem.

The two-level iterative penalty FEM based on Stokes iteration can be described as follows.

Step 1 Find \(((\mathbf {u}_{\varepsilon H}^0,\mathbf {B}_{\varepsilon H}^0),p_{\varepsilon H}^0)\in \mathbf {W}_{0\mathbf {n}}^H\times M_H\) such that for all \((\mathbf {(v,\Psi )},q)\in \mathbf {W}_{0\mathbf {n}}^H \times M_H\)

$$\begin{aligned}&A_0((\mathbf {u}_{\varepsilon H}^0,\mathbf {B}_{\varepsilon H}^0),(\mathbf {v},\mathbf {\Psi })) +A_1((\mathbf {u}_{\varepsilon H}^0,\mathbf {B}_{\varepsilon H}^0),(\mathbf {u}_{\varepsilon H}^0,\mathbf {B}_{\varepsilon H}^0),(\mathbf {v},\mathbf {\Psi })) -d_0((\mathbf {v},\mathbf {\Psi }),p_{\varepsilon H}^0)\nonumber \\&\quad +\,d_0((\mathbf {u}_{\varepsilon H}^0,\mathbf {B}_{\varepsilon H}^0),q)+\varepsilon (p_{\varepsilon H}^0,q)=\mathbf {\langle F,(v,\Psi )\rangle }. \end{aligned}$$
(4.1)

Step 2 For \(n=1,2,3,\ldots ,k\), find \(((\mathbf {u}_{\varepsilon H}^n,\mathbf {B}_{\varepsilon H}^n),p_{\varepsilon H}^n)\in \mathbf {W}_{0\mathbf {n}}^H\times M_H\) such that for all \((\mathbf {(v,\Psi )},q)\in \mathbf {W}_{0\mathbf {n}}^H \times M_H\)

$$\begin{aligned}&A_0((\mathbf {u}_{\varepsilon H}^n,\mathbf {B}_{\varepsilon H}^n),(\mathbf {v},\mathbf {\Psi })) +A_1((\mathbf {u}_{\varepsilon H}^n,\mathbf {B}_{\varepsilon H}^n),(\mathbf {u}_{\varepsilon H}^n,\mathbf {B}_{\varepsilon H}^n),(\mathbf {v},\mathbf {\Psi })) -d_0((\mathbf {v},\mathbf {\Psi }),p_{\varepsilon H}^n)\nonumber \\&\quad +\,d_0((\mathbf {u}_{\varepsilon H}^n,\mathbf {B}_{\varepsilon H}^n),q)+\varepsilon (p_{\varepsilon H}^n,q)=\mathbf {\langle F,(v,\Psi )\rangle }+\varepsilon (p_{\varepsilon H}^{n-1},q). \end{aligned}$$
(4.2)

In Step 3, we solve a Stokes iterative MHD problem on the fine mesh.

Step 3 Find \(((\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h),p_\varepsilon ^h)\in \mathbf {W}_{0\mathbf {n}}^h \times M_h\) such that for any \(((\mathbf {v,\Psi }),q)\in \mathbf {W}_{0\mathbf {n}}^h \times M_h\)

$$\begin{aligned}&A_0((\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h),(\mathbf {v,\Psi })) +A_1((\mathbf {u}_{\varepsilon H}^k,\mathbf {B}_{\varepsilon H}^k),(\mathbf {u}_{\varepsilon H}^k,\mathbf {B}_{\varepsilon H}^k),(\mathbf {v,\Psi })) -d_0((\mathbf {v,\Psi }),p_\varepsilon ^h)\nonumber \\&\quad +\,d_0((\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h),q)+\varepsilon (p_\varepsilon ^h,q)=\mathbf {\langle F,(v,\Psi )\rangle } +\varepsilon (p_{\varepsilon H}^k,q). \end{aligned}$$
(4.3)

Remark 4.1

In our two-level iterative penalty FEM, we adopt the Stokes iteration to treat the nonlinear terms. Other iterative schemes, such as the Newton iteration and the Oseen iteration, can also be used. We omit the analysis of these schemes because the proofs are similar.

Now we present the stability of the two-level iterative penalty FEM.

Theorem 4.2

Under the conditions of Theorem 3.3, the solution \(((\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h),p_\varepsilon ^h)\) defined by scheme (4.3) satisfies

$$\begin{aligned} |\Vert (\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h)|\Vert _1^2+\varepsilon \min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert p_\varepsilon ^h\Vert _0^2 \le (k^2+5k+5)\Vert \mathbf {F}\Vert _{-1}^2, \end{aligned}$$
(4.4)

where k is the number of iteration steps.

Proof

Choosing \((\mathbf {v,\Psi })=(\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h)\) and \(q=p_\varepsilon ^h\) in (4.3), and using (2.11), (2.12), (2.15) and (3.26), we obtain

$$\begin{aligned}&\text{ min }\{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h)\Vert _1^2+\varepsilon \Vert p_\varepsilon ^h\Vert _0^2 \\&\quad \le (\sqrt{2}C_0^2 \text{ max }\{1,\sqrt{2}S_c\} \Vert (\mathbf {u}_{\varepsilon H}^k,\mathbf {B}_{\varepsilon H}^k)\Vert _1^2+\Vert \mathbf {F}\Vert _{-1})\Vert (\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h)\Vert _1 +\varepsilon \Vert p_\varepsilon ^h\Vert _0\Vert p_{\varepsilon H}^k\Vert _0\\&\quad \le (1+\sigma (k+1))\Vert \mathbf {F}\Vert _{-1}\Vert (\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h)\Vert _1 +\varepsilon \Vert p_\varepsilon ^h\Vert _0\Vert p_{\varepsilon H}^k\Vert _0\\&\quad \le \frac{1}{2}\text{ min }\{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h)\Vert _1^2 +\frac{(k+2)^2}{2}(\text{ min }\{R_e^{-1},S_c C_1 R_m^{-1}\})^{-1}\Vert \mathbf {F}\Vert _{-1}^2\\&\quad \quad +\,\frac{1}{2}\varepsilon \Vert p_\varepsilon ^h\Vert _0^2+\frac{1}{2}\varepsilon \Vert p_{\varepsilon H}^k\Vert _0^2.\\ \end{aligned}$$

As a consequence one finds

$$\begin{aligned}&|\Vert (\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h)|\Vert _1^2+\varepsilon \text{ min }\{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert p_\varepsilon ^h\Vert _0^2\\&\quad \le (k+2)^2\Vert \mathbf {F}\Vert _{-1}^2+\varepsilon \text{ min }\{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert p_{\varepsilon H}^k\Vert _0^2\le (k^2+5k+5)\Vert \mathbf {F}\Vert _{-1}^2. \end{aligned}$$

Then the proof is completed.
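The constants in the proof above can be traced in two short steps. This is only a sketch under two assumptions that are not restated in this section: that the parameter \(\sigma \) appearing through (3.26) satisfies \(0<\sigma \le 1\), and that the coarse-mesh pressure satisfies \(\varepsilon \min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert p_{\varepsilon H}^k\Vert _0^2\le (k+1)\Vert \mathbf {F}\Vert _{-1}^2\).

$$\begin{aligned} 1+\sigma (k+1)\le & {} k+2,\\ (k+2)^2+(k+1)= & {} k^2+5k+5. \end{aligned}$$

Under these assumptions, the first line accounts for the factor \((k+2)^2\) produced by Young's inequality, and the second line accounts for the final constant in (4.4).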

Theorem 4.3

Under the conditions of Theorem 3.6, the solution \(((\mathbf {u}_\varepsilon ^h,\mathbf {B}_\varepsilon ^h),p_\varepsilon ^h)\) of the two-level iterative penalty FEM defined by scheme (4.3) satisfies

$$\begin{aligned} |\Vert (\mathbf {u-u}_\varepsilon ^h,\mathbf {B-B}_\varepsilon ^h)|\Vert _1+\Vert p-p_\varepsilon ^h\Vert _0\le C(h+H^2+H\varepsilon +\varepsilon ^{k+1}). \end{aligned}$$
(4.5)

Proof

Subtracting (4.3) from (2.2), we have

$$\begin{aligned}&A_0((\mathbf {u-u}_\varepsilon ^h,\mathbf {B-B}_\varepsilon ^h),(\mathbf {v,\Psi }))+A_1((\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k),(\mathbf {u},\mathbf {B}),(\mathbf {v,\Psi }))\nonumber \\&\quad +\,A_1((\mathbf {u},\mathbf {B}),(\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k),(\mathbf {v,\Psi }))-A_1((\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k),\nonumber \\&(\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k),(\mathbf {v,\Psi }))-d_0((\mathbf {v,\Psi }),p-p_\varepsilon ^h)+d_0((\mathbf {u-u}_\varepsilon ^h,\mathbf {B-B}_\varepsilon ^h),q)\nonumber \\&\quad -\varepsilon (p_\varepsilon ^h,q)+\varepsilon (p_{\varepsilon H}^k,q)=0. \end{aligned}$$
(4.6)

Taking \((\mathbf {v,\Psi })=(r_h\mathbf {u-u}_\varepsilon ^h,R_h\mathbf {B-B}_\varepsilon ^h)\) and \(q=\rho _h p-p_\varepsilon ^h\) in (4.6) and using (2.11), (2.12), (2.13) and (2.14), we obtain

$$\begin{aligned}&\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (r_h\mathbf {u-u}_\varepsilon ^h,R_h\mathbf {B-B}_\varepsilon ^h)\Vert _1^2+\varepsilon \Vert \rho _h p-p_\varepsilon ^h\Vert _0^2\nonumber \\&\quad \le (R_e^{-1}\Vert (r_h\mathbf {u-u},R_h\mathbf {B-B})\Vert _1 +C\sqrt{2}C_0^2\text{ max }\{1,\sqrt{2}S_c\}\Vert (\mathbf {u},\mathbf {B})\Vert _2\nonumber \\&\qquad \times \Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _0+\sqrt{2}C_0^2\text{ max }\{1,\sqrt{2}S_c\}\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _1^2\nonumber \\&\quad \quad +\,\sqrt{d}\Vert \rho _h p-p\Vert _0) \Vert (r_h\mathbf {u-u}_\varepsilon ^h,R_h\mathbf {B-B}_\varepsilon ^h)\Vert _1\nonumber \\&\quad \quad +\,\varepsilon (\Vert \rho _h p-p\Vert _0+\Vert p-p_{\varepsilon H}^k\Vert _0)(\Vert \rho _h p-p\Vert _0+\Vert p-p_{\varepsilon }^h\Vert _0). \end{aligned}$$
(4.7)

Taking \(q=0\) in (4.6), thanks to (2.10), (2.12), (2.13), (2.14) and Assumption A2, one finds

$$\begin{aligned} \beta _0\Vert p-p_\varepsilon ^h\Vert _0\le & {} C\sqrt{2}C_0^2\text{ max }\{1,\sqrt{2}S_c\}\Vert (\mathbf {u},\mathbf {B})\Vert _2 \Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _0\nonumber \\&+\text{ max }\{R_e^{-1},(2+d)S_cR_m^{-1}\}(\Vert (r_h\mathbf {u-u},R_h\mathbf {B-B})\Vert _1\nonumber \\&+\Vert (r_h\mathbf {u-u}_\varepsilon ^h,R_h\mathbf {B-B}_\varepsilon ^h)\Vert _1)\nonumber \\&+\sqrt{2}C_0^2\text{ max }\{1,\sqrt{2}S_c\}\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _1^2. \end{aligned}$$
(4.8)

Substituting (4.8) into (4.7) and applying (2.17), we obtain

$$\begin{aligned}&\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (r_h\mathbf {u-u}_\varepsilon ^h,R_h\mathbf {B-B}_\varepsilon ^h)\Vert _1^2+\varepsilon \Vert \rho _h p-p_\varepsilon ^h\Vert _0^2\\&\quad \le (R_e^{-1}\Vert (r_h\mathbf {u-u},R_h\mathbf {B-B})\Vert _1 +C\sqrt{2}C_0^2\text{ max }\{1,\sqrt{2}S_c\}\Vert (\mathbf {u},\mathbf {B})\Vert _2\\&\qquad \Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _0\\&\quad \quad \left. +\sqrt{2}C_0^2\text{ max }\{1,\sqrt{2}S_c\}\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _1^2+\sqrt{d}\Vert \rho _h p-p\Vert _0\right. \\&\quad \quad +\,\varepsilon (\Vert \rho _h p-p\Vert _0+\Vert p-p_{\varepsilon H}^k\Vert _0)\beta _0^{-1}\text{ max }\{R_e^{-1},(2+d)S_cR_m^{-1}\})\\&\qquad \Vert (r_h\mathbf {u-u}_\varepsilon ^h,R_h\mathbf {B-B}_\varepsilon ^h)\Vert _1\\&\quad \quad +\,\varepsilon (\Vert \rho _h p-p\Vert _0\!+\!\Vert p-p_{\varepsilon H}^k\Vert _0)(\Vert \rho _h p-p\Vert _0\!+\!\beta _0^{-1}(C\sqrt{2}C_0^2\text{ max }\{1,\sqrt{2}S_c\}\Vert (\mathbf {u},\mathbf {B})\Vert _2 \\&\quad \quad \left. \cdot \Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _0+\text{ max }\{R_e^{-1},(2+d)S_cR_m^{-1}\}\Vert (r_h\mathbf {u-u},R_h\mathbf {B-B})\Vert _1\right. \\&\quad \quad +\,\sqrt{2}C_0^2\text{ max }\{1,\sqrt{2}S_c\}\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _1^2))\\&\quad \le \frac{1}{2}\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (r_h\mathbf {u-u}_\varepsilon ^h,R_h\mathbf {B-B}_\varepsilon ^h)\Vert _1^2 +C\left( \Vert (r_h\mathbf {u-u},R_h\mathbf {B-B})\Vert _1\right. \\&\quad \quad \left. +\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _0+\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _1^2+\Vert \rho _h p-p\Vert _0\right. 
\\&\quad \quad +\,\varepsilon (\Vert \rho _h p-p\Vert _0+\Vert p-p_{\varepsilon H}^k\Vert _0))^2+C\varepsilon (\Vert \rho _h p-p\Vert _0+\Vert p-p_{\varepsilon H}^k\Vert _0)(\Vert \rho _h p-p\Vert _0\\&\quad \quad +\,\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _0+\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _1^2+\Vert (r_h\mathbf {u-u},R_h\mathbf {B-B})\Vert _1)\\&\quad \le \frac{1}{2}\min \{R_e^{-1},S_c C_1 R_m^{-1}\}\Vert (r_h\mathbf {u-u}_\varepsilon ^h,R_h\mathbf {B-B}_\varepsilon ^h)\Vert _1^2 +C\left( \Vert (r_h\mathbf {u-u},R_h\mathbf {B-B})\Vert _1\right. \\&\quad \quad +\,\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _0+\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _1^2 +\Vert \rho _h p-p\Vert _0\\&\qquad +\varepsilon (\Vert \rho _h p-p\Vert _0+\Vert p-p_{\varepsilon H}^k\Vert _0))^2. \end{aligned}$$

Using Theorems 3.8 and 3.9, we obtain

$$\begin{aligned}&|\Vert (r_h\mathbf {u-u}_\varepsilon ^h,R_h\mathbf {B-B}_\varepsilon ^h)|\Vert _1\le C(\Vert (r_h\mathbf {u-u},R_h\mathbf {B-B})\Vert _1 +\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _0\\&\quad \quad +\,\Vert (\mathbf {u-u}_{\varepsilon H}^k,\mathbf {B-B}_{\varepsilon H}^k)\Vert _1^2+\Vert \rho _h p-p\Vert _0+\varepsilon (\Vert \rho _h p-p\Vert _0+\Vert p-p_{\varepsilon H}^k\Vert _0))\\&\quad \le C(h+(H^2+H\varepsilon +\varepsilon ^{k+1})+(H+\varepsilon ^{k+1})^2+\varepsilon (h+H+\varepsilon ^{k+1}))\\&\quad \le C(h+H^2+H\varepsilon +\varepsilon ^{k+1}). \end{aligned}$$

By the triangle inequality and Assumption A1, it holds that

$$\begin{aligned} |\Vert (\mathbf {u-u}_\varepsilon ^h,\mathbf {B-B}_\varepsilon ^h)|\Vert _1\le & {} C(|\Vert (r_h\mathbf {u-u}_\varepsilon ^h,R_h\mathbf {B-B}_\varepsilon ^h)|\Vert _1+|\Vert (r_h\mathbf {u-u},R_h\mathbf {B-B})|\Vert _1) \nonumber \\\le & {} C(h+H^2+H\varepsilon +\varepsilon ^{k+1}). \end{aligned}$$
(4.9)

From (4.8) and (4.9), we have

$$\begin{aligned} \Vert p-p_\varepsilon ^h\Vert _0\le C(h+H^2+H\varepsilon +\varepsilon ^{k+1}). \end{aligned}$$
(4.10)

We finish the proof by combining (4.9) with (4.10).

Remark 4.4

If we take \(\varepsilon =\mathcal {O}(H)\) and \(h=\mathcal {O}(H^2)\) in the two-level iterative penalty FEM, we obtain the same order of convergence as the standard Galerkin FEM; namely, it holds that

$$\begin{aligned} |\Vert (\mathbf {u-u}_\varepsilon ^h,\mathbf {B-B}_\varepsilon ^h)|\Vert _1+\Vert p-p_\varepsilon ^h\Vert _0\le C h. \end{aligned}$$
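The scaling in Remark 4.4 can be checked term by term against (4.5). The short computation below is a sketch that assumes \(k\ge 1\) and \(H\le 1\), and writes \(\varepsilon =c_1H\) and \(h=c_2H^2\) with generic positive constants \(c_1,c_2\) (names introduced here only for illustration).

$$\begin{aligned} H^2=c_2^{-1}h,\qquad H\varepsilon =c_1H^2=c_1c_2^{-1}h,\qquad \varepsilon ^{k+1}=c_1^{k+1}H^{k+1}\le c_1^{k+1}H^2=c_1^{k+1}c_2^{-1}h. \end{aligned}$$

Hence each term on the right-hand side of (4.5) is bounded by a constant multiple of h.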

5 Numerical analysis

In this section, we present some numerical results of the one-level and two-level iterative penalty FEMs for the incompressible MHD equations. The software FreeFEM++ is used in these numerical experiments (see Hecht et al. 2015). The UMFPACK routine is applied to solve the linear systems arising from the discrete algebraic equations. The mesh consists of triangular elements that are obtained by dividing \(\Omega \) into sub-squares of equal size and then drawing the diagonal in each sub-square. The \((P_1b,P_1,P_1b)\) finite element pair is used, and the iterative tolerance \(10^{-5}\) is adopted in all numerical tests.
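The mesh construction described above can be sketched in a few lines. This is an illustrative stand-alone sketch and not the FreeFEM++ script used for the experiments; the function names `unit_square_mesh` and `triangle_area`, the direction of the diagonal and the node numbering are assumptions made here.

```python
def unit_square_mesh(n):
    """Uniform triangulation of [0,1]^2: n x n sub-squares, each cut by one diagonal.

    Returns (nodes, triangles), where nodes is a list of (x, y) tuples and
    triangles is a list of index triples into nodes.
    """
    nodes = [(i / n, j / n) for j in range(n + 1) for i in range(n + 1)]
    triangles = []
    for j in range(n):
        for i in range(n):
            sw = j * (n + 1) + i  # south-west corner of the sub-square
            se, nw, ne = sw + 1, sw + n + 1, sw + n + 2
            # Diagonal drawn from south-west to north-east (assumed direction)
            triangles.append((sw, se, ne))
            triangles.append((sw, ne, nw))
    return nodes, triangles


def triangle_area(nodes, tri):
    """Signed area of a triangle (positive for counter-clockwise ordering)."""
    (x0, y0), (x1, y1), (x2, y2) = (nodes[t] for t in tri)
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))
```

In this sketch, a fine mesh with \(1/h=9\) and its coarse companion with \(H=h^{1/2}\) would correspond to `unit_square_mesh(9)` and `unit_square_mesh(3)`.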

Table 1 One-level iterative penalty FEM for incompressible MHD problem
Table 2 Two-level iterative penalty FEM for incompressible MHD problem with \(H=h^{1/2}\)

The example is taken from Tao and Zhang (2015). The steady incompressible MHD equations are posed on the convex domain \(\Omega =[0,1]^2\). The boundary and initial conditions and the right-hand side functions \(\mathbf{{f}}\) and \(\mathbf{{g}}\) are selected such that the exact solution is given by

$$\begin{aligned} u_1= & {} x^2(x-1)^2y(y-1)(2y-1);\quad u_2=-y^2(y-1)^2x(x-1)(2x-1);\\ p= & {} (2x-1)(2y-1);\quad B_1=\sin (\pi x)\cos (\pi y);\quad B_2=-\sin (\pi y)\cos (\pi x); \end{aligned}$$

where the components of \(\mathbf{{u}}\) and \(\mathbf{{B}}\) are denoted by \((u_1,u_2)\) and \((B_1,B_2)\), respectively. First, we choose the parameters \(R_e=R_m=S_c=1\) and \(\varepsilon =0.001\). In all numerical tests, we use the mesh sizes \(1/h=9,16,25,36,49,64,81,100\) with \(H=h^{\frac{1}{2}}\). The relative errors for different numbers of iterations are compared in Tables 1 and 2 for the one-level and two-level iterative penalty FEMs, respectively. We then show the relative errors between the exact solution and the numerical solutions obtained from the one-level and two-level iterative penalty FEMs in Tables 3 and 4. As observed from Tables 3 and 4, the errors \(\frac{\Vert \mathbf {u-u}_h\Vert _1}{\Vert \mathbf {u}\Vert _1}\), \(\frac{\Vert \mathbf {B-B}_h\Vert _1}{\Vert \mathbf {B}\Vert _1}\), \(\frac{\Vert \mathbf {u-u}_h\Vert _0}{\Vert \mathbf {u}\Vert _0}\) and \(\frac{\Vert \mathbf {B-B}_h\Vert _0}{\Vert \mathbf {B}\Vert _0}\) decrease as the mesh is refined. In all tables, “Iteration” denotes the number of iterations in Step 2 of the corresponding method. From these tables, we draw the following observations and conclusions:

Table 3 One-level iterative penalty FEM for incompressible MHD problem
Table 4 Two-level iterative penalty FEM for incompressible MHD problem
  • Based on Tables 1 and 2, the errors of the velocity and magnetic field in the \(\mathbf {H}^1\)- and \(\mathbf {L}^2\)-norms decrease as the number of iterations increases in both the one-level and two-level iterative penalty methods. In particular, the results for \(k=2\) are as good as those for \(k=3\), 4 and 5. Thus we choose \(k=2\) in the following numerical tests.

  • From Table 3, we can see that the numerical convergence orders of the one-level iterative penalty FEM are optimal and agree with those predicted by the theoretical analysis in Theorems 3.8 and 3.9, namely, \(\mathcal {O}(h)\) for the velocity and magnetic field in the \(\mathbf {H}^1\)-norm and the pressure in the \(L^2\)-norm, and \(\mathcal {O}(h^2)\) for the velocity and magnetic field in the \(\mathbf {L}^2\)-norm.

  • From Table 4, the two-level iterative penalty FEM achieves the optimal numerical convergence order \(\mathcal {O}(h)\) for the velocity and magnetic field in the \(\mathbf {H}^1\)-norm and the pressure in the \(L^2\)-norm, as proven in Theorem 4.3. Furthermore, the two-level iterative penalty FEM reaches the optimal order \(\mathcal {O}(h^2)\) for the velocity and magnetic field in the \(\mathbf {L}^2\)-norm.

  • By comparing Tables 3 and 4, we can see that the two-level iterative penalty FEM takes significantly less CPU time than the one-level iterative penalty FEM, with essentially the same approximation results.
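The exact solution chosen for this test can be sanity-checked numerically. The sketch below uses central finite differences, which is an assumption made only for this illustration (the step size `eps_fd` and all function names are hypothetical and unrelated to the FreeFEM++ computations). It checks that the velocity and the magnetic field are divergence-free at a few interior points and that the boundary conditions (1.2) hold on the edges of \([0,1]^2\); in two dimensions the condition \(\mathbf {n}\times \text{curl}\,\mathbf {B}=0\) reduces to the scalar curl vanishing on the boundary.

```python
import math


def u(x, y):
    """Exact velocity (u1, u2) of the test problem."""
    return (x**2 * (x - 1)**2 * y * (y - 1) * (2 * y - 1),
            -y**2 * (y - 1)**2 * x * (x - 1) * (2 * x - 1))


def B(x, y):
    """Exact magnetic field (B1, B2) of the test problem."""
    return (math.sin(math.pi * x) * math.cos(math.pi * y),
            -math.sin(math.pi * y) * math.cos(math.pi * x))


def divergence(field, x, y, eps_fd=1e-5):
    """Central-difference approximation of div(field) at (x, y)."""
    dfx = (field(x + eps_fd, y)[0] - field(x - eps_fd, y)[0]) / (2 * eps_fd)
    dfy = (field(x, y + eps_fd)[1] - field(x, y - eps_fd)[1]) / (2 * eps_fd)
    return dfx + dfy


def curl(field, x, y, eps_fd=1e-5):
    """Central-difference approximation of the scalar 2D curl at (x, y)."""
    d2dx = (field(x + eps_fd, y)[1] - field(x - eps_fd, y)[1]) / (2 * eps_fd)
    d1dy = (field(x, y + eps_fd)[0] - field(x, y - eps_fd)[0]) / (2 * eps_fd)
    return d2dx - d1dy


interior = [(0.2, 0.3), (0.5, 0.5), (0.7, 0.1), (0.9, 0.8)]
max_div_u = max(abs(divergence(u, x, y)) for x, y in interior)
max_div_B = max(abs(divergence(B, x, y)) for x, y in interior)

# Sample points on the four edges of the unit square
ts = [i / 10 for i in range(11)]
vertical = [(x, t) for x in (0.0, 1.0) for t in ts]    # normal is (+-1, 0)
horizontal = [(t, y) for y in (0.0, 1.0) for t in ts]  # normal is (0, +-1)
edges = vertical + horizontal

max_u_bdry = max(max(abs(c) for c in u(x, y)) for x, y in edges)
max_Bn_bdry = max([abs(B(x, y)[0]) for x, y in vertical]
                  + [abs(B(x, y)[1]) for x, y in horizontal])
max_curl_B_bdry = max(abs(curl(B, x, y)) for x, y in edges)
```

All five computed maxima should be zero up to rounding and finite-difference error, which is what one expects from a manufactured solution compatible with (1.1) and (1.2).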

6 Conclusion

In this paper, we present the theoretical analysis of the one-level and two-level iterative penalty FEMs for the steady incompressible MHD problem. Stability and error estimates of these numerical methods are obtained. Numerical experiments show that the one-level and two-level iterative penalty FEMs are valid for solving the incompressible MHD problem, and the numerical results are consistent with the theoretical analysis. In future work, we will consider extending the Stokes iteration on the fine mesh to other linearization methods, such as the Oseen and Newton iterations, combining the present methods with stabilization techniques such as the subgrid method or the variational multiscale method, and solving MHD problems with large Reynolds numbers.