1 Introduction

This work is concerned with the finite element approximation of the power-law Stokes flow governed by the partial differential equations

$$\begin{aligned} -\nu \,\mathrm{div }(|{\varvec{D}}({\varvec{u}})|^{r-2}{\varvec{D}}({\varvec{u}}))+\nabla p= & {} {\varvec{f}}\quad \text {in}\quad \Omega , \nonumber \\ \mathrm{div }{\varvec{u}}= & {} 0\quad \text {in}\quad \Omega ,\nonumber \\ {\varvec{u}}= & {} 0\quad \text {on}\quad \partial \Omega , \end{aligned}$$
(1.1)

where \(\Omega \) is the flow region, a bounded domain in \(\mathbb {R}^d\) with \(d=2,3\), and \(\partial \Omega \) is its boundary. The motion of the incompressible fluid is described by the velocity \({\varvec{u}}({\varvec{x}})\) and the pressure \(p({\varvec{x}})\). The external force per unit volume is \({\varvec{f}}\), while the positive parameter \(\nu \) is the viscosity of the fluid. The symmetric part of the velocity gradient is given by

$$\begin{aligned} {\varvec{D}}({\varvec{u}})=\frac{1}{2}(\nabla {\varvec{u}}+(\nabla {\varvec{u}})^{{\varvec{\small T}}}). \end{aligned}$$

\(|\cdot |\) denotes the Euclidean norm (that is, \(|{\varvec{u}}|^2={\varvec{u}}\cdot {\varvec{u}}\)) for a vector and the Frobenius norm for a tensor of order two. We assume throughout that \(r>1\). For \(r=2\), (1.1) reduces to the classical Stokes equations, a case which is well documented in the literature (see [13] for a good description and analysis of a multitude of approximations). However, to the best of our knowledge, very little work has been devoted to (1.1), even though one of the first works dedicated to such problems goes back to the seventies [4], where the first error analysis was proposed together with numerical simulations. Notable contributions to the convergence analysis of the finite element discretization have been made in [59]. Owing to the severe nonlinearity, combined with the pressure and the incompressibility condition, the numerical analysis of (1.1) is complicated. Problem (1.1) can be regarded as a mathematical model of non-Newtonian flow (see [10, 11]). The r-Laplacian also arises in many other situations: in chemical engineering (see Boger and co-workers [12]), in the design of extrusion dies (Liu and co-workers [13]), in the study of the lithosphere (England [14]) and other geophysical applications (Sonder [15]), and in meteorology (see [16]).
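As a concrete illustration of the constitutive term in (1.1), the following minimal Python sketch evaluates \({\varvec{D}}({\varvec{u}})\), its Frobenius norm and the tensor \(\nu |{\varvec{D}}({\varvec{u}})|^{r-2}{\varvec{D}}({\varvec{u}})\) for a given velocity gradient. The function names and the sample gradient (a simple shear) are illustrative assumptions and are not taken from the cited works. Returning the zero tensor when \({\varvec{D}}({\varvec{u}})=0\) reflects the fact that \(|{\varvec{D}}|^{r-2}{\varvec{D}}\) has norm \(|{\varvec{D}}|^{r-1}\), which vanishes with \({\varvec{D}}\) since \(r>1\).

```python
import math


def sym_grad(grad_u):
    """Return D(u) = (grad_u + grad_u^T) / 2 for a square matrix (nested lists)."""
    d = len(grad_u)
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(d)] for i in range(d)]


def frobenius(t):
    """Frobenius norm of a second-order tensor given as nested lists."""
    return math.sqrt(sum(x * x for row in t for x in row))


def power_law_stress(grad_u, nu, r):
    """Return nu * |D(u)|^(r-2) * D(u), with the zero tensor when D(u) = 0."""
    if r <= 1:
        raise ValueError("the power-law index r must satisfy r > 1")
    D = sym_grad(grad_u)
    norm_D = frobenius(D)
    if norm_D == 0.0:
        return [[0.0] * len(D) for _ in D]
    scale = nu * norm_D ** (r - 2)
    return [[scale * x for x in row] for row in D]


# Hypothetical sample: velocity gradient of a simple shear flow in 2D.
grad_u = [[0.0, 1.0], [0.0, 0.0]]
for r in (1.5, 2.0, 3.0):
    print(r, power_law_stress(grad_u, nu=1.0, r=r))
```

For \(r=2\) the scaling factor equals \(\nu \), so the tensor is \(\nu {\varvec{D}}({\varvec{u}})\) and the linear Stokes case is recovered.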

The mixed finite element method has been the framework used by many researchers to study the power-law Stokes flow. Among these studies, one can single out the works of Lefton and Wei [17], Borggaard, Iliescu and Roop [18], Barrett and Liu [8], and Baranger and Najib [5, 6], to mention just a few. In [5, 6], convergence of the finite element solution is obtained for two-dimensional flow with \(r\in (1,2)\), while Barrett and Liu [8] generalized that result to \(r>1\). No numerical experiments are reported in these works. The results obtained in [5, 6, 8] are based on the mixed formulation derived from (1.1), and the theory of Babuska-Brezzi is the main tool for obtaining existence and uniqueness of the solutions. In this work, we instead “eliminate” the constraint \(\mathrm{div }{\varvec{u}}=0\) by introducing a penalty parameter \(\epsilon \) that we later let tend to zero. The reasons for adopting this approach are well known; among them are its computational efficiency, the simplification it brings to the problem, and its solid mathematical foundation. The reader can consult, among many references, [1, 17–25] to see the power of the method.

Lefton and Wei in [17, 19] considered the following nonlinear penalty method

$$\begin{aligned} -\nu \mathrm{div }(|{\varvec{D}}({\varvec{u}}^\epsilon )|^{r-2}{\varvec{D}}({\varvec{u}}^\epsilon ))+\nabla p^\epsilon= & {} {\varvec{f}}~~\text {in}~~\Omega ,\nonumber \\ |\mathrm{div }{\varvec{u}}^\epsilon |^{r-2}\,\mathrm{div }{\varvec{u}}^\epsilon +\epsilon p^\epsilon= & {} 0~~\text {in}~~\Omega ,\nonumber \\ {\varvec{u}}^\epsilon= & {} 0~~\text {on}~~\partial \Omega , \end{aligned}$$
(1.2)

where \(0<\epsilon \ll 1\), and showed that

$$\begin{aligned} \Vert {\varvec{u}}-{\varvec{u}}^\epsilon \Vert _{1,r} \le C\left\{ \begin{array}{ll} \epsilon ^{\frac{1}{(r-1)^2}}&{}\quad \text {if}\quad r\ge 2 \\ \epsilon ^{\frac{1}{(r-1)(3-r)}}&{}\quad \text {if}\quad 1<r\le 2 \end{array}\right. , \end{aligned}$$
(1.3)

where C is a generic positive constant depending on the size of \({\varvec{f}}\) and on the domain \(\Omega \). Moreover, they obtained the following convergence result for \(r> 1\):

$$\begin{aligned} \Vert {\varvec{u}}-{\varvec{u}}^\epsilon _h\Vert _{1,r}=O(h^\alpha \epsilon ^{-\beta })~, \end{aligned}$$
(1.4)

where \(\alpha ,\beta \) are positive numbers, and \({\varvec{u}}^\epsilon _h\) is the finite element approximation of \({\varvec{u}}^\epsilon \). Later, in a work supported by numerical experiments, Borggaard, Iliescu and Roop [18] obtained a similar result with a better convergence rate. To our knowledge, these are the first results in the literature dealing with penalized finite element approximations of the power-law Stokes problem. For fixed h, the bound (1.4) deteriorates as \(\epsilon \rightarrow 0\); these convergence results are therefore not uniform in \(\epsilon \), which runs against the very purpose of a penalty method. In this work, we consider instead the following penalized equations

$$\begin{aligned} -\nu \mathrm{div }(|{\varvec{D}}({\varvec{u}}^\epsilon )|^{r-2}{\varvec{D}}({\varvec{u}}^\epsilon ))+\nabla p^\epsilon= & {} {\varvec{f}}~~\text {in}~~\Omega ,\nonumber \\ \mathrm{div }{\varvec{u}}^\epsilon +\epsilon p^\epsilon= & {} 0~~\text {in}~~\Omega ,\nonumber \\ {\varvec{u}}^\epsilon= & {} 0~~\text {on}~~\partial \Omega . \end{aligned}$$
(1.5)
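The lack of uniformity in \(\epsilon \) of a bound of the form (1.4) can be made concrete with a few numbers. The sketch below compares a product bound \(h^\alpha \epsilon ^{-\beta }\) with a sum \(h^\alpha +\epsilon ^\beta \) for a fixed mesh size. The exponents and the values of h and \(\epsilon \) are hypothetical, chosen only for illustration, and all constants are set to one.

```python
def product_bound(h, eps, alpha, beta):
    """Bound of the form h^alpha * eps^(-beta), as in (1.4) with constant one."""
    return h ** alpha * eps ** (-beta)


def sum_bound(h, eps, alpha, beta):
    """Bound of the form h^alpha + eps^beta, with constants set to one."""
    return h ** alpha + eps ** beta


# Hypothetical values: fixed mesh size, decreasing penalty parameter.
h, alpha, beta = 0.01, 1.0, 0.5
for eps in (1e-2, 1e-4, 1e-8):
    print(eps, product_bound(h, eps, alpha, beta), sum_bound(h, eps, alpha, beta))
```

With h fixed, the product grows without bound as \(\epsilon \) decreases, whereas the sum decreases towards \(h^\alpha \).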

One of our aims is to show convergence of the penalized finite element formulation that is uniform with respect to \(\epsilon \). It should be mentioned that Borggaard, Iliescu and Roop [18] analyzed (1.5) for the case \(r\ge 2\) and derived a convergence result of the form (1.4) previously obtained in [17]. Our convergence result differs in that we obtain an error estimate of the form O\((h^\alpha )\)+O\((\epsilon ^\beta )\), where \(\alpha \) and \(\beta \) are two positive constants. Such an estimate is valid for all \(\epsilon \), and numerical simulations confirm this (see particularly Table 3). We obtain the a priori error estimate without any extra assumptions, though the rate of convergence is far from optimal. The key difference between our analysis and those of [17, 18] is our careful application of the Babuska-Brezzi theory for mixed variational problems. The second contribution of this work is the numerical resolution of (1.5). Since we are dealing with nonlinear equations, an iterative process such as Newton’s method or a fixed-point iteration can be used (see (1.2)), but we instead adopt a strategy pioneered by R. Glowinski (see [26, 27]), consisting of the following steps after the finite element formulation of (1.5):

Step 1. Associate with the weak formulation obtained from (1.5) an initial value problem.

Step 2. Use operator splitting to discretize in time the equations obtained in Step 1.

The idea behind this algorithm is to recover the solution of the steady equations from the solution of the evolution equations as time goes to infinity. Once Step 1 is achieved with the desired long-term behavior, any time discretization can be used to perform Step 2, since in the long run we will obtain a steady solution. One advantage of this approach is that a linear scheme can be formulated for a nonlinear problem. Another important aspect of the algorithm is the uniqueness of the solution of the steady problem obtained as \(t\rightarrow \infty \). Indeed, without uniqueness the notion of convergence is vague; nevertheless, E.J. Dean and R. Glowinski [28] managed to approximate solutions using this procedure even when uniqueness for the steady problem is lacking.
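The principle of recovering a steady nonlinear solution from a linear time-stepping scheme can be seen on a scalar toy problem. The sketch below is only an analogue under stated assumptions and is not the algorithm analyzed in this work: it replaces (1.5) by the scalar equation \(|u|^{r-2}u=f\), embeds it in the evolution equation \(u'(t)+|u|^{r-2}u=f\), and freezes the nonlinear coefficient at the previous time level so that each step is linear in the new unknown. The function name, the time step, the initial value and the data are hypothetical.

```python
def steady_state_by_time_stepping(f, r, tau=1.0, u0=1.0, steps=500):
    """Semi-implicit pseudo-time stepping for the scalar model |u|^(r-2) u = f.

    Each step solves the linear equation
        (u_new - u_old) / tau + |u_old|^(r-2) * u_new = f
    for u_new. The start value u0 must be nonzero when r < 2, since the
    coefficient |u|^(r-2) is singular at u = 0 in that case.
    """
    u = u0
    for _ in range(steps):
        coeff = abs(u) ** (r - 2)          # nonlinearity frozen at the old iterate
        u = (u + tau * f) / (1.0 + tau * coeff)
    return u


# Hypothetical data; the exact steady states are f^(1/(r-1)) for f > 0.
print(steady_state_by_time_stepping(f=4.0, r=3.0), 4.0 ** (1.0 / 2.0))
print(steady_state_by_time_stepping(f=2.0, r=1.5), 2.0 ** (1.0 / 0.5))
```

In this toy setting, any fixed point of the iteration satisfies \(|u|^{r-2}u=f\), so the limit of the time-stepping, when it exists, is a steady solution.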

In this work, we discretize in time with a semi-implicit scheme and establish conditions for the convergence of the resulting algorithms. The features and performance of these algorithms are illustrated through a number of numerical examples. The rest of this work is organized as follows. In Sect. 2, we introduce notation, recall preliminary results and state our penalized weak formulation. In Sect. 3, we introduce and study the mixed penalized approximation method and discuss the error estimates. Section 4 contains the description and analysis of the algorithm used to compute the solution, as well as some computational results. Our findings and future projects are reported in Sect. 5.

2 Preliminaries and Variational Formulations

In this section, we introduce notation and some results that will be used throughout our work. We also state various weak formulations and recall some existence results.

2.1 Notations and Preliminaries

\(\Omega \) is a bounded domain in \(\mathbb {R}^d\) with a regular boundary \(\partial \Omega =\Gamma \). The Lebesgue spaces are denoted as usual by \(L^r(\Omega )\), \(1\le r\le \infty \), with norms \(\Vert \cdot \Vert _{L^r}\) (except the \(L^2(\Omega )\)-norm, which is denoted by \(\Vert \cdot \Vert \)). For any non-negative integer m and real number \(r\ge 1\), we consider the classical Sobolev space [29]

$$\begin{aligned} W^{m,r}(\Omega )=\{v\in L^r(\Omega );~\partial ^\alpha \,v\in L^r(\Omega )~~\text {for all}\quad \,|\alpha |\le m\}\,, \end{aligned}$$

equipped with the seminorm

$$\begin{aligned} |v|_{m,r}=\left\{ \sum _{|\alpha |=m}\int _{\Omega }|\partial ^\alpha \,v|^r\text {dx}\right\} ^{1/r}, \end{aligned}$$
(2.1)

and norm

$$\begin{aligned} \Vert v\Vert _{m,r}=\left\{ \sum _{0\le k\le m}|v|^{r}_{k,r}\right\} ^{1/r}, \end{aligned}$$
(2.2)

with the usual extension when \(r=\infty \). When \(r=2\), \(W^{m,r}(\Omega )\) is the Hilbert space \(H^m(\Omega )\) with the scalar product

$$\begin{aligned} ((v,w))_m=\sum _{|\alpha |\le m}(\partial ^\alpha v,\partial ^\alpha w)~. \end{aligned}$$

Here \(\partial ^\alpha \) stands for the derivative in the sense of distributions, while \(\alpha =(\alpha _1,\cdots ,\alpha _d)\) denotes a multi-index of length \(|\alpha |=\alpha _1+\cdots +\alpha _d\). \(W_0^{m,r}(\Omega )\) consists of functions of \(W^{m,r}(\Omega )\) that vanish on \(\partial \Omega \). Throughout this work, boldface characters denote vector quantities, and \({\varvec{H}}^1(\Omega )\equiv H^1(\Omega )^d\) and \({\varvec{L}}^2(\Omega )=L^2(\Omega )^d\). We will use the Poincaré-Friedrichs inequality, which states that there exists a positive constant C such that

$$\begin{aligned} \int _\Omega |{\varvec{v}}|^r\text {dx}\le C\int _\Omega |\nabla {\varvec{v}}|^r\text {dx}~~\text{ for } \text{ all }~~{\varvec{v}}\in {\varvec{W}}^{m,r}_{0}(\Omega ), \end{aligned}$$
(2.3)

which implies that the seminorm (2.1) defines a norm which is equivalent to the norm (2.2). For \(1<r<\infty \), let \(r'\) denote the conjugate exponent of r, given by the relation \(r'=r(r-1)^{-1}\). \(W^{-m,r'}(\Omega )\) denotes the topological dual space of \(W_0^{m,r}(\Omega )\), and \(\Vert \cdot \Vert _{-m,r'}\) its norm. \(\langle \cdot ,\cdot \rangle \) denotes the duality pairing between \(W_0^{m,r}(\Omega )\) and \(W^{-m,r'}(\Omega )\) as well as between \(L^r(\Omega )\) and \(L^{r'}(\Omega )\). Also of importance here is Korn’s inequality, which reads: there exists a positive constant C such that

$$\begin{aligned} \int _\Omega |\nabla {\varvec{v}}|^r\text {dx}\le C\int _\Omega |{\varvec{D}}({\varvec{v}})|^r\text {dx}~~\text{ for } \text{ all }~~{\varvec{v}}\in {\varvec{W}}^{m,r}_{0}(\Omega ). \end{aligned}$$
(2.4)

Since we will be working with the pressure, it is convenient to introduce the following space

$$\begin{aligned} L_0^r(\Omega )=\left\{ q\in L^r(\Omega )~,~~ (q,1)=0\right\} . \end{aligned}$$

2.2 Mixed Variational Formulations

In this subsection, we formulate various equivalent variational models associated with problem (1.1) and its penalized form (1.5). We also indicate how existence and uniqueness of solutions are obtained.

Starting with (1.1), its variational formulation is classical and proceeds as follows. We weaken (1.1)\(_1\) by Green’s formula. It follows from the nonlinear term in (1.1)\(_1\) that the velocity \({\varvec{u}}\) and the test function \({\varvec{v}}\) must belong to \({\varvec{W}}_0^{1,r}(\Omega )\); the pressure must then belong to \(L^{r'}(\Omega )\). Also, from (1.1)\(_2\), since \(\mathrm{div }{\varvec{u}}\) is an element of \(L^r(\Omega )\), the test function q must be in \(L^{r'}(\Omega )\). Before going further, it is important to notice that for all \(r\ge 2\), the space \({\varvec{W}}_0^{1,r}(\Omega )\) is continuously embedded in \({\varvec{L}}^2(\Omega )\). This embedding does not hold for \(1<r<2\), except when \(\displaystyle \frac{1}{r}-\frac{1}{d}\le \frac{1}{2}\) (see [25]). Thus, we introduce the following spaces

$$\begin{aligned} {\varvec{X}}={\varvec{W}}_0^{1,r}(\Omega )\cap {\varvec{L}}^2(\Omega )~~,~~~ Q=L_0^{r'}(\Omega ), \end{aligned}$$

where the zero mean-value condition is added because the pressure is only defined by (1.1) up to an additive constant. The weak formulation reads: Find \(({\varvec{u}},p)\in {\varvec{X}}\times Q\) such that

$$\begin{aligned} \begin{array}{llll} \langle A({\varvec{u}}),{\varvec{v}}\rangle -b({\varvec{v}},p)&{}=&{} \langle {\varvec{f}},{\varvec{v}}\rangle ~&{}~\text{ for } \text{ all }~~{\varvec{v}}\in {\varvec{X}}\\ b({\varvec{u}},q)&{}=&{}0~&{}~\text{ for } \text{ all }~~q\in Q, \end{array} \end{aligned}$$
(2.5)

where \({\varvec{f}}\in {\varvec{X}}^*={\varvec{W}}^{-1,r'}(\Omega )\), the operator \(A: {\varvec{W}}_0^{1,r}(\Omega ) \rightarrow {\varvec{W}}^{-1,r'}(\Omega )\) is given by

$$\begin{aligned} \langle A({\varvec{u}}),{\varvec{v}}\rangle =\nu (|{\varvec{D}}({\varvec{u}})|^{r-2}{\varvec{D}}({\varvec{u}}),{\varvec{D}}({\varvec{v}})), \end{aligned}$$

and \(b:{\varvec{X}}\times Q\rightarrow \mathbb {R}\) is given by \(b({\varvec{u}},p)=(p,\mathrm{div }{\varvec{u}})\).

We also know that \({\varvec{u}}\) is the solution of the optimization problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \text{ Find }~{\varvec{u}}\in {\varvec{X}}_{\mathrm{div }}(\Omega )~\text{ such } \text{ that }\\ J({\varvec{u}})\le J({\varvec{w}})~~\text{ for } \text{ all }~~{\varvec{w}}\in {\varvec{X}}_{\mathrm{div }}(\Omega ), \end{array}\right. } \end{aligned}$$
(2.6)

where

$$\begin{aligned} J({\varvec{w}})=\displaystyle \frac{\nu }{r}\Vert {\varvec{D}}({\varvec{w}})\Vert ^{r}_{L^r} -\langle {\varvec{f}},{\varvec{w}}\rangle \quad \text {and}\quad {\varvec{X}}_{\mathrm{div }}(\Omega )=\{{\varvec{w}}\in {\varvec{X}}\,,\quad \mathrm{div }{\varvec{w}}|_\Omega =0\}. \end{aligned}$$

One readily observes that (2.6) is equivalent to

$$\begin{aligned} {\left\{ \begin{array}{ll} \text{ Find }~~{\varvec{u}}\in {\varvec{X}}_{\mathrm{div }}(\Omega )~\text{ such } \text{ that }\\ \langle A({\varvec{u}}),{\varvec{w}}\rangle =\langle {\varvec{f}},{\varvec{w}}\rangle ~~\text{ for } \text{ all }~~{\varvec{w}}\in {\varvec{X}}_{\mathrm{div }}(\Omega ). \end{array}\right. } \end{aligned}$$
(2.7)
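The link between the minimization problem (2.6) and the variational equation (2.7) can be checked on a scalar analogue. In the sketch below, the functional \(J(w)=\frac{\nu }{r}|w|^r-fw\) on the real line stands in for the functional of (2.6), and its stationarity condition \(\nu |w|^{r-2}w=f\) stands in for (2.7). The function names and the numerical values are hypothetical, and the scalar setting is a simplifying assumption.

```python
import math


def J(w, nu, r, f):
    """Scalar analogue of the energy in (2.6): (nu / r) |w|^r - f w."""
    return nu / r * abs(w) ** r - f * w


def euler_lagrange_residual(w, nu, r, f):
    """Residual of the scalar analogue of (2.7): nu |w|^(r-2) w - f (w != 0)."""
    return nu * abs(w) ** (r - 2) * w - f


def scalar_minimizer(nu, r, f):
    """Solve nu |w|^(r-2) w = f in closed form (assumes nu > 0, r > 1)."""
    return math.copysign((abs(f) / nu) ** (1.0 / (r - 1)), f)


# Hypothetical data.
nu, r, f = 2.0, 3.0, 8.0
w_star = scalar_minimizer(nu, r, f)
print(w_star, euler_lagrange_residual(w_star, nu, r, f))
print(J(w_star, nu, r, f), J(w_star - 0.1, nu, r, f), J(w_star + 0.1, nu, r, f))
```

Since J is strictly convex for \(r>1\), the point at which the residual vanishes is its unique minimizer, which mirrors the equivalence stated above.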

To show the existence and uniqueness of the solution of (2.5), we recall below some classical results.

Lemma 2.1

(see [25]) Suppose \(r\in (1,\infty )\) and let \(\Omega \subset \mathbb {R}^d\), \(d\ge 1\), be bounded. Then the embedding \({\varvec{W}}^{1,r}(\Omega )\subset {\varvec{L}}^2(\Omega )\) is compact for \(r\in (2d/(d+2),\infty )\). Moreover, for every \(r\in [2d/(d+2),\infty )\), there is a constant \(C=C(\Omega ,d,r)\) such that for all \({\varvec{v}}\in {\varvec{X}}\),

$$\begin{aligned} C\Vert {\varvec{v}}\Vert \le \Vert {\varvec{v}}\Vert _{1,r}. \end{aligned}$$

Next, the following lemma is proved in [4, 8, 30].

Lemma 2.2

There exist positive constants \(\alpha =\alpha (r,d)\) and \(\beta =\beta (r,d)\), such that the following inequalities hold:

If \(r\in (1,2)\), then for all \({\varvec{u}},{\varvec{v}}\) in \({\varvec{X}}\),

$$\begin{aligned} \alpha \Vert {\varvec{u}}-{\varvec{v}}\Vert _{1,r}^{2}\le & {} \langle A({\varvec{u}})-A({\varvec{v}}),{\varvec{u}}-{\varvec{v}}\rangle \left( \Vert {\varvec{u}}\Vert _{1,r}+\Vert {\varvec{v}}\Vert _{1,r}\right) ^{2-r},\\ \Vert A({\varvec{u}})-A({\varvec{v}})\Vert _{-1,r'}\le & {} \beta \Vert {\varvec{u}}-{\varvec{v}}\Vert _{1,r}^{r-1}. \end{aligned}$$

If \(r\in (2,\infty )\), then for all \({\varvec{u}},{\varvec{v}}\) in \({\varvec{X}}\),

$$\begin{aligned} \alpha \Vert {\varvec{u}}-{\varvec{v}}\Vert _{1,r}^{r}\le & {} \langle A({\varvec{u}})-A({\varvec{v}}),{\varvec{u}}-{\varvec{v}}\rangle ,\\ \Vert A({\varvec{u}})-A({\varvec{v}})\Vert _{-1,r'}\le & {} \beta \Vert {\varvec{u}}-{\varvec{v}}\Vert _{1,r}\left( \Vert {\varvec{u}}\Vert _{1,r}+\Vert {\varvec{v}}\Vert _{1,r}\right) ^{r-2}. \end{aligned}$$
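The monotonicity estimate of Lemma 2.2 for \(r\ge 2\) has a pointwise scalar counterpart that is easy to test numerically: for real numbers a and b, \((|a|^{r-2}a-|b|^{r-2}b)(a-b)\ge 2^{2-r}|a-b|^r\). The sketch below evaluates the gap between the two sides on a grid. The scalar reduction, the constant \(2^{2-r}\) (the classical value for this pointwise inequality, not the constant \(\alpha \) of the lemma) and the grid are assumptions made for illustration.

```python
def phi(a, r):
    """Scalar power-law map a -> |a|^(r-2) a, for r >= 2."""
    return abs(a) ** (r - 2) * a


def monotonicity_gap(a, b, r):
    """Left side minus right side of
    (phi(a) - phi(b)) (a - b) >= 2^(2-r) |a - b|^r, for r >= 2."""
    if r < 2:
        raise ValueError("this pointwise form is stated for r >= 2")
    return (phi(a, r) - phi(b, r)) * (a - b) - 2.0 ** (2 - r) * abs(a - b) ** r


# Hypothetical test grid on [-2, 2].
grid = [0.25 * k for k in range(-8, 9)]
for r in (2.0, 3.0, 4.5):
    worst = min(monotonicity_gap(a, b, r) for a in grid for b in grid)
    print(r, worst)
```

The gap is zero when \(a=-b\), so the smallest value printed is zero up to rounding error.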

Now, using the classical Browder-Minty theory (see J.L. Lions [25]) together with Lemmas 2.1 and 2.2, one can show the following result.

Lemma 2.3

[5, 8] The mixed variational problem (2.5) admits a unique solution \(({\varvec{u}},p)\in {\varvec{X}}\times Q\), which moreover satisfies the following bounds

$$\begin{aligned} \Vert {\varvec{u}}\Vert +\Vert {\varvec{u}}\Vert _{1,r}\le & {} C\Vert {\varvec{f}}\Vert ^{1/(r-1)}_{-1,r'}~~~ \end{aligned}$$
(2.8)
$$\begin{aligned} \Vert p\Vert _{L^{r'}}\le & {} C\Vert {\varvec{f}}\Vert _{-1,r'}. \end{aligned}$$
(2.9)

To present the mixed formulation associated with the penalized form (1.5), one observes that the pressure p and the test function q must be in \(M=L_0^r(\Omega )\cap L_0^{r'}(\Omega )\) because of (1.5)\(_2\). We then introduce the bilinear form \(c:M\times M \rightarrow \mathbb {R}\) given by

$$\begin{aligned} c(p,q)=\int _\Omega pq\text {dx}. \end{aligned}$$

Following what we did earlier, the penalized formulation reads: Find \(({\varvec{u}}^\epsilon ,p^\epsilon )\in {\varvec{X}}\times M\) such that

$$\begin{aligned} \begin{array}{llll} \langle A({\varvec{u}}^\epsilon ),{\varvec{v}}\rangle -b({\varvec{v}},p^\epsilon )&{}=&{} \langle {\varvec{f}},{\varvec{v}}\rangle ~&{}~\text {for all}\quad {\varvec{v}}\in {\varvec{X}}\\ b({\varvec{u}}^\epsilon ,q)+\epsilon c(p^\epsilon ,q)&{}=&{}0~&{}~ \text {for all}\quad q\in M \end{array} \end{aligned}$$
(2.10)

To prove the existence and uniqueness of the solution of (2.10), it suffices to apply once again the classical Browder–Minty theory (see J.L. Lions [25]). This theory, together with Lemmas 2.1 and 2.2, leads to the following result.

Lemma 2.4

The mixed variational problem (2.10) admits a unique solution \(({\varvec{u}}^\epsilon ,p^\epsilon )\in {\varvec{X}}\times M\), which moreover satisfies the following bounds

$$\begin{aligned} \Vert {\varvec{u}}^\epsilon \Vert _{1,r}\le & {} C\Vert {\varvec{f}}\Vert ^{1/(r-1)}_{-1,r'}~~~ \end{aligned}$$
(2.11)
$$\begin{aligned} \Vert \mathrm{div }{\varvec{u}}^\epsilon \Vert ^2\le & {} C\epsilon \Vert {\varvec{f}}\Vert ^{r/(r-1)}_{-1,r'}~~~ \end{aligned}$$
(2.12)
$$\begin{aligned} \Vert p^\epsilon \Vert _{L^{r'}}\le & {} C\Vert {\varvec{f}}\Vert _{-1,r'}. \end{aligned}$$
(2.13)

Moreover, denoting by \(({\varvec{u}},p)\) the weak solution of (2.5) given by Lemma 2.3, one has, for \(1<r<2\),

$$\begin{aligned} \begin{array}{ll} \Vert {\varvec{u}}-{\varvec{u}}^\epsilon \Vert _{1,r}\le C\epsilon ^{\frac{1}{2(3-r)}}\\ \Vert p-p^\epsilon \Vert _{L^{r'}} \le C\epsilon ^{\frac{r-1}{2(3-r)}}, \end{array} \end{aligned}$$
(2.14)

and for \(2<r<\infty \)

$$\begin{aligned} \begin{array}{ll} \Vert {\varvec{u}}-{\varvec{u}}^\epsilon \Vert _{1,r}\le C\epsilon ^{\frac{1}{r}}\\ \Vert p-p^\epsilon \Vert _{L^{r'}} \le C\epsilon ^{\frac{1}{r}}. \end{array} \end{aligned}$$
(2.15)

Proof

The a priori estimates (2.11)–(2.13) are standard, see [5, 8], where similar results are obtained.

From the inf-sup condition on the bilinear form \(b(\cdot ,\cdot )\), one obtains

$$\begin{aligned} \Vert p-p^\epsilon \Vert _{L^{r'}} \le C\sup _{{\varvec{{\small v}}}\in {\varvec{W}}_0^{1,r}(\Omega )}\frac{b({\varvec{v}},p-p^\epsilon )}{\Vert {\varvec{v}}\Vert _{1,r}}. \end{aligned}$$
(2.16)

Subtracting (2.10) from (2.5) yields

$$\begin{aligned} b({\varvec{v}},p-p^\epsilon )=\langle A({\varvec{u}})-A({\varvec{u}}^\epsilon ),{\varvec{v}}\rangle \;\;\;\text {for all}~~{\varvec{v}}, \end{aligned}$$
(2.17)

which, inserted into (2.16) and using continuity, gives

$$\begin{aligned} \Vert p-p^\epsilon \Vert _{L^{r'}} \le C \Vert A({\varvec{u}})-A({\varvec{u}}^\epsilon )\Vert _{-1,r'}. \end{aligned}$$
(2.18)

Then with the help of Lemma 2.2, we have

$$\begin{aligned} \Vert p-p^\epsilon \Vert _{L^{r'}} \le \left\{ \begin{array}{l} C \Vert {\varvec{u}}-{\varvec{u}}^\epsilon \Vert ^{r-1}_{1,r} \;\;\;\;1<r<2\\ C \Vert {\varvec{u}}-{\varvec{u}}^\epsilon \Vert _{1,r} \;\;\;\;2< r<\infty ~. \end{array}\right. ~ \end{aligned}$$
(2.19)

First, for \(1<r<2\), we take \({\varvec{v}}={\varvec{u}}-{\varvec{u}}^\epsilon \) in (2.17); since \(b({\varvec{u}},q)=0\) for all \(q\in Q\), we have

$$\begin{aligned} \langle A({\varvec{u}})-A({\varvec{u}}^\epsilon ),{\varvec{u}}-{\varvec{u}}^\epsilon \rangle =-b({\varvec{u}}^\epsilon ,p-p^\epsilon )~. \end{aligned}$$
(2.20)

Using Lemma 2.2, (2.20), (2.8), (2.11), (2.12) and (2.19), we have

$$\begin{aligned} \Vert {\varvec{u}}-{\varvec{u}}^\epsilon \Vert ^2_{1,r}\le & {} C\langle A({\varvec{u}})-A({\varvec{u}}^\epsilon ),{\varvec{u}}-{\varvec{u}}^\epsilon \rangle \left( \Vert {\varvec{u}}\Vert _{1,r}+\Vert {\varvec{u}}^\epsilon \Vert _{1,r}\right) ^{2-r}\\\le & {} C|b({\varvec{u}}^\epsilon ,p-p^\epsilon )|\left( \Vert {\varvec{u}}\Vert _{1,r}+\Vert {\varvec{u}}^\epsilon \Vert _{1,r}\right) ^{2-r}\\\le & {} C\Vert p-p^\epsilon \Vert _{L^{r'}}\Vert \mathrm{div }{\varvec{u}}^\epsilon \Vert _{L^r}\\\le & {} C\Vert p-p^\epsilon \Vert _{L^{r'}}\Vert \mathrm{div }{\varvec{u}}^\epsilon \Vert \\\le & {} C\Vert {\varvec{u}}-{\varvec{u}}^\epsilon \Vert ^{r-1}_{1,r}\epsilon ^{\frac{1}{2}}, \end{aligned}$$

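which leads to (2.14): dividing by \(\Vert {\varvec{u}}-{\varvec{u}}^\epsilon \Vert ^{r-1}_{1,r}\) gives \(\Vert {\varvec{u}}-{\varvec{u}}^\epsilon \Vert ^{3-r}_{1,r}\le C\epsilon ^{1/2}\), and the pressure bound then follows from (2.19). Next, for \(2< r<\infty \), we take \({\varvec{v}}={\varvec{u}}-{\varvec{u}}^\epsilon \) in (2.17) and, using the fact that \(b({\varvec{u}}-{\varvec{u}}^\epsilon ,p-p^\epsilon )=\epsilon c(p^\epsilon ,p-p^\epsilon )\), we obtain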

$$\begin{aligned} \langle A({\varvec{u}})-A({\varvec{u}}^\epsilon ),{\varvec{u}}-{\varvec{u}}^\epsilon \rangle = \epsilon c(p^\epsilon ,p-p^\epsilon ) \end{aligned}$$

that is

$$\begin{aligned} \langle A({\varvec{u}})-A({\varvec{u}}^\epsilon ),{\varvec{u}}-{\varvec{u}}^\epsilon \rangle + \epsilon c(p^\epsilon ,p^\epsilon )= \epsilon c(p^\epsilon ,p)~. \end{aligned}$$

Since \(\epsilon c(p^\epsilon ,p^\epsilon )\ge 0\), we get from Lemma 2.2

$$\begin{aligned} \Vert {\varvec{u}}-{\varvec{u}}^\epsilon \Vert ^{r}_{1,r}\le C \langle A({\varvec{u}})-A({\varvec{u}}^\epsilon ),{\varvec{u}}-{\varvec{u}}^\epsilon \rangle \le C\epsilon \Vert p^\epsilon \Vert _{L^r}\Vert p\Vert _{L^{r'}} \end{aligned}$$

and obtain (2.15) from (2.9) and (2.13). \(\square \)
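The exponents in (2.14) and (2.15) follow from elementary arithmetic on the powers appearing in the proof, which the following sketch reproduces with exact rational numbers. The function name is hypothetical; the formulas encoded are those of Lemma 2.4, and the boundary case \(r=2\) is excluded as in the lemma.

```python
from fractions import Fraction


def penalty_rates(r):
    """Return the exponents (velocity, pressure) of epsilon in (2.14)-(2.15).

    For 1 < r < 2: ||u - u_eps||^(3-r) <= C eps^(1/2) gives 1 / (2 (3 - r)),
    and (2.19) multiplies this exponent by (r - 1) for the pressure.
    For r > 2: ||u - u_eps||^r <= C eps gives 1 / r for both quantities.
    """
    r = Fraction(r)
    if 1 < r < 2:
        velocity = Fraction(1, 2) / (3 - r)
        pressure = (r - 1) * velocity
    elif r > 2:
        velocity = 1 / r
        pressure = velocity
    else:
        raise ValueError("Lemma 2.4 covers 1 < r < 2 and r > 2 only")
    return velocity, pressure


for r in (Fraction(3, 2), Fraction(7, 4), 3, 4):
    print(r, penalty_rates(r))
```

For \(r\) close to 1 the pressure exponent tends to zero, so the pressure estimate in (2.14) degenerates in that limit.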

3 Finite Element Approximation of the Penalty Method

3.1 Preliminaries and Existence of Solution

In this section, we analyze the finite element discretization of the penalty method (2.10). We assume that \(\mathcal {T}_h\) is a regular partition of \(\Omega \) in the sense introduced in [31]. The diameter of an element \(K \in \mathcal {T}_h\) is denoted by \(h_K\), and the mesh size h is defined by \(h=\max _{K\in \mathcal {T}_h}h_K\).
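The definition of the mesh size can be illustrated on a small triangulation. In the sketch below, each element is given by the coordinates of its vertices, the diameter \(h_K\) of a simplex is computed as the length of its longest edge, and h is the maximum over the elements. The data structure and the two sample triangles are hypothetical and are not tied to any particular mesh generator.

```python
import math
from itertools import combinations


def element_diameter(vertices):
    """Diameter h_K of a simplex: the largest distance between two vertices."""
    return max(math.dist(p, q) for p, q in combinations(vertices, 2))


def mesh_size(elements):
    """Mesh size h: the maximum of h_K over all elements K."""
    return max(element_diameter(K) for K in elements)


# Hypothetical triangulation made of two triangles.
triangles = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5)],
]
print([element_diameter(K) for K in triangles], mesh_size(triangles))
```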

Let \({\varvec{X}}_h\subset {\varvec{X}}\) and \(Q_h\subset Q\) be two conforming finite element spaces that will be made precise later. The mixed weak formulation for the finite element discretization of the penalty method (2.10) reads: Find \(({\varvec{u}}_h^\epsilon ,p_h^\epsilon )\in {\varvec{X}}_h\times Q_h\) such that

$$\begin{aligned} \begin{array}{ll} \langle A({\varvec{u}}_h^\epsilon ),{\varvec{v}}_h\rangle -b({\varvec{v}}_h,p_h^\epsilon )= \langle {\varvec{f}},{\varvec{v}}_h\rangle &{}\;\;\;\;\text {for all}~ {\varvec{v}}_h\in {\varvec{X}}_h\\ b({\varvec{u}}_h^\epsilon ,q_h)+\epsilon c(p_h^\epsilon ,q_h)=0 &{} \;\;\;\;\;\;\text {for all}~q_h\in Q_h. \end{array} \end{aligned}$$
(3.1)

For the existence and uniqueness of the solution of (3.1), we apply the discrete version of the Browder-Minty theory, with the special requirement that the constant in the discrete counterpart of the inf-sup condition on \(b(\cdot ,\cdot )\) be independent of h. The reader can consult [1, 2], where many examples of element pairs satisfying the discrete version of that condition are given. To summarize, we have the following result.

Lemma 3.1

The finite element formulation (3.1) has a unique solution \(({\varvec{u}}_h^\epsilon ,p_h^\epsilon )\), which moreover satisfies

$$\begin{aligned} \Vert {\varvec{u}}_h^\epsilon \Vert _{1,r}\le & {} C\Vert {\varvec{f}}\Vert ^{1/(r-1)}_{-1,r'}~~~ \end{aligned}$$
(3.2)
$$\begin{aligned} \Vert p_h^\epsilon \Vert _{L^{r'}}\le & {} C\Vert {\varvec{f}}\Vert _{-1,r'}, \end{aligned}$$
(3.3)

where \(C>0\) is a generic constant independent of \(\epsilon \) and h.

Proof

Taking \({\varvec{v}}_h={\varvec{u}}_h^\epsilon \) and \(q_h=p_h^\epsilon \) in (3.1) and adding (3.1)\(_1\) and (3.1)\(_2\), we obtain

$$\begin{aligned} \langle A({\varvec{u}}_h^\epsilon ),{\varvec{u}}_h^\epsilon \rangle +\epsilon c(p_h^\epsilon ,p_h^\epsilon )= \langle {\varvec{f}},{\varvec{u}}_h^\epsilon \rangle \end{aligned}$$

which, since \(\epsilon c(p_h^\epsilon ,p_h^\epsilon )\ge 0\), leads by Lemma 2.2 (applied with \({\varvec{v}}=0\)) to

$$\begin{aligned} C\Vert {\varvec{u}}_h^\epsilon \Vert ^r_{1,r}\le \Vert {\varvec{f}}\Vert _{-1,r'}\Vert {\varvec{u}}_h^\epsilon \Vert _{1,r}. \end{aligned}$$

This gives (3.2). Next, from the discrete inf-sup condition on \(b(\cdot ,\cdot )\) and (3.1)\(_1\), one obtains

$$\begin{aligned} \Vert p_h^\epsilon \Vert _{L^{r'}}\le & {} C\sup _{{\varvec{{\small v}}}_h\in {\varvec{X}}_h}\frac{b({\varvec{v}}_h,p_h^\epsilon )}{\Vert {\varvec{v}}_h\Vert _{1,r}} \\= & {} C\sup _{{\varvec{{\small v}}}_h\in {\varvec{X}}_h}\frac{\langle A({\varvec{u}}_h^\epsilon ),{\varvec{v}}_h\rangle -\langle {\varvec{f}},{\varvec{v}}_h\rangle }{\Vert {\varvec{v}}_h\Vert _{1,r}} \\\le & {} C(\Vert A({\varvec{u}}_h^\epsilon )\Vert _{-1,r'}+\Vert {\varvec{f}}\Vert _{-1,r'}), \end{aligned}$$

and we deduce (3.3) by applying Lemma 2.2 and (3.2). \(\square \)

We now state and prove one of the main results of this work, concerning the uniform convergence of the penalty finite element approximations \(({\varvec{u}}_h^\epsilon ,p_h^\epsilon )\) given by (3.1). Recall that the convergence results obtained in [17, 18] are not uniform with respect to \(\epsilon \). The proof is based on a direct application of the Babuska-Brezzi theory for mixed formulations together with Lemmas 2.1 and 2.2.

3.2 Abstract Convergence Result

We first recall the following elementary inequality, which will be used in this subsection:

$$\begin{aligned} |\sum _{i=1}^{m}a_i|^\beta \le C(m,\beta )\sum _{i=1}^{m}|a_i|^\beta \quad \text {for all} \;\;a_i\ge 0~\quad \text {and for all} \quad \beta \ge 0 \end{aligned}$$
(3.4)

where \(C(m,\beta )\) is a positive constant which depends on m and \(\beta \).
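Inequality (3.4) can be checked numerically with an explicit constant. The sketch below uses \(C(m,\beta )=\max (1,m^{\beta -1})\), which is admissible by convexity of \(t\mapsto t^\beta \) when \(\beta \ge 1\) and by subadditivity when \(0\le \beta \le 1\); this particular choice of constant is an assumption of the example and is not stated in the text, and the function names and sample data are hypothetical.

```python
def power_sum_constant(m, beta):
    """An admissible constant in (3.4): max(1, m^(beta - 1))."""
    return max(1.0, float(m) ** (beta - 1.0))


def satisfies_power_sum_bound(a, beta, rel_tol=1e-12):
    """Check (sum a_i)^beta <= C(m, beta) * sum a_i^beta for a_i >= 0."""
    if any(x < 0 for x in a) or beta < 0:
        raise ValueError("(3.4) is stated for a_i >= 0 and beta >= 0")
    lhs = sum(a) ** beta
    rhs = power_sum_constant(len(a), beta) * sum(x ** beta for x in a)
    return lhs <= rhs * (1.0 + rel_tol)


# Hypothetical sample data.
for beta in (0.5, 1.0, 2.0, 3.0):
    print(beta, satisfies_power_sum_bound([1.0, 2.0, 3.0], beta))
```

Equality is attained for \(\beta \ge 1\) when all the \(a_i\) coincide, which is why a small relative tolerance is used in the comparison.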

Theorem 3.1

Suppose that \(\Omega \) is a bounded convex domain in \(\mathbb {R}^2\) and assume that \({\varvec{f}}\in {\varvec{W}}^{-1,r'}(\Omega )\). For \(0<\epsilon \ll 1\), let \(({\varvec{u}}^\epsilon ,p^\epsilon )\) be the unique solution of (2.10) and \(({\varvec{u}}_h^\epsilon ,p_h^\epsilon )\) the unique solution of (3.1). Then there exists a generic positive constant C independent of h and \(\epsilon \) such that, for all \({\varvec{w}}_h \in {\varvec{X}}_h\) and \(q_h \in Q_h\), the following estimates hold.

\(\bullet \) For \(1<r< 2\),

$$\begin{aligned} \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r}\le & {} C\left\{ \begin{array}{llll} &{}\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{r-1}_{1,r}+ \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{r/2}_{1,r} +\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{\frac{1}{(3-r)}}_{1,r}\\ &{}+\Vert p^\epsilon -q_h\Vert ^{\frac{1}{(r-1)}}_{L^{r'}}+\epsilon ^{1/2}\Vert p^\epsilon -q_h\Vert _{L^{r'}}+ \epsilon ^{1/r}\Vert p^\epsilon -q_h\Vert ^{2/r}_{L^{r'}}\\ &{}+\epsilon ^{\frac{1}{(3-r)(r-1)}}\Vert p^\epsilon -q_h\Vert ^{\frac{1}{(3-r)(r-1)}}_{L^{r'}} \end{array}\right\} ,\nonumber \\\nonumber \\ \\ \Vert p^\epsilon -p_h^\epsilon \Vert _{L^{r'}}\le & {} C\left\{ \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert ^{(r-1)}_{1,r}+\Vert p^\epsilon -q_h\Vert _{L^{r'}}\right\} \nonumber ~. \end{aligned}$$
(3.5)

\(\bullet \) For \(2<r<\infty \),

$$\begin{aligned} \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert ^r_{1,r}\le & {} C\left\{ \begin{array}{l} \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^2_{1,r} +\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{r'}_{1,r}\nonumber \\ +\epsilon \Vert p^\epsilon -q_h\Vert ^2_{L^{r'}}+ \epsilon ^{r'}\Vert p^\epsilon -q_h\Vert ^{r'}_{L^{r'}} \end{array}\right\} ,\\\nonumber \\ \\ \Vert p^\epsilon -p_h^\epsilon \Vert _{L^{r'}}\le & {} C\left\{ \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r}+\Vert p^\epsilon -q_h\Vert _{L^{r'}}\right\} .\nonumber \end{aligned}$$
(3.6)

Proof

Taking \({\varvec{v}}={\varvec{v}}_h\) and \(q=t_h\) in (2.10), taking \(q_h=t_h\) in (3.1), and subtracting (3.1) from (2.10), we obtain

$$\begin{aligned} \begin{array}{ll} \langle A({\varvec{u}}^\epsilon )- A({\varvec{u}}_h^\epsilon ),{\varvec{v}}_h\rangle -b({\varvec{v}}_h,p^\epsilon -p_h^\epsilon )= 0&{}\;\;\;\;\forall {\varvec{v}}_h\in {\varvec{X}}_h\\ b({\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon ,t_h)+\epsilon c(p^\epsilon -p_h^\epsilon ,t_h)=0 &{} \;\;\;\;\;\;\forall t_h\in Q_h~. \end{array} \end{aligned}$$
(3.7)

From (3.7)\(_1\), we have

$$\begin{aligned} b({\varvec{v}}_h,p_h^\epsilon -q_h)=\langle A({\varvec{u}}^\epsilon )- A({\varvec{u}}_h^\epsilon ),{\varvec{v}}_h\rangle -b({\varvec{v}}_h,p^\epsilon -q_h), \end{aligned}$$

which together with the discrete version of the inf-sup condition on \(b(\cdot ,\cdot )\), and after using Hölder’s inequality, leads to

$$\begin{aligned} \beta \Vert p_h^\epsilon -q_h\Vert _{L^{r'}}\le & {} \sup _{{\varvec{{\small w}}}_h\in {\varvec{X}}_h}\frac{|\langle A({\varvec{u}}^\epsilon )- A({\varvec{u}}_h^\epsilon ),{\varvec{w}}_h\rangle -b({\varvec{w}}_h,p^\epsilon -q_h)|}{\Vert {\varvec{w}}_h\Vert _{1,r}}\\\le & {} C(\Vert A{\varvec{u}}^\epsilon -A{\varvec{u}}_h^\epsilon \Vert _{-1,r'}+\Vert p^\epsilon -q_h\Vert _{L^{r'}})~. \end{aligned}$$

Next, the triangle inequality \(\Vert p^\epsilon -p_h^\epsilon \Vert _{L^{r'}}\le \Vert p^\epsilon -q_h\Vert _{L^{r'}}+\Vert q_h-p_h^\epsilon \Vert _{L^{r'}}\) yields

$$\begin{aligned} C\Vert p^\epsilon -p_h^\epsilon \Vert _{L^{r'}}\le \Vert A{\varvec{u}}^\epsilon -A{\varvec{u}}_h^\epsilon \Vert _{-1,r'}+\Vert p^\epsilon -q_h\Vert _{L^{r'}} \end{aligned}$$

where C is a generic constant independent of h and \(\epsilon \). Applying Lemma 2.2, we obtain

$$\begin{aligned} C\Vert p^\epsilon -p_h^\epsilon \Vert _{L^{r'}}\le \left\{ \begin{array}{l}\Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert ^{r-1}_{1,r} +\Vert p^\epsilon -q_h\Vert _{L^{r'}} \quad \quad 1<r<2\\ \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r} +\Vert p^\epsilon -q_h\Vert _{L^{r'}} \quad \quad 2<r< \infty ~. \end{array}\right. \end{aligned}$$
(3.8)

From (3.7)\(_1\), by adding and subtracting \(A({\varvec{w}}_h)\), we have

$$\begin{aligned} \langle A({\varvec{u}}^\epsilon )- A({\varvec{w}}_h),{\varvec{v}}_h\rangle +\langle A({\varvec{w}}_h)- A({\varvec{u}}_h^\epsilon ),{\varvec{v}}_h\rangle -b({\varvec{v}}_h,p^\epsilon -p_h^\epsilon )= 0\;\;\;\;\forall {\varvec{v}}_h,{\varvec{w}}_h\in {\varvec{X}}_h~.\nonumber \\ \end{aligned}$$
(3.9)

Taking \({\varvec{v}}_h={\varvec{w}}_h-{\varvec{u}}_h^\epsilon \) in (3.9) and adding and subtracting \(q_h\) gives

$$\begin{aligned} \begin{array}{l} \langle A({\varvec{u}}^\epsilon )- A({\varvec{w}}_h),{\varvec{w}}_h-{\varvec{u}}_h^\epsilon \rangle +\langle A({\varvec{w}}_h)-A({\varvec{u}}_h^\epsilon ),{\varvec{w}}_h-{\varvec{u}}_h^\epsilon \rangle \\ -b({\varvec{w}}_h-{\varvec{u}}_h^\epsilon ,p^\epsilon -q_h)-b({\varvec{w}}_h-{\varvec{u}}_h^\epsilon ,q_h-p_h^\epsilon )= 0. \end{array} \end{aligned}$$
(3.10)

Equation (3.7)\(_2\) can be written as

$$\begin{aligned} b({\varvec{u}}^\epsilon -{\varvec{w}}_h,t_h)+b({\varvec{w}}_h-{\varvec{u}}_h^\epsilon ,t_h)+\epsilon c(p^\epsilon -q_h,t_h)+\epsilon c(q_h-p_h^\epsilon ,t_h)=0~, \end{aligned}$$

which by taking \(t_h=q_h-p_h^\epsilon \) implies that

$$\begin{aligned}&b({\varvec{u}}^\epsilon -{\varvec{w}}_h,q_h-p_h^\epsilon )+b({\varvec{w}}_h-{\varvec{u}}_h^\epsilon ,q_h-p_h^\epsilon ) +\epsilon c(p^\epsilon -q_h,q_h-p_h^\epsilon )\nonumber \\&\quad +\,\epsilon c(q_h-p_h^\epsilon ,q_h-p_h^\epsilon )=0 \end{aligned}$$
(3.11)

Now adding (3.10) and (3.11), and observing that \(\epsilon c(q_h-p_h^\epsilon ,q_h-p_h^\epsilon )\ge 0\) together with the fact that the nonlinear operator \(A(\cdot )\) is monotone, we have

$$\begin{aligned}&\langle A({\varvec{w}}_h)-A({\varvec{u}}_h^\epsilon ),{\varvec{w}}_h-{\varvec{u}}_h^\epsilon \rangle \nonumber \\&\quad \le -\langle A({\varvec{u}}^\epsilon )- A({\varvec{w}}_h),{\varvec{w}}_h-{\varvec{u}}_h^\epsilon \rangle +b({\varvec{w}}_h-{\varvec{u}}_h^\epsilon ,p^\epsilon -q_h)\nonumber \\&\qquad -b({\varvec{u}}^\epsilon -{\varvec{w}}_h,q_h-p_h^\epsilon )-\epsilon c(p^\epsilon -q_h,q_h-p_h^\epsilon )\nonumber \\&\quad \le \Vert A{\varvec{u}}^\epsilon -A{\varvec{w}}_h\Vert _{-1,r'}\Vert {\varvec{w}}_h-{\varvec{u}}_h^\epsilon \Vert _{1,r}+C\Vert p^\epsilon -q_h\Vert _{L^{r'}} \Vert {\varvec{w}}_h-{\varvec{u}}_h^\epsilon \Vert _{1,r}\nonumber \\&\qquad +C\Vert p_h^\epsilon -q_h\Vert _{L^{r'}} \Vert {\varvec{w}}_h-{\varvec{u}}^\epsilon \Vert _{1,r}+\epsilon \Vert p^\epsilon -q_h\Vert _{L^{r'}}\Vert p_h^\epsilon -q_h\Vert _{L^{r'}}~. \end{aligned}$$
(3.12)

At this step, we distinguish two possibilities.

\(\bullet \) \(1<r< 2\)

We apply the first part of Lemma 2.2 both to the left-hand side of (3.12) and to the first term of its right-hand side, and get

$$\begin{aligned}&\Vert {\varvec{w}}_h-{\varvec{u}}_h^\epsilon \Vert ^2_{1,r}\le C\left( \Vert {\varvec{w}}_h\Vert _{1,r}+\Vert {\varvec{u}}_h^\epsilon \Vert _{1,r}\right) ^{2-r}\nonumber \\&\left\{ \begin{array}{l} \Vert {\varvec{w}}_h-{\varvec{u}}^\epsilon \Vert ^{r-1}_{1,r} \Vert {\varvec{w}}_h-{\varvec{u}}_h^\epsilon \Vert _{1,r} +\Vert p^\epsilon -q_h\Vert _{L^{r'}}\Vert {\varvec{w}}_h-{\varvec{u}}_h^\epsilon \Vert _{1,r} \\ +\Vert p_h^\epsilon -q_h\Vert _{r'} \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert _{1,r}+\epsilon \Vert p^\epsilon -q_h\Vert _{L^{r'}}\Vert p_h^\epsilon -q_h\Vert _{L^{r'}}~. \end{array}\right\} \end{aligned}$$
(3.13)

\(\bullet \) \(2< r<\infty \)

We apply the second part of Lemma 2.2 both to the left-hand side of (3.12) and to the first term of its right-hand side. We find

$$\begin{aligned}&\Vert {\varvec{w}}_h-{\varvec{u}}_h^\epsilon \Vert ^r_{1,r}\nonumber \\&\quad \le \left\{ \begin{array}{l} C(\Vert {\varvec{w}}_h\Vert _{1,r}+\Vert {\varvec{u}}_h^\epsilon \Vert _{1,r})^{r-2} \Vert {\varvec{w}}_h-{\varvec{u}}^\epsilon \Vert _{1,r} \Vert {\varvec{w}}_h-{\varvec{u}}_h^\epsilon \Vert _{1,r} +\Vert p^\epsilon -q_h\Vert _{L^{r'}}\Vert {\varvec{w}}_h-{\varvec{u}}_h^\epsilon \Vert _{1,r} \\ +\Vert p_h^\epsilon -q_h\Vert _{L^{r'}} \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert _{1,r}+\epsilon \Vert p^\epsilon -q_h\Vert _{L^{r'}}\Vert p_h^\epsilon -q_h\Vert _{L^{r'}}~. \end{array}\right\} \nonumber \\ \end{aligned}$$
(3.14)

Next, using the triangle inequalities

$$\begin{aligned} \begin{array}{l} \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r}\le \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert _{1,r}+ \Vert {\varvec{w}}_h-{\varvec{u}}_h^\epsilon \Vert _{1,r}\\ \Vert {\varvec{w}}_h\Vert _{1,r} \le \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert _{1,r}+\Vert {\varvec{u}}^\epsilon \Vert _{1,r}\\ \Vert p^\epsilon -p_h^\epsilon \Vert _{L^{r'}}\le \Vert p^\epsilon -q_h\Vert _{L^{r'}}+\Vert q_h-p_h^\epsilon \Vert _{L^{r'}}, \end{array} \end{aligned}$$
(3.15)

the relation (3.4), the bounded relations (2.11), (3.2) and (3.8)\(_1\) in (3.13), we obtain for \(1<r< 2\)

$$\begin{aligned}&C\Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert ^2_{1,r}\nonumber \\&\quad \le 2\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^2_{1,r}+\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^r_{1,r} +2\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{r-1}_{1,r}\Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r}\nonumber \\&\qquad +\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert _{1,r}\Vert p^\epsilon -q_h\Vert _{L^{r'}}+\Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r} \Vert p^\epsilon -q_h\Vert _{L^{r'}}+\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert _{1,r}\Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert ^{r-1}_{1,r}\nonumber \\&\qquad +\epsilon \Vert p^\epsilon -q_h\Vert ^2_{L^{r'}}+\epsilon \Vert p^\epsilon -q_h\Vert _{L^{r'}}\Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert ^{r-1}_{1,r}+ \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert _{1,r}\Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r}\nonumber \\&\qquad +\Vert p^\epsilon -q_h\Vert _{L^{r'}}\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{3-r}_{1,r}+\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{2-r}_{1,r}\Vert p^\epsilon -q_h\Vert _{L^{r'}} \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r}\\&\qquad +\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{3-r}_{1,r}\Vert p^\epsilon -q_h\Vert _{L^{r'}}+\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{3-r}_{1,r} \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert ^{r-1}_{1,r}\nonumber \\&\qquad +2\epsilon \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{2-r}_{1,r}\Vert p^\epsilon -q_h\Vert ^2_{L^{r'}}+\epsilon \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{2-r}_{1,r} \Vert p^\epsilon -q_h\Vert _{L^{r'}}\Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert ^{r-1}_{1,r},\nonumber \end{aligned}$$
(3.16)

which with Young’s inequality, leads to the desired bound for \(1<r< 2\).

Likewise, when \(2< r<\infty \), using the triangle inequalities (3.15), the relation (3.4), the inequalities (2.11), (3.2) and (3.8)\(_2\) in (3.14), we obtain

$$\begin{aligned} C\Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert ^r_{1,r}\le & {} \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^r_{1,r}+\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^2_{1,r}+ \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^{r-1}_{1,r}\Vert {\varvec{u}}^\epsilon -{\varvec{u}}^\epsilon _h\Vert _{1,r}\nonumber \\&+\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert _{1,r}\Vert {\varvec{u}}^\epsilon -{\varvec{u}}^\epsilon _h\Vert _{1,r}+ \Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert _{1,r}\Vert p^\epsilon -q_h\Vert _{L^{r'}}\\&+\Vert {\varvec{u}}^\epsilon -{\varvec{u}}^\epsilon _h\Vert _{1,r} \Vert p^\epsilon -q_h\Vert _{L^{r'}} +\epsilon \Vert p^\epsilon -q_h\Vert ^2_{L^{r'}}\nonumber \\&+\,\epsilon \Vert p^\epsilon -q_h\Vert _{L^{r'}}\Vert {\varvec{u}}^\epsilon -{\varvec{u}}^\epsilon _h\Vert _{1,r},\nonumber \end{aligned}$$
(3.17)

which again with the help of Young’s inequality yields the desired bound for \(2< r<\infty \). \(\square \)
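The absorption step based on Young’s inequality can be sanity-checked numerically. The Python sketch below is only an illustration and assumes the weighted form \(ab\le \delta a^p/p+\delta ^{-q/p}b^q/q\) with \(1/p+1/q=1\); the function name `young_rhs` and the sample values are hypothetical and do not come from the analysis above.

```python
import itertools

def young_rhs(a, b, p, delta):
    """Right-hand side of the weighted Young inequality
    a*b <= delta*a**p/p + delta**(-q/p)*b**q/q, with 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    return delta * a**p / p + delta ** (-q / p) * b**q / q

# Check the inequality on a small grid of nonnegative sample values.
samples = [0.0, 0.1, 1.0, 3.5]
violations = [
    (a, b, p, d)
    for a, b, p, d in itertools.product(samples, samples, [1.5, 2.0, 3.0], [0.1, 1.0, 10.0])
    if a * b > young_rhs(a, b, p, d) + 1e-12
]
print(len(violations))
```

With \(p=2\), a mixed term such as \(\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert _{1,r}\Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r}\) is split into a small multiple of \(\Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert ^2_{1,r}\), which can be moved to the left-hand side, plus a multiple of \(\Vert {\varvec{u}}^\epsilon -{\varvec{w}}_h\Vert ^2_{1,r}\).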

3.3 Rate of Convergence

Let us consider in this section some finite element approximations defined in J.W. Barrett and W.B. Liu [8]. We state the following assumptions for integers \(n,m\ge 1\) and any \(\alpha \in [1,\infty )\).

(H1): Approximation property of \({\varvec{X}}_h\)

There is a continuous linear operator \(\pi _h:{\varvec{W}}_0^{1,\alpha }(\Omega )\rightarrow {\varvec{X}}_h\) such that for \(k=0,\cdots ,n\),

$$\begin{aligned} \Vert {\varvec{w}}-\pi _h{\varvec{w}}\Vert _{1,\alpha } \le Ch^k\Vert {\varvec{w}}\Vert _{k+1,\alpha } \quad \quad \forall {\varvec{w}}\in {\varvec{W}}^{k+1,\alpha }(\Omega )\cap {\varvec{W}}_0^{1,\alpha }(\Omega )~. \end{aligned}$$
(3.18)

(H2): Approximation property of \(Q_h\)

There is a continuous linear operator \(\rho _h:L^{\alpha }(\Omega )\rightarrow Q_h\) such that for all \(k=0,\cdots ,m\),

$$\begin{aligned} \Vert q-\rho _hq\Vert _{L^\alpha } \le Ch^k\Vert q\Vert _{k,\alpha } \quad \quad \forall q\in W^{k,\alpha }(\Omega )~. \end{aligned}$$
(3.19)

(H3): The spaces \(Q_h\) and \({\varvec{X}}_h\) satisfy the inf-sup condition; there is a constant C independent of h such that

$$\begin{aligned} C\Vert q_h\Vert _{L^{r'}}\le \sup _{{\varvec{{\small v}}}_h\in {\varvec{X}}_h}\displaystyle \frac{b({\varvec{v}}_h,q_h)}{\Vert {\varvec{v}}_h\Vert _{1,r}}\quad \text {for all}~~q_h \in Q_h. \end{aligned}$$
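For illustration, the approximation property (H1) can be observed in a one-dimensional analogue. The following Python sketch is not part of the analysis; it assumes nodal P1 interpolation of the sample function \(\sin (\pi x)\) on a uniform mesh of (0, 1), uses NumPy, and the name `p1_interp_seminorm_error` is hypothetical. Halving h should roughly halve the \(W^{1,\alpha }\) seminorm error, in agreement with (3.18) for \(k=1\).

```python
import numpy as np

def p1_interp_seminorm_error(n_cells, alpha=3.0, samples=20):
    """W^{1,alpha} seminorm error of the P1 interpolant of sin(pi x) on (0, 1)."""
    nodes = np.linspace(0.0, 1.0, n_cells + 1)
    h = 1.0 / n_cells
    # Derivative of the piecewise linear interpolant, one value per cell.
    slopes = np.diff(np.sin(np.pi * nodes)) / h
    # Midpoint-rule sample points inside every cell, shape (n_cells, samples).
    t = (np.arange(samples) + 0.5) / samples
    x = nodes[:-1, None] + h * t[None, :]
    diff = np.pi * np.cos(np.pi * x) - slopes[:, None]
    integral = np.sum(np.abs(diff) ** alpha) * h / samples
    return integral ** (1.0 / alpha)

errors = [p1_interp_seminorm_error(n) for n in (8, 16, 32)]
# Observed convergence rate between consecutive mesh refinements.
rates = [np.log2(errors[i] / errors[i + 1]) for i in range(2)]
print(rates)
```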

Theorem 3.2

Assume that the unique solution \(({\varvec{u}}^\epsilon ,p^\epsilon )\) of (2.10) belongs to \({\varvec{W}}^{k+1,r}(\Omega )\times W^{k,r'}(\Omega )\). Let \(({\varvec{u}}_h^\epsilon ,p_h^\epsilon )\) be the unique solution of (3.1). Assume that (H1), (H2) and (H3) hold. Then for \(k=0,\cdots , m\) we have:

For \(1<r< 2\)

$$\begin{aligned} \begin{array}{l} \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r}\le C(\Vert {\varvec{u}}^\epsilon \Vert _{k+1,r},\Vert p^\epsilon \Vert _{k,r'}) \left\{ h^{k(r-1)}+\epsilon ^{\frac{2}{3-r}}\right\} \\ \Vert p^\epsilon -p_h^\epsilon \Vert _{r'}\le C(\Vert {\varvec{u}}^\epsilon \Vert _{k+1,r},\Vert p^\epsilon \Vert _{k,r'}) \left\{ h^{k(r-1)^2}+\epsilon ^{\frac{2(r-1)}{3-r}}\right\} . \end{array} \end{aligned}$$
(3.20)

For \(2<r< \infty \)

$$\begin{aligned} \begin{array}{l} \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r}\le C(\Vert {\varvec{u}}^\epsilon \Vert _{k+1,r},\Vert p^\epsilon \Vert _{k,r'}) \left\{ h^{\frac{k}{r-1}}+\epsilon ^{\frac{2}{r-1}}\right\} \\ \Vert p^\epsilon -p_h^\epsilon \Vert _{r'}\le C(\Vert {\varvec{u}}^\epsilon \Vert _{k+1,r},\Vert p^\epsilon \Vert _{k,r'}) \left\{ h^{\frac{k}{r-1}}+\epsilon ^{\frac{2}{r-1}}\right\} , \end{array} \end{aligned}$$
(3.21)

where \(C(\Vert {\varvec{u}}^\epsilon \Vert _{k+1,r},\Vert p^\epsilon \Vert _{k,r'})\) is a generic constant depending on \(\Vert {\varvec{u}}^\epsilon \Vert _{k+1,r}\), \(\Vert p^\epsilon \Vert _{k,r'}\) and independent of h and \(\epsilon \).
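The exponents in (3.20) and (3.21) are easy to tabulate for a given pair (r, k). The short Python helper below simply transcribes them; the function name `predicted_exponents` and the returned dictionary layout are illustrative choices, not notation from the text.

```python
def predicted_exponents(r, k):
    """Exponents (power of h, power of epsilon) from (3.20)-(3.21)
    for the velocity and pressure errors."""
    if 1.0 < r < 2.0:
        velocity = (k * (r - 1.0), 2.0 / (3.0 - r))
        pressure = (k * (r - 1.0) ** 2, 2.0 * (r - 1.0) / (3.0 - r))
    elif r > 2.0:
        velocity = (k / (r - 1.0), 2.0 / (r - 1.0))
        pressure = velocity
    else:
        raise ValueError("Theorem 3.2 is stated for 1 < r < 2 or r > 2")
    return {"velocity": velocity, "pressure": pressure}

print(predicted_exponents(3.0, 2))
print(predicted_exponents(1.5, 2))
```

For instance, with \(r=3\) and \(k=2\) both errors are predicted to behave like \(h+\epsilon \).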

Proof

Taking \({\varvec{w}}_h=\pi _h{\varvec{u}}^\epsilon \) and \(q_h=\rho _hp^\epsilon \) in (3.5), (3.6) and applying (3.18) and (3.19), we obtain:

For \(1<r< 2\)

$$\begin{aligned} \begin{array}{lll} \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r}&{}\le &{}C(\Vert {\varvec{u}}^\epsilon \Vert _{k+1,r},\Vert p^\epsilon \Vert _{k,r'}) \left\{ h^k+h^{kr/2}+h^{k(r-1)}+h^{k/(3-r)}\right. \\ &{}&{}+\left. h^{k/(r-1)}+\epsilon ^{1/2}h^k+\epsilon ^{1/r}h^{2k/r}+\epsilon ^{\frac{1}{(3-r)(r-1)}}h^{\frac{k}{(3-r)(r-1)}}\right\} \\ &{}\le &{}C(\Vert {\varvec{u}}^\epsilon \Vert _{k+1,r},\Vert p^\epsilon \Vert _{k,r'}) \left\{ h^{k(r-1)}+\epsilon ^{\frac{2}{3-r}}\right\} ~, \end{array} \end{aligned}$$

where Young’s inequality has been used.

For \(2< r<\infty \)

$$\begin{aligned} \begin{array}{lll} \Vert {\varvec{u}}^\epsilon -{\varvec{u}}_h^\epsilon \Vert _{1,r} &{}\le &{}C(\Vert {\varvec{u}}^\epsilon \Vert _{k+1,r},\Vert p^\epsilon \Vert _{k,r'}) \left\{ h^{k}+h^{2k/r}+h^{k/(r-1)}+\epsilon ^{1/r}h^{2k/r}+\epsilon ^{\frac{1}{(r-1)}}h^{\frac{k}{(r-1)}}\right\} \\ &{}\le &{}C(\Vert {\varvec{u}}^\epsilon \Vert _{k+1,r},\Vert p^\epsilon \Vert _{k,r'})\left\{ h^{k/(r-1)}+\epsilon ^{\frac{2}{r-1}}\right\} , \end{array} \end{aligned}$$

where the last expression is obtained by the same techniques as above. \(\square \)
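For \(h<1\), the term with the smallest exponent dominates the pure powers of h listed in the two chains of inequalities above. The following Python check, given only as a sketch with the hypothetical name `dominant_h_exponent`, confirms on a few sample values that this smallest exponent is \(k(r-1)\) for \(1<r<2\) and \(k/(r-1)\) for \(r>2\). The mixed terms in \(\epsilon \) and h, which are handled by Young’s inequality, are not included.

```python
def dominant_h_exponent(r, k):
    """Smallest exponent among the pure powers of h appearing in the proof
    of Theorem 3.2; for h < 1 this is the dominant term."""
    if 1.0 < r < 2.0:
        exponents = [k, k * r / 2.0, k * (r - 1.0), k / (3.0 - r), k / (r - 1.0)]
    elif r > 2.0:
        exponents = [k, 2.0 * k / r, k / (r - 1.0)]
    else:
        raise ValueError("expected 1 < r < 2 or r > 2")
    return min(exponents)

for r in (1.1, 1.5, 1.9, 2.5, 3.0, 10.0):
    print(r, dominant_h_exponent(r, 2))
```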

Remark 3.1

It should be noticed that our results are in line with those obtained previously in [6, 7] using the direct approach. Of course, from the triangle inequality one deduces that

$$\begin{aligned} \Vert {\varvec{u}}-{\varvec{u}}_h\Vert _{1,r}\le & {} \Vert {\varvec{u}}-{\varvec{u}}^\epsilon \Vert _{1,r}+\Vert {\varvec{u}}^\epsilon -{\varvec{u}}^{\epsilon }_{h}\Vert _{1,r}+ \Vert {\varvec{u}}^{\epsilon }_{h}-{\varvec{u}}_h\Vert _{1,r},\\ \Vert p-p_h\Vert _{r'}\le & {} \Vert p-p^\epsilon \Vert _{r'}+\Vert p^\epsilon -p^{\epsilon }_{h}\Vert _{r'}+ \Vert p^{\epsilon }_{h}-p_h\Vert _{r'}. \end{aligned}$$

4 Numerical Experiments

In this section, we formulate and implement two strategies for the resolution of (3.1). Since (3.1) is a nonlinear problem, iterative methods such as Newton’s method or a fixed point algorithm may be used to solve it, but we prefer the strategy pioneered by Roland Glowinski [27]. It consists of the following steps:

  1. Step 1.

    Associate with the weak formulation (3.1) an initial value problem in \({\varvec{X}}_h\times Q_h\).

  2. Step 2.

    Use operator splitting to time discretize the above initial value problem.

Section 4.1 presents those numerical strategies and their mathematical merits, while Sect. 4.2 is concerned with the numerical experiments.

4.1 Numerical Procedures

Applying the methodology described, we obtain Step 1 by associating with (3.1) the following initial value problem: given \({\varvec{u}}_0\in {\varvec{X}}\), and assuming that \({\varvec{f}}\in {\varvec{X}}^*\) is independent of time, we consider the following problem:

Find \(({\varvec{u}}_h^\epsilon (t),p_h^\epsilon (t))\in {\varvec{X}}_h\times Q_h\) such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \langle \partial _t{\varvec{u}}_h^{\epsilon }(t),{\varvec{v}}_h\rangle +\langle A{\varvec{u}}_h^\epsilon (t),{\varvec{v}}_h\rangle - b({\varvec{v}}_h,p_h^\epsilon (t))= \langle {\varvec{f}},{\varvec{v}}_h\rangle \;\;\forall {\varvec{v}}_h\in {\varvec{X}}_h\\ b({\varvec{u}}_h^\epsilon (t),q_h)+\epsilon \,c(p_h^\epsilon (t),q_h)=0,\;\;\forall q_h\in Q_h \\ {\varvec{u}}_h^\epsilon (0)= {\varvec{u}}_0\;, \end{array}\right. } \end{aligned}$$
(4.1)

and we are interested in the limiting behavior of \(({\varvec{u}}_h^\epsilon (t),p_h^\epsilon (t))\) as \(t\rightarrow \infty \). Of course, the choice of \({\varvec{u}}_0\) is equally important and will be discussed below.

(4.1) is a system of ordinary differential equations, hence the existence of a solution is standard; the interested reader may consult J.L. Lions [25]. Assume that (4.1) admits a unique solution \(({\varvec{u}}^\epsilon _h(t),p^\epsilon _h(t))\in {\varvec{X}}_h\times Q_h\). Then from the properties of \(A(\cdot )\) and \(b(\cdot ,\cdot )\), it can be shown that

$$\begin{aligned} \Vert {\varvec{u}}^\epsilon _h(t)\Vert _{1,r}\le C(\Vert {\varvec{u}}_0\Vert _{1,r},\Vert {\varvec{f}}\Vert _{-1,r'}). \end{aligned}$$
(4.2)

The next result is crucial; it is the reason why only the long-term behavior needs to be considered in the implementation of (4.1). We claim the following.

Theorem 4.1

The solution \({\varvec{u}}_h^\epsilon (t)\) of (4.1) converges exponentially to the solution \({\varvec{u}}_h^\epsilon \) of (3.1) as t goes to infinity. More precisely, we have:

$$\begin{aligned} \Vert {\varvec{u}}_h^\epsilon (t)-{\varvec{u}}_h^\epsilon \Vert \le \Vert {\varvec{u}}_0-{\varvec{u}}_h^\epsilon \Vert \exp \displaystyle \left( -C\nu t\right) , \quad \text {for all} \;\; t\ge 0 \end{aligned}$$
(4.3)

where C is a generic positive constant independent of \(\epsilon \) and h.

Proof

We first set \({\varvec{v}}_h={\varvec{u}}_h^\epsilon (t)-{\varvec{u}}_h^\epsilon \) in (4.1)\(_1\) and \({\varvec{v}}_h={\varvec{u}}_h^\epsilon -{\varvec{u}}_h^\epsilon (t)\) in (3.1)\(_1\). Putting the results together, we obtain:

$$\begin{aligned} \langle \partial _t{\varvec{u}}_h^{\epsilon }(t),{\varvec{u}}_h^\epsilon (t)-{\varvec{u}}_h^\epsilon \rangle +\langle A{\varvec{u}}_h^\epsilon (t)-A{\varvec{u}}_h^\epsilon ,{\varvec{u}}_h^\epsilon (t)-{\varvec{u}}_h^\epsilon \rangle -b({\varvec{u}}_h^\epsilon (t)-{\varvec{u}}_h^\epsilon ,p_h^\epsilon (t)-p_h^\epsilon )=0\,. \end{aligned}$$

Subtracting (4.1)\(_2\) from (3.1)\(_2\) and taking \(q_h=p_h^\epsilon (t)-p_h^\epsilon \) one obtains:

$$\begin{aligned} b({\varvec{u}}_h^\epsilon (t)-{\varvec{u}}_h^\epsilon ,p_h^\epsilon (t)-p_h^\epsilon )+ \epsilon c(p_h^\epsilon (t)-p_h^\epsilon ,p_h^\epsilon (t)-p_h^\epsilon )=0 \end{aligned}$$

Adding the two equations above, we have

$$\begin{aligned} \langle \partial _t{\varvec{u}}_h^{\epsilon }(t),{\varvec{u}}_h^\epsilon (t)-{\varvec{u}}_h^\epsilon \rangle +\langle A{\varvec{u}}_h^\epsilon (t)-A{\varvec{u}}_h^\epsilon ,{\varvec{u}}_h^\epsilon (t)-{\varvec{u}}_h^\epsilon \rangle +\epsilon c(p_h^\epsilon (t)-p_h^\epsilon ,p_h^\epsilon (t)-p_h^\epsilon )=0\,. \end{aligned}$$

Let us set \({\varvec{w}}_h(t)={\varvec{u}}_h^\epsilon (t)-{\varvec{u}}_h^\epsilon \). Using the fact that \(\epsilon c(p_h^\epsilon (t)-p_h^\epsilon ,p_h^\epsilon (t)-p_h^\epsilon )\ge 0\) and Lemma 2.2, we find:

For \(1<r< 2\)

$$\begin{aligned} \frac{d}{dt}\Vert {\varvec{w}}_h(t)\Vert ^2+ 2C\nu (\Vert {\varvec{u}}_h^\epsilon (t)\Vert _{1,r}+\Vert {\varvec{u}}_h^\epsilon \Vert _{1,r})^{r-2}\Vert {\varvec{w}}_h(t)\Vert _{1,r}^2\le 0, \end{aligned}$$

and for \(2<r< \infty \)

$$\begin{aligned} \frac{d}{dt}\Vert {\varvec{w}}_h(t)\Vert ^2+ 2C\nu \Vert {\varvec{w}}_h(t)\Vert _{1,r}^r\le 0. \end{aligned}$$

Integrating each differential inequality, keeping in mind (4.2) and Poincaré’s inequality, one obtains the desired result. \(\square \)
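The relaxation to the steady state described in Theorem 4.1 can be visualized on a scalar toy model. The sketch below is not the finite element problem (4.1): it assumes the scalar equation \(y'+\nu |y|^{r-2}y=f\) (no pressure, no spatial discretization), integrates it with the explicit Euler method, and uses the hypothetical name `relax_to_steady_state`. The default values \(r=3\) and \(\nu =0.4\) are chosen for illustration only.

```python
def relax_to_steady_state(r=3.0, nu=0.4, f=1.0, y0=0.0, dt=0.01, t_end=10.0):
    """Explicit Euler for the scalar model y' + nu*|y|**(r-2)*y = f.
    Returns the steady state y_star and the error history |y(t_n) - y_star|.
    Intended for r >= 2 (for r < 2 the coefficient is singular at y = 0)."""
    sign = 1.0 if f >= 0 else -1.0
    y_star = sign * (abs(f) / nu) ** (1.0 / (r - 1.0))
    y, errors = y0, [abs(y0 - y_star)]
    for _ in range(int(round(t_end / dt))):
        y += dt * (f - nu * abs(y) ** (r - 2.0) * y)
        errors.append(abs(y - y_star))
    return y_star, errors

y_star, errors = relax_to_steady_state()
print(y_star, errors[0], errors[-1])
```

In this toy setting the error decreases monotonically in time, which mirrors the behavior stated in (4.3).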

Finally, concerning Step 2, we propose the following semi-implicit temporal schemes, but we acknowledge that other temporal approximations are possible.

Let \(N\in \mathbb {N}^*\) and set \(k=T/N\). Given \(({\varvec{u}}_h^{\epsilon 0},p_h^{\epsilon 0})\) which is a suitable approximation of \(({\varvec{u}}_0,p_0)\), and knowing \(({\varvec{u}}_h^{\epsilon m-1},p_h^{\epsilon m-1})\), we compute \(({\varvec{u}}_h^{\epsilon m},p_h^{\epsilon m})\) in \({\varvec{X}}_h\times Q_h\) solution of:

$$\begin{aligned} \mathbf{scheme 1}~~ {\left\{ \begin{array}{ll} \displaystyle \frac{1}{k}({\varvec{u}}_h^{\epsilon m}-{\varvec{u}}_h^{\epsilon m-1},{\varvec{v}}_h) +\nu (|{\varvec{D}}({\varvec{u}}_h^{\epsilon m-1})|^{r-2}{\varvec{D}}({\varvec{u}}_h^{\epsilon m}),{\varvec{D}}({\varvec{v}}_h))\\ -b({\varvec{v}}_h,p_h^{\epsilon m})= ({\varvec{f}},{\varvec{v}}_h)\;\;\forall {\varvec{v}}_h\in {\varvec{X}}_h\\ b({\varvec{u}}_h^{\epsilon m},q_h)+\epsilon c(p_h^{\epsilon m},q_h)=0,\;\;\forall q_h\in Q_h, \end{array}\right. } \end{aligned}$$
(4.4)

or

$$\begin{aligned} \mathbf{scheme 2}~~ {\left\{ \begin{array}{ll} \displaystyle \frac{1}{k}({\varvec{u}}_h^{\epsilon m}-{\varvec{u}}_h^{\epsilon m-1},{\varvec{v}}_h) +\nu (|{\varvec{D}}({\varvec{u}}_h^{\epsilon m-1})|^{r-2}{\varvec{D}}({\varvec{u}}_h^{\epsilon m}),{\varvec{D}}({\varvec{v}}_h))+\\ \mu (\delta )({\varvec{D}}({\varvec{u}}_h^{\epsilon m}),{\varvec{D}}({\varvec{v}}_h))- b({\varvec{v}}_h,p_h^{\epsilon m})= ({\varvec{f}},{\varvec{v}}_h)\;\;\forall {\varvec{v}}_h\in {\varvec{X}}_h\\ b({\varvec{u}}_h^{\epsilon m},q_h)+\epsilon c(p_h^{\epsilon m},q_h)=0,\;\;\forall q_h\in Q_h. \end{array}\right. } \end{aligned}$$
(4.5)

where \(\mu (\delta )\) should be regarded as artificial viscosity such that

$$\begin{aligned} 0<\mu (\delta )\ll 1 \quad \quad \text {and}\quad \quad \lim _{\delta \rightarrow 0}\mu (\delta )=0. \end{aligned}$$
(4.6)
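To make the structure of schemes (4.4) and (4.5) concrete, the following Python sketch applies the same idea to a one-dimensional scalar analogue on (0, 1): the power-law coefficient is frozen at the previous iterate, so each time step reduces to one linear solve. This is an illustration under strong simplifying assumptions (no pressure, no penalization, P1 elements with a lumped mass matrix, NumPy dense solve); the function name `semi_implicit_step` and the driver variables are hypothetical, and the parameter values are only sample choices.

```python
import numpy as np

def semi_implicit_step(u_prev, f, h, dt, r, nu, mu):
    """One step of a 1D scalar analogue of scheme (4.5). The coefficient
    nu*|u'|**(r-2) is evaluated at the previous iterate, so the step is linear.
    Setting mu = 0 gives the analogue of scheme (4.4). Intended for r >= 2."""
    full = np.concatenate(([0.0], u_prev, [0.0]))   # homogeneous Dirichlet data
    grad = np.diff(full) / h                        # one gradient value per cell
    w = nu * np.abs(grad) ** (r - 2.0) + mu         # lagged plus artificial viscosity
    n = u_prev.size
    K = np.zeros((n, n))
    for i in range(n):
        K[i, i] = (w[i] + w[i + 1]) / h
        if i + 1 < n:
            K[i, i + 1] = K[i + 1, i] = -w[i + 1] / h
    M = (h / dt) * np.eye(n)                        # lumped mass matrix divided by dt
    return np.linalg.solve(M + K, M @ u_prev + h * f)

n_cells, r, nu, mu, dt = 32, 3.0, 0.4, 1.0 / 500.0, 1.0 / 100.0
h = 1.0 / n_cells
x = np.linspace(0.0, 1.0, n_cells + 1)[1:-1]        # interior nodes
f = np.ones_like(x)
u = x * (1.0 - x) / (2.0 * nu)                      # solution of the linear case r = 2
updates = []
for _ in range(1000):                               # 1000 steps of size dt, i.e. T = 10
    u_new = semi_implicit_step(u, f, h, dt, r, nu, mu)
    updates.append(np.max(np.abs(u_new - u)))
    u = u_new
print(u.max(), updates[0], updates[-1])
```

The initial guess is the solution of the linear problem (\(r=2\)), and the size of the successive updates indicates how close the iterates are to a steady state.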

The introduction of the coercive term \(\mu (\delta )({\varvec{D}}({\varvec{u}}_h^{\epsilon m}),{\varvec{D}}({\varvec{v}}_h))\) in (4.5) has the effect of bringing stability/smoothness to the system. This way of adding artificial viscosity is known in the literature as the vanishing viscosity method, and has been used for establishing existence of solutions for partial differential equations by J.L. Lions in [25]. As far as the authors are aware, it is the first time that such a technique is used in this context. It should be observed that the solution technique via (4.4), though easy to implement, is less obvious as far as its convergence analysis is concerned. The reason is the lack of coercivity, which motivates the introduction of the scheme (4.5). At this point one can anticipate that, because of the “extra diffusion” added in scheme (4.5), the numerical simulations via (4.5) should produce smoother and better results than the solutions obtained via (4.4).

The natural question arising from the scheme (4.5) is the following one: can one recover the solution \(({\varvec{u}}_h^{\epsilon },p_h^{\epsilon })\) of (4.1) from the solution \(({\varvec{u}}_h^{\epsilon m},p_h^{\epsilon m})\) of (4.5) as m tends to infinity? We claim that the scheme (4.5) indeed provides a reliable solution as \(m\rightarrow \infty \), and our goal now is to prove that assertion. For that purpose, we follow R. Temam [23]. One first observes that (4.5) is linear and, after rearranging terms, is a square linear system in finite dimension; hence uniqueness implies existence of a solution. Uniqueness of \({\varvec{u}}_h^{\epsilon m}\) follows from the estimates obtained in Lemma 4.1, while uniqueness of \(p_h^{\epsilon m}\) is a consequence of the discrete version of the inf-sup condition on \(b(\cdot ,\cdot )\). The convergence result we want to prove can be obtained in two steps. First, one obtains some energy estimates; next, we use compactness results and pass to the limit.

We first recall the following [23]:

$$\begin{aligned} \Vert \nabla {\varvec{u}}_h\Vert _{L^r}\le S(h)\Vert {\varvec{u}}_h\Vert _{L^r}\quad \text {for all}\quad {\varvec{u}}_h\in \mathcal{P}_k(T), \end{aligned}$$
(4.7)

where \(S(h)\rightarrow \infty \) as \(h\rightarrow 0\). We will also need the following version of Sobolev’s inequality: for \(p=2\) and q with \(1\le q\le \infty \), there is C such that

$$\begin{aligned} \Vert {\varvec{v}}\Vert _{L^q}\le C\Vert {\varvec{v}}\Vert _1\quad \text {for all}~~{\varvec{v}}\in {\varvec{H}}^1(\Omega ). \end{aligned}$$
(4.8)
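The growth of \(S(h)\) in the inverse inequality (4.7) can be seen on the simplest example. The sketch below assumes the \(L^2\) case (\(r=2\)) in one dimension and a single P1 hat function of half-width h, for which both norms are known in closed form; the function name `hat_function_norm_ratio` is hypothetical. The ratio equals \(\sqrt{3}/h\), so it blows up as \(h\rightarrow 0\).

```python
import math

def hat_function_norm_ratio(h):
    """Ratio ||phi'||_{L^2} / ||phi||_{L^2} for the P1 hat function
    phi(x) = 1 - |x|/h on (-h, h).
    Closed forms: ||phi||^2 = 2h/3 and ||phi'||^2 = 2/h."""
    return math.sqrt(2.0 / h) / math.sqrt(2.0 * h / 3.0)

for h in (0.1, 0.05, 0.025):
    print(h, hat_function_norm_ratio(h))
```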

Step 1: Energy estimates   Note that \({\varvec{u}}_h^{\epsilon 0}\) is chosen to satisfy

$$\begin{aligned} \Vert {\varvec{u}}_h^{\epsilon 0}\Vert \le C \Vert {\varvec{u}}_0\Vert \end{aligned}$$

In the sequel, C denotes a positive constant independent of k and h.

Lemma 4.1

There exists C such that for every \(k>0\) and \(h>0\)

$$\begin{aligned} \Vert {\varvec{u}}_h^{\epsilon m}\Vert ^2\le & {} \displaystyle \frac{\Vert {\varvec{u}}_0\Vert ^2}{(1+Ck\mu (\delta ))^m}+\displaystyle \frac{CS^2(h)}{\mu ^2(\delta )}\Vert {\varvec{f}}\Vert ^{2}_{-1,r'} \left[ 1-(1+Ck\mu (\delta ))^{-m}\right] ~ \forall m\ge 0. \nonumber \\ \end{aligned}$$
(4.9)

If moreover \(\displaystyle \frac{S(h)}{\mu (\delta )}=O(1)\), then

$$\begin{aligned} \Vert {\varvec{u}}_h^{\epsilon m}\Vert ^2\le & {} C\Vert {\varvec{u}}_0\Vert ^2+C\Vert {\varvec{f}}\Vert ^{2}_{-1,r'}~~\forall m\ge 0. \end{aligned}$$
(4.10)

For \(p=1,2,\ldots ,n\), one has

$$\begin{aligned}&\sum _{m=p}^{n}\Vert {\varvec{u}}_h^{\epsilon m}-{\varvec{u}}_h^{\epsilon m-1}\Vert ^2+k \mu (\delta )\sum _{m=p}^{n}\Vert {\varvec{D}}({\varvec{u}}_h^{\epsilon m})\Vert ^2 +2\nu k \sum _{m=p}^{n}\int _\Omega |{\varvec{D}}({\varvec{u}}_h^{\epsilon m-1})|^{r-2}|{\varvec{D}}({\varvec{u}}_h^{\epsilon m})|^2dx\nonumber \\&\le C\Vert {\varvec{u}}_0\Vert ^2+C\Vert {\varvec{f}}\Vert ^{2}_{-1,r'}+CS(h)(n-p+1)k\Vert {\varvec{f}}\Vert ^{2}_{-1,r'}. \end{aligned}$$
(4.11)

Proof

Take \({\varvec{v}}_h={\varvec{u}}_h^{\epsilon m}\) and \(q_h=p_h^{\epsilon m}\) in (4.5). Using the fact that \(-b({\varvec{u}}_h^{\epsilon m},p_h^{\epsilon m})=\epsilon c(p_h^{\epsilon m},p_h^{\epsilon m})\ge 0\) together with the identity \(2(a-b,a)=|a|^2-|b|^2+|a-b|^2\), one obtains

$$\begin{aligned}&\Vert {\varvec{u}}_h^{\epsilon m}\Vert ^2-\Vert {\varvec{u}}_h^{\epsilon m-1}\Vert ^2+\Vert {\varvec{u}}_h^{\epsilon m}-{\varvec{u}}_h^{\epsilon m-1}\Vert ^2+2k \mu (\delta )\Vert {\varvec{D}}({\varvec{u}}_h^{\epsilon m})\Vert ^2\nonumber \\&+2\nu k\int _\Omega |{\varvec{D}}({\varvec{u}}_h^{\epsilon m-1})|^{r-2}|{\varvec{D}}({\varvec{u}}_h^{\epsilon m})|^2dx \le 2k\langle {\varvec{f}},{\varvec{u}}_h^{\epsilon m}\rangle \end{aligned}$$
(4.12)

The right-hand side of (4.12) is treated using (4.7), (4.8) and (2.4) as follows

$$\begin{aligned} 2k({\varvec{f}},{\varvec{u}}_h^{\epsilon m})\le & {} 2Ck\Vert {\varvec{f}}\Vert _{-1,r'}\Vert \nabla {\varvec{u}}_h^{\epsilon m}\Vert _{r}\\\le & {} 2CS(h)k\Vert {\varvec{f}}\Vert _{-1,r'}\Vert {\varvec{u}}_h^{\epsilon m}\Vert _{L^r}\\\le & {} 2CS(h)k\Vert {\varvec{f}}\Vert _{-1,r'}\Vert \nabla {\varvec{u}}_h^{\epsilon m}\Vert \\\le & {} 2CS(h)k\Vert {\varvec{f}}\Vert _{-1,r'}\Vert {\varvec{D}}({\varvec{u}}_h^{\epsilon m})\Vert , \end{aligned}$$

which combined with (4.12) and Young’s inequality leads to

$$\begin{aligned}&\Vert {\varvec{u}}_h^{\epsilon m}\Vert ^2-\Vert {\varvec{u}}_h^{\epsilon m-1}\Vert ^2+\Vert {\varvec{u}}_h^{\epsilon m}-{\varvec{u}}_h^{\epsilon m-1}\Vert ^2+k \mu (\delta )\Vert {\varvec{D}}({\varvec{u}}_h^{\epsilon m})\Vert ^2\nonumber \\&\quad +2\nu k\int _\Omega |{\varvec{D}}({\varvec{u}}_h^{\epsilon m-1})|^{r-2}|{\varvec{D}}({\varvec{u}}_h^{\epsilon m})|^2dx \le \displaystyle \frac{CS^2(h)k}{\mu (\delta )}\Vert {\varvec{f}}\Vert ^{2}_{-1,r'}. \end{aligned}$$
(4.13)

Applying (2.4) and (2.3) to (4.13) and dropping some positive terms, we find

$$\begin{aligned} \Vert {\varvec{u}}_h^{\epsilon m}\Vert ^2 \le \displaystyle \frac{1}{1+Ck\mu (\delta )}\Vert {\varvec{u}}_h^{\epsilon m-1}\Vert ^2+\displaystyle \frac{CS^2(h)k}{\mu (\delta )(1+Ck\mu (\delta ))}\Vert {\varvec{f}}\Vert ^{2}_{-1,r'}. \end{aligned}$$
(4.14)

Using (4.14) recursively, we find

$$\begin{aligned} \Vert {\varvec{u}}_h^{\epsilon m}\Vert ^2\le & {} \displaystyle \frac{1}{(1+Ck\mu (\delta ))^m}\Vert {\varvec{u}}_h^{\epsilon 0}\Vert ^2+\displaystyle \frac{CS^2(h)k}{\mu (\delta )}\Vert {\varvec{f}}\Vert ^{2}_{-1,r'}\sum _{i=1}^{m}\displaystyle \frac{1}{(1+Ck\mu (\delta ))^i}\nonumber \\\le & {} \displaystyle \frac{\Vert {\varvec{u}}_{h}^{\epsilon 0}\Vert ^2}{(1+Ck\mu (\delta ))^m}+\displaystyle \frac{CS^2(h)}{\mu ^2(\delta )}\Vert {\varvec{f}}\Vert ^{2}_{-1,r'} \left[ 1-(1+Ck\mu (\delta ))^{-m}\right] , \end{aligned}$$
(4.15)

which proves (4.9). (4.10) is a direct consequence of (4.9).
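The passage from the one-step bound (4.14) to the closed form (4.15) is a geometric sum, which can be checked numerically. In the Python sketch below, the symbols `c` and `g` are shorthand introduced here for \(Ck\mu (\delta )\) and \(CS^2(h)k\Vert {\varvec{f}}\Vert ^{2}_{-1,r'}/\mu (\delta )\), respectively; the function names and the sample values are hypothetical.

```python
def iterate_recursion(a0, c, g, m):
    """Apply a_j = (a_{j-1} + g) / (1 + c) m times, i.e. the equality
    case of the one-step bound (4.14)."""
    a = a0
    for _ in range(m):
        a = (a + g) / (1.0 + c)
    return a

def closed_form(a0, c, g, m):
    """Geometric-sum formula matching the right-hand side of (4.15)."""
    decay = (1.0 + c) ** (-m)
    return a0 * decay + (g / c) * (1.0 - decay)

print(iterate_recursion(2.0, 0.05, 0.3, 40), closed_form(2.0, 0.05, 0.3, 40))
```

Note that \(g/c\) corresponds to \(S^2(h)\Vert {\varvec{f}}\Vert ^{2}_{-1,r'}/\mu ^2(\delta )\) up to a constant, which is the factor appearing in (4.9).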

Now adding up (4.13) for \(m=p,\ldots ,n\), and dropping some positive terms yields

$$\begin{aligned}&\sum _{m=p}^{n}\Vert {\varvec{u}}_h^{\epsilon m}-{\varvec{u}}_h^{\epsilon m-1}\Vert ^2+k \mu (\delta )\sum _{m=p}^{n}\Vert {\varvec{D}}({\varvec{u}}_h^{\epsilon m})\Vert ^2 +2\nu k \sum _{m=p}^{n}\int _\Omega |{\varvec{D}}({\varvec{u}}_h^{\epsilon m-1})|^{r-2}|{\varvec{D}}({\varvec{u}}_h^{\epsilon m})|^2dx\nonumber \\&\quad \le \Vert {\varvec{u}}_h^{\epsilon p-1}\Vert ^2+\displaystyle \frac{CS^2(h)(n-p+1)k}{\mu (\delta )}\Vert {\varvec{f}}\Vert ^{2}_{-1,r'} \end{aligned}$$

which implies (4.11). \(\square \)

Step 2: weak convergence/passage to the limit   Let \({\varvec{u}}_{hk}^{\epsilon }\in \mathcal {C}^0([0,T],{\varvec{X}})\) be affine in each subinterval \([t_{m-1}, t_m]\) with \({\varvec{u}}_{hk}^{\epsilon }(t_m)={\varvec{u}}_{h}^{\epsilon m}\) for \(1\le m\le N\). Let \({\varvec{u}}_{hk}^{\epsilon r}\), \({\varvec{u}}_{hk}^{\epsilon l}\) be the piecewise constant functions such that

$$\begin{aligned} {\varvec{u}}_{hk}^{\epsilon r}|_{[t_{m-1},t_m[}={\varvec{u}}_{h}^{\epsilon m} \quad \quad {\varvec{u}}_{hk}^{\epsilon l}|_{]t_{m-1},t_m]}={\varvec{u}}_{h}^{\epsilon m-1}\,. \end{aligned}$$
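The three time interpolants can be sketched in a few lines of Python. The example below treats scalar nodal values for simplicity and uses the hypothetical name `time_interpolants`; at the nodes \(t_m\) themselves it uses the interval \([t_{m-1},t_m)\) for all three functions, which differs from the half-open convention of the left constant on a set of measure zero only.

```python
import bisect

def time_interpolants(times, values, t):
    """Evaluate at time t the affine interpolant and the piecewise constant
    interpolants taking the right value u^m and the left value u^{m-1}.
    times = [t_0, ..., t_N] increasing, values = [u^0, ..., u^N] (scalars here)."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t outside [t_0, t_N]")
    # Index m with t_{m-1} <= t < t_m (the last interval is closed on the right).
    m = min(bisect.bisect_right(times, t), len(times) - 1)
    theta = (t - times[m - 1]) / (times[m] - times[m - 1])
    affine = (1.0 - theta) * values[m - 1] + theta * values[m]
    return affine, values[m], values[m - 1]

print(time_interpolants([0.0, 0.5, 1.0], [1.0, 3.0, 2.0], 0.25))
```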

Next, from the estimations obtained in Lemma 4.1, one gets

$$\begin{aligned} \begin{array}{lll} {\varvec{u}}_{hk}^{\epsilon },\;\;{\varvec{u}}_{hk}^{\epsilon r},\;\;{\varvec{u}}_{hk}^{\epsilon l}\;\;\text {remain in a bounded set of}\;\; L^\infty (t_{m-1},t_m;{\varvec{L}}^2(\Omega ))\\ {\varvec{u}}_{hk}^{\epsilon r}\;\;\text {remains in a bounded set of}\;\; L^2(t_{m-1},t_m;{\varvec{W}}_0^{1,r}(\Omega )). \end{array} \end{aligned}$$
(4.16)

Furthermore, from (4.11), we have

$$\begin{aligned} \begin{array}{l} \Vert {\varvec{u}}_{hk}^{\epsilon r}-{\varvec{u}}_{hk}^{\epsilon l} \Vert _{L^2(t_{m-1},t_m;{\varvec{L}}^2(\Omega ))}\le C k^{1/2}\\ \Vert {\varvec{u}}_{hk}^{\epsilon }-{\varvec{u}}_{hk}^{\epsilon r} \Vert _{L^2(t_{m-1},t_m;{\varvec{L}}^2(\Omega ))}\le C k^{1/2} \end{array} \end{aligned}$$
(4.17)

Then we can extract a subsequence \(k'\subset k\) still denoted k such that

$$\begin{aligned}\begin{array}{l} {\varvec{u}}_{hk}^{\epsilon r} \rightarrow {\varvec{u}}_{h}^{\epsilon r} \;\;\text {weakly* in}\;\;L^\infty (t_{m-1},t_m;{\varvec{L}}^2(\Omega ))\\ {\varvec{u}}_{hk}^{\epsilon l} \rightarrow {\varvec{u}}_{h}^{\epsilon l} \;\;\text {weakly* in }\;\;L^\infty (t_{m-1},t_m;{\varvec{L}}^2(\Omega ))\\ {\varvec{u}}_{hk}^{\epsilon } \rightarrow {\varvec{u}}_{h}^{\epsilon } \;\;\text {weakly* in }\;\;L^\infty (t_{m-1},t_m;{\varvec{L}}^2(\Omega ))\\ {\varvec{u}}_{hk}^{\epsilon r} \rightarrow {\varvec{u}}_{h}^{\epsilon r} \;\;\text {weakly in }\;\;L^2(t_{m-1},t_m;{\varvec{W}}_0^{1,r}(\Omega )) \end{array} \end{aligned}$$

and from (4.17), one obtains \({\varvec{u}}_{h}^{\epsilon r}={\varvec{u}}_{h}^{\epsilon l}={\varvec{u}}_{h}^{\epsilon }\).

Note that \({\varvec{u}}_{hk}^{\epsilon r}\), \({\varvec{u}}_{hk}^{\epsilon l}\) and \({\varvec{u}}_{hk}^{\epsilon }\) verify:

$$\begin{aligned} \begin{array}{lll} (\partial _t{\varvec{u}}_{hk}^{\epsilon },{\varvec{v}}_h) +\nu (|{\varvec{D}}({\varvec{u}}_{hk}^{\epsilon l})|^{r-2}{\varvec{D}}({\varvec{u}}_{hk}^{\epsilon r}),{\varvec{D}}({\varvec{v}}_h))+ \mu (\delta )({\varvec{D}}({\varvec{u}}_{hk}^{\epsilon r}),{\varvec{D}}({\varvec{v}}_h))\\ +\displaystyle \frac{1}{\epsilon } (\rho _h(\mathrm{div }{\varvec{u}}_{hk}^{\epsilon r}),\rho _h(\mathrm{div }{\varvec{v}}_h))= ({\varvec{f}},{\varvec{v}}_h)\;\;\forall {\varvec{v}}_h\in {\varvec{X}}_h \end{array} \end{aligned}$$
(4.18)

where \(\rho _h\) is the orthogonal projection of \(L^2(\Omega )\) onto \(Q_h\). The weak convergence above allows us to pass to the limit in all bilinear terms in (4.18). As far as the nonlinear term is concerned, we refer the reader to [25], where a similar expression has been analyzed.

Note that establishing convergence of the pressure is more delicate because it involves convergence of the time derivative of the velocity whose proof is fairly long and intricate; cf. Lions [25] and Temam [23].

4.2 Numerical Simulations

In this subsection we check through numerical simulations the robustness of the algorithms presented in Sect. 4.1. We also elucidate certain properties of the non-Newtonian flow studied.

In all examples discussed in this paragraph we assess the performance of the time stepping algorithms formulated in (4.4) and (4.5). The result in Theorem 4.1 shows that the time dependent problem behaves like the stationary problem in the long term. This property is checked below with two numerical simulations. The divergence-free constraint is imposed via penalization with penalty parameter \(\epsilon \). It should be pointed out that the analysis of Sect. 3 clearly indicates, in the discrete (and continuous) setting, that the penalized and normal solutions are identical in some sense as \(\epsilon \) goes to zero. In all the examples presented, the velocity and pressure are approximated by the \(P2-P1\) element on a uniform mesh. The systems of equations obtained via (4.4) and (4.5) are solved by extending the Matlab code proposed in [32], which is a direct solver.

We note that (4.4) and (4.5) are time stepping procedures, hence one needs an initial condition to generate a sequence of solutions. The choice of initial condition in this case is less complicated because the variational problem (4.1) has a unique solution for all values of r in \((1,\infty )\). So, based on Theorem 4.1, convergence is guaranteed for any initial datum \({\varvec{u}}_0\). Hence, in order to consolidate the convergence to the solution of (3.1), we suggest the solution of the Stokes equations

$$\begin{aligned} {\left\{ \begin{array}{ll} \nu ({\varvec{D}}({\varvec{u}}_h),{\varvec{D}}({\varvec{v}}_h))-b({\varvec{v}}_h,p_h)= ({\varvec{f}},{\varvec{v}}_h)\;\;\forall {\varvec{v}}_h\in {\varvec{X}}_h\\ b({\varvec{u}}_h,q_h)+\epsilon c(p_h,q_h)=0,\;\;\forall q_h\in Q_h, \end{array}\right. } \end{aligned}$$
(4.19)

as initial condition for (4.4) and (4.5).

Table 1 Error on the velocity field using (4.5) and \(\varepsilon =10^{-3}\)
Table 2 Error on the velocity field using (4.4) and \(\varepsilon =10^{-3}\)
Table 3 Velocity error with \(W^{1,3}\) and (4.5)
Table 4 Rate of convergence for the velocity with \(W^{1,3}\) and (4.5)
Table 5 \(\varepsilon \)-rate of convergence for the velocity with \(W^{1,3}\) and (4.5)

First, the convergence of the algorithms (4.4) and (4.5) is considered with a view to computing the rate of convergence and consolidating the theoretical results obtained in Sect. 3.3. Next, the dependence of the convergence on \(\varepsilon \) is tested. Thirdly, the performance and qualitative features of the algorithm (4.5) are tested via two classical planar benchmark problems: the driven cavity flow and the flow past a circular cylinder. For these problems, the velocity profile inside the domain of interest is very important. Hence the full set of equations must be solved. These problems are popular in the computational analysis of non-Newtonian flows for several reasons: the abrupt contraction is a common feature of many polymer applications, they exhibit the numerical difficulties that one typically encounters, they offer the possibility of observing singularities at the re-entrant corner, etc. In all simulations below, we have taken \(T=10\).

Fig. 1 Driven cavity description
Fig. 2 Non-Newtonian Stokes flow with \(r=3\), \(\nu =0.1\)
Fig. 3 Non-Newtonian Stokes flow with \(r=3/2\), \(\nu =0.1\)
Fig. 4 Non-Newtonian Stokes flow with \(r=3/2\), \(\nu =0.01\)
Fig. 5 Non-Newtonian Stokes flow with \(r=3\), \(\nu =0.01\)

4.2.1 Convergence Check

In this first set of numerical simulations, we test the convergence result obtained in Sect. 3 by (a) computing the rate of convergence in the \({\varvec{L}}^r\) and \({\varvec{W}}^{1,r}\)-norms, and (b) showing that the error is uniform with respect to \(\epsilon \).

For this purpose, we take \(\nu =0.4\), a unit exterior force and \(\Omega =(0,1)^2\). The velocity is assumed to vanish on the boundary. Since we do not have the exact solution of the model problem, we compute the error as the \({\varvec{L}}^r\)-norm and \({\varvec{W}}^{1,r}\)-norm of the difference between the solutions obtained for \(h=\frac{1}{N/2}\) and \(h'=\frac{1}{N'/2}\), where \(N+1\) is the number of grid points of [0, 1] and \(N'+1=2(N+1)-1\). For these simulations we take \(r=3\), \(k=1/100\), \(\mu (\delta )=1/500\). The results are reported in Tables 1, 2, 3 and 4. The rate of convergence is computed using the formula

$$\begin{aligned} \alpha =\frac{\log (e_2/e_1)}{\log (h_2/h_1)}. \end{aligned}$$
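The procedure just described (errors measured as differences between solutions on nested grids, followed by the rate formula) can be sketched in one space dimension. The code below is a minimal illustration and not the code used for Tables 1 to 5: the function `surrogate_solution` is a hypothetical stand-in that manufactures grid functions with a known \(O(h^2)\) perturbation, so that the expected rate is known in advance, and the discrete \(L^r\) norm is a simple nodal sum.

```python
import math


def lr_norm_of_difference(coarse, fine, h, r):
    """Discrete L^r norm of (fine - coarse), sampled at the coarse nodes.

    The fine grid is assumed to have 2 * len(coarse) - 1 points, so that
    every second fine node coincides with a coarse node.
    """
    assert len(fine) == 2 * len(coarse) - 1
    diffs = [abs(c - f) for c, f in zip(coarse, fine[::2])]
    return (h * sum(d ** r for d in diffs)) ** (1.0 / r)


def observed_rate(e1, e2, h1, h2):
    """Rate alpha = log(e2 / e1) / log(h2 / h1)."""
    return math.log(e2 / e1) / math.log(h2 / h1)


def surrogate_solution(n_intervals):
    """Hypothetical 'computed solution' on [0, 1] with an O(h^2) perturbation."""
    h = 1.0 / n_intervals
    values = [math.sin(math.pi * i * h) * (1.0 + h ** 2)
              for i in range(n_intervals + 1)]
    return values, h


r = 3
levels = [8, 16, 32, 64]
solutions = [surrogate_solution(n) for n in levels]

errors, steps = [], []
for (coarse, h), (fine, _) in zip(solutions, solutions[1:]):
    errors.append(lr_norm_of_difference(coarse, fine, h, r))
    steps.append(h)

rates = [observed_rate(errors[i], errors[i + 1], steps[i], steps[i + 1])
         for i in range(len(errors) - 1)]

for h, e in zip(steps, errors):
    print(f"h = {h:.5f}   error = {e:.3e}")
print("observed rates:", [f"{a:.3f}" for a in rates])
```

Because the surrogate data carry a perturbation of order \(h^2\) by construction, the observed rates are close to 2; with actual finite element solutions the same two functions would be applied to the computed nodal values.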

In Tables 1 and 2, the convergence of the algorithms (4.5) and (4.4), respectively, is tested. It is apparent that the rate of convergence in Table 1 is superior for both the \(L^3\) and \(W^{1,3}\) norms. These results also demonstrate that the convergence is sub-optimal. Furthermore, in terms of CPU time, for a mesh of 25 elements per side, algorithm (4.5) converges faster than algorithm (4.4). We believe the good behavior of (4.5) is due to the coercive term \(\mu (\delta )({\varvec{D}}({\varvec{u}}_h^{\epsilon ,m}),{\varvec{D}}({\varvec{v}}_h))\) that was introduced. Note also that the stiffness matrix obtained with the scheme (4.4) has a complicated structure compared to the one obtained with the scheme (4.5). Having obtained better results with algorithm (4.5), we turn to the sensitivity of the error with respect to \(\epsilon \) for that algorithm. In Table 3, the error in the \(W^{1,3}\)-norm is computed for various values of \(\epsilon \). It is manifest that the error is uniform with respect to \(\epsilon \). In Table 4, the rate of convergence is calculated for several values of \(\epsilon \). One observes that, for these values of \(\epsilon \), the rate is almost the same. In Table 5, we fix the discretization parameter h and compute the rate of convergence with respect to \(\epsilon \). It should be observed that Tables 4 and 5 are obtained from Table 3. Hence, from Tables 3, 4 and 5, one concludes that the numerical results are in agreement with the results obtained in Sects. 3.2 and 3.3.

4.2.2 Driven Cavity Flow

This classical problem has become a standard benchmark for assessing the performance of algorithms for many flow problems, and it has been studied in the context of the Navier–Stokes equations in [33–35]. The configuration is depicted in Fig. 1. It corresponds to a flow in a box \(\Omega =(0,1)^2\) whose boundary is \(\partial \Omega =\Gamma \cup S\), with

$$\begin{aligned} \Gamma= & {} \{(0,y)/0<y<1\}\cup \{(x,0)/0<x<1\}\cup \{(1,y)/0<y<1\}\\ S= & {} \{(x,1)/0<x<1\}. \end{aligned}$$

We are interested in plotting the stream function and the velocity and pressure distributions for various values of r and \(\nu \), when \(k=10^{-2}\), \(\mu (\delta )=10^{-4}\), \(\epsilon =10^{-4}\) and \(h=1/(N+1)\), with \(N=256\) being the number of elements in the triangulation. For \(\nu =0.1\), the initial configuration is obtained by solving (4.19). Figure 2 represents the flow when \(r=3\), while Fig. 3 shows the situation when \(r=3/2\). No singularity develops, as expected from the physics of the problem. Similar patterns are observed when \(\nu =0.01\), see Figs. 4 and 5.

4.2.3 Flow Past a Circular Cylinder

This is a popular test problem that has been simulated by many researchers [35, 36]. The geometry is given in Fig. 6, and we are interested in the motion of the fluid around the cylinder. By nature this is a time-dependent problem, and consequently the non-dimensionalized version of algorithm (4.5) should be able to predict the evolution of the flow. A cylinder is immersed in an incompressible flow; the center of the circle is located at (0.25, 0.2) and its diameter is 0.1. The inflow and outflow conditions (on the left and right boundaries) are

$$\begin{aligned} {\left\{ \begin{array}{ll} u_1=\frac{0.3}{0.41^2}\,4y(0.41-y),\,\, u_2=0\quad \text{ on }\quad \Gamma _{in}=\{0\}\times (0, 0.41),\\ u_1=\frac{0.3}{0.41^2}\,4y(0.41-y),\,\, u_2=0\quad \text{ on }\quad \Gamma _{out}=\{2.2\}\times (0, 0.41)\,. \end{array}\right. } \end{aligned}$$
(4.20)

It is assumed that homogeneous boundary conditions are prescribed on the remaining part of the boundary of \(\Omega \). For this test problem, the viscosity is \(\nu =10^{-3}\), and \(k=10^{-2}\), \(\mu (\delta )=10^{-2}\), \(\epsilon =10^{-4}\). The initial configuration is obtained by solving (4.19). Figures 7 and 8 show the solution of the non-Newtonian Stokes problem with \(r=1.5\) and \(r=3\), respectively. The effect of the nonlinearity is marginal and the features predicted by the physics are captured.
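The parabolic profile prescribed in (4.20) is easy to evaluate and check. The short sketch below encodes it as a function; the names `inflow_u1`, `mean_inflow`, `H` and `U_MAX` are introduced here for illustration only. The profile vanishes on the channel walls \(y=0\) and \(y=0.41\), and attains its peak value 0.3 at mid-height.

```python
H = 0.41      # channel height, from Gamma_in = {0} x (0, 0.41)
U_MAX = 0.3   # peak velocity, from the factor 0.3 in (4.20)


def inflow_u1(y, u_max=U_MAX, height=H):
    """Horizontal velocity u_1 on the inflow/outflow boundary, as in (4.20)."""
    return u_max / height ** 2 * 4.0 * y * (height - y)


def mean_inflow(u_max=U_MAX, height=H):
    """Mean of u_1 over (0, height); Simpson's rule is exact for a quadratic."""
    mid = 0.5 * height
    return (inflow_u1(0.0, u_max, height)
            + 4.0 * inflow_u1(mid, u_max, height)
            + inflow_u1(height, u_max, height)) / 6.0


for y in (0.0, 0.1025, 0.205, 0.3075, 0.41):
    print(f"y = {y:.4f}   u1 = {inflow_u1(y):.4f}")
print(f"mean inflow velocity = {mean_inflow():.4f}")
```

The mean of a parabola over its support is two thirds of its peak, so the mean inflow velocity for (4.20) is 0.2.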

Fig. 6 Geometry and boundary conditions
Fig. 7 Non-Newtonian Stokes flow around a cylinder with \(r=3/2\)
Fig. 8 Non-Newtonian Stokes flow around a cylinder with \(r=3\)

5 Conclusions

In this article, we have first discussed the convergence of the finite element solution, uniform with respect to the penalty parameter \(\epsilon \). Next, we have formulated and implemented two solution strategies. Our results on uniform convergence generalize and complement those obtained in [17, 18]. Here we take systematic advantage of sufficient conditions for the existence of solutions. As far as the implementation of the finite element method goes, we have adapted the well-known methodology that associates with a stationary problem an initial value problem, the focus being on the behavior of the solution of the latter problem for large times. In order to improve the rate of convergence, we have added a stabilizing term to the initial value problem (numerical computations confirm the predictions). This approach leads naturally to solution methods based on time discretization; it also has the advantage of being easy to implement, but much progress remains to be made towards a systematic way of choosing the initial flow.