1 Introduction

Darcy’s law

$$\begin{aligned} {\varvec{u}} = - \frac{{\varvec{K}}}{\mu }\nabla p, \end{aligned}$$

with the permeability tensor \( {\varvec{K}} \) and the viscosity coefficient \(\mu \), describes the linear relationship between the velocity \({\varvec{u}}\) of the creep flow and the gradient of the pressure p, which is valid when the Darcy velocity \({\varvec{u}}\) is extremely small [5]. Forchheimer in [14] carried out flow experiments and pointed out that when the velocity is relatively high, Darcy’s law should be replaced by the so-called Darcy–Forchheimer (DF) equation by adding a quadratic nonlinear term to the velocity, shown as follows:

$$\begin{aligned} \frac{\mu }{\rho }{{\varvec{K}}^{ - 1}}{\varvec{u}} + \frac{\beta }{\rho }\left| {\varvec{u}} \right| {\varvec{u}} + \nabla p = \varvec{0}, \end{aligned}$$
(1.1)

where \( \rho \) and \( \beta \) represent the density of the fluid and its dynamic viscosity, respectively. The parameter \( \beta \) is also referred to as the Forchheimer number, which controls the strength of nonlinearity. A theoretical derivation of the Darcy–Forchheimer equation (1.1) can be found in [26]. Equation (1.1) coupled with the conservation law

$$\begin{aligned} \text {div}\, {\varvec{u}} = g \end{aligned}$$
(1.2)

are usually called Darcy–Forchheimer model.

In recent years, many numerical methods of the Darcy–Forchheimer model have been developed. Girault and Wheeler in [15] proved the existence and uniqueness of the solution of the Darcy–Forchheimer model (1.1)–(1.2) by proving the nonlinear operator \({\mathcal {A}}\left( {\varvec{v}} \right) = \frac{\mu }{\rho }{{\varvec{K}}^{ - 1}}{\varvec{v}} + \frac{\beta }{\rho }\left| {\varvec{v}} \right| {\varvec{v}}\) is monotone, coercive and hemi-continuous, and establishing an appropriate inf-sup condition. Then they considered mixed finite element methods by approximating the velocity and the pressure by piecewise constant and nonconforming Crouzeix–Raviart (CR) elements, respectively. They proved a discrete inf-sup condition and the convergence of the mixed finite element scheme. They also proposed a Peaceman–Rachford (PR) type iterative method to solve the discretized nonlinear system and proved convergence of this iterative solver. In the PR iteration, the nonlinear equation can be decoupled with the divergence constraint and solved in a closed form; see Sect. 4 for details. López et al. in [17] carried out numerical tests of the methods proposed in [15], and made a comparative study between Newton’s method and the PR iterative method. They pointed out that Newton’s method is not competitive with the PR iteration. In each iteration, Newton’s method needs to evaluate a Jacobian and solves a linear saddle point system, but the PR iteration computes an intermediate solution for a decoupled nonlinear equation and then solves a simplified linear saddle point system. The cost of solving the decoupled nonlinear equation can be negligible in comparison with the Jacobian evaluation. Furthermore the PR iteration required fewer iterations to converge than Newton’s method with the same initial guess; see [17] for details.

Park in [21] developed a mixed finite element method with a semi-discrete scheme for the time dependent Darcy–Forchheimer model. Pan and Rui in [20] gave a mixed element method for the Darcy–Forchheimer model based on the Raviart–Thomas (RT) element or the Brezzi–Douglas–Marini (BDM) element approximation of the velocity and piecewise constant (P0) approximation of the pressure. Rui and Pan in [24] proposed a block-centered finite difference method for the Darcy–Forchheimer model, which was thought of as the lowest-order RT-P0 mixed element with proper quadrature formula. Rui et al. in [25] presented a block-centered finite difference method for the Darcy–Forchheimer model with variable Forchheimer number \(\beta ({\varvec{x}})\). Wang and Rui in [30] constructed a stabilized CR element for the Darcy–Forchheimer model. Rui and Liu in [23] introduced a two-grid block-centered finite difference method for the Darcy–Forchheimer model. Salas et al. in [27] presented a theoretical study of the mixed finite element method proposed in [17], and showed the well-posedness and convergence.

Most of work mentioned above mainly focus on the discretization of the Darcy–Forchheimer model. Except the PR iteration presented in [15], no other work concentrates on fast solvers of the discretized nonlinear saddle point system which will be the topic of this paper. Multigrid method is one of the most efficient methods on solving the linear and nonlinear elliptic systems. It should be clarified that for nonlinear problems we no longer have a simple linear residual equation, which is the most significant difference between linear and nonlinear systems. The multigrid scheme we used here is the most commonly used nonlinear version of multigrid. It is called the full approximation scheme (FAS) [9] because the problem in the coarse grid is solved for the full approximation rather than the correction; see Sect. 5 for details.

We shall use piecewise constant (P0) and continuous piecewise linear polynomial (P1) to discretize the velocity and the pressure, respectively. We refer to [27] for the convergence analysis of this scheme and focus on fast solvers in our study. We shall apply FAS to construct an efficient V-cycle multigrid method for the nonlinear Darcy–Forchheimer model and demonstrate the efficiency of our multigrid method. Similar application of FAS to a nonlinear saddle point system (for Cahn–Hillard type equations) can be found in [4, 31]. Recall that the success of multigrid method relies on two ingredients: the high frequency can be damped efficiently by the smoother, and the low frequency can be well approximated by the coarse grid correction. Notice that for saddle point systems, both smoothing and coarse grid corrections can easily violate the constraint [11]. The main difficulty of developing robust and effective multigrid methods for the saddle point system is to design an effective smoother with the consideration of the constraint \(\text {div}\, {\varvec{u}} = g\). We shall use the Peaceman–Rachford iteration developed in [15] as a smoother since the nonlinearity can be handled efficiently and the constraint is always satisfied after solving a linear saddle point system. To enforce the constraint after the coarse grid correction, we also project the correction into the divergence free subspace. This is in the sprit of the B-S smoother developed in [7] for the Stokes equation except here we are dealing with a harder nonlinear equation instead of a linear Stokes equation.

The most relevant work is [17] and our improvement are:

  1. 1.

    We reduce the linear saddle point system into a SPD system and demonstrate the efficiency of our approach.

  2. 2.

    We report a better choice of the splitting parameter \(\alpha \) for decoupling the nonlinearity from the constraint rather than the suggested value \( \alpha = 1\) in [17] for different values of the Forchheimer number \( \beta \), and show the advantage of our choice by comparing the number of iterations and CPU time.

  3. 3.

    We carry out some experiments to show the efficiency of our multigrid solver. Our method is convergent with a rate independent of the mesh size and the Forchheimer number and with a nearly linear computational cost. Notice that it is not easy to construct a fast solver robust to a critical parameter, see, for example, a linear Stokes-type equation [18, 19].

The remainder of this article is organized as follows: The model problem is demonstrated in Sect. 2. The mixed weak formulation and the discrete weak formulation are presented in Sect. 3. The PR iteration and an efficient solver for the linear saddle point systems are posted in Sect. 4. A V-cycle multigrid scheme by applying FAS for the nonlinear problem is constructed in Sect. 5. Some numerical experiments using our multigrid method are carried out in Sect. 6 to verify that the efficiency of our method in comparison with solving this nonliear problem using the other iterative methods. Finally, conclusions and further ideas are presented in Sect. 7.

2 The Problem and Notation

We consider the steady Darcy–Forchheimer flow of a single phase fluid in a porous medium in a two-dimensional bounded domain \( \Omega \), with Lipschitz continuous boundary \(\partial \Omega \):

$$\begin{aligned} \frac{\mu }{\rho }{{\varvec{K}}^{ - 1}}{\varvec{u}} + \frac{\beta }{\rho }\left| {\varvec{u}} \right| {\varvec{u}} + \nabla p = {\varvec{f}}\quad \text {in}\,\,\Omega , \end{aligned}$$
(2.1)

with the divergence constraint

$$\begin{aligned} \text {div}\, {\varvec{u}} = g\quad \text {in}\,\,\Omega , \end{aligned}$$
(2.2)

and Neumann boundary condition,

$$\begin{aligned} {\varvec{u}} \cdot {\varvec{n}} = {g_N}\quad \text {on}\,\,{\partial \Omega }, \end{aligned}$$
(2.3)

where \( {\varvec{u}} \) and p are the velocity vector and the pressure, respectively; \( \mu \), \( \rho \) and \( \beta \) are given positive constants that represent the viscosity of the fluid, its density and its dynamic viscosity, respectively; \( \vert \cdot \vert \) denotes the Euclidean vector norm \( \left| {\varvec{u}} \right| ^2 = {\varvec{u}} \cdot {\varvec{u}} \), \( {\varvec{n}} \) is the unit exterior normal vector to the boundary of the given domain \( \Omega \); \( {\varvec{K}} \) is the permeability tensor, assumed to be uniformly positive definite and bounded. According to the divergence theorem, g and \( g_N \) are given functions satisfying the compatibility condition

$$\begin{aligned} \int _\Omega {g\left( {\varvec{x}} \right) } \, \mathrm{d}{\varvec{x}} = \int _{\partial \Omega } {{g_N}\left( \sigma \right) } \, \mathrm{d}\sigma . \end{aligned}$$
(2.4)

We use the standard notation of the Sobolev spaces and the associated norms, see e.g.[1].

3 The Weak Formulation

Following [15], we define the function spaces as follows:

$$\begin{aligned} X= & {} {L^3}{\left( \Omega \right) ^2},\\ M= & {} W^{1,{3\over 2}}{\left( \Omega \right) } \cap L_0^2\left( \Omega \right) , \end{aligned}$$

where the zero mean value condition

$$\begin{aligned} L_0^2\left( \Omega \right) = \left\{ {v \in {L^2}\left( \Omega \right) :\int _\Omega {v\left( {\varvec{x}} \right) \, \mathrm{d}{\varvec{x}} = 0} } \right\} , \end{aligned}$$

is added because p is only defined by (2.1)–(2.3) up to an additive constant. Given \( {\varvec{f}} \in {L^3}{\left( \Omega \right) ^2}\), \( g\in {L^{\frac{6}{5}}}{\left( \Omega \right) }\), and \( g_N \in {L^{\frac{{3}}{{2}}}}\left( {\partial \Omega } \right) \), the variational formulation of (2.1)–(2.3) is: find a pair \( \left( {\varvec{u}},p \right) \) in \( X \times M \) such that

$$\begin{aligned}&\frac{\mu }{\rho }\int _\Omega {\left( {{{\varvec{K}}^{ - 1}}{\varvec{u}} } \right) \cdot {\varvec{\varphi }} \, \, \mathrm{d}{\varvec{x}}} + \frac{\beta }{\rho }\int _\Omega {\left| {\varvec{u}} \right| \left( {{\varvec{u}} \cdot {\varvec{\varphi }} } \right) \, \mathrm{d}{\varvec{x}}} \nonumber \\&\quad + \int _\Omega {\nabla p \cdot {\varvec{\varphi }} \, \, \mathrm{d}{\varvec{x}}} = \int _\Omega {{\varvec{f}} \cdot {\varvec{\varphi }} \,\, \mathrm{d}{\varvec{x}}} , \quad \forall {\varvec{\varphi }} \in X, \end{aligned}$$
(3.1)
$$\begin{aligned}&\int _\Omega {\nabla q \cdot {\varvec{u}}\,\, \mathrm{d}{\varvec{x}}} = - \int _\Omega {gq \,\, \mathrm{d}{\varvec{x}}} + \int _{\partial \Omega } {{g_N}q\, \, \mathrm{d}{\varvec{x}}} ,\quad \forall q \in M. \end{aligned}$$
(3.2)

The variational formulation (3.1)–(3.2) and the original problem (2.1)–(2.3) are equivalent by using the Green’s formula:

$$\begin{aligned} \int _\Omega {{\varvec{v}} \cdot \nabla q\,dx} = - \int _\Omega {q \, {\mathrm {div}} \, {\varvec{v}}\,dx} + {\left\langle {q,{\varvec{v}} \cdot {\varvec{n}}} \right\rangle _{\partial \Omega }},\quad \forall q \in M,\forall {\varvec{v}} \in H, \end{aligned}$$
(3.3)

where

$$\begin{aligned} H = \left\{ {{\varvec{v}} \in {L^3}{{\left( \Omega \right) }^2}:{\mathrm {div}}\,{\varvec{v}} \in { L ^{\frac{{6}}{{5}}}}\left( \Omega \right) } \right\} . \end{aligned}$$

In [15], Girault and Wheeler showed that if the given functions g and \( g_N \) satisfy the compatibility condition (2.4), then the problem has a unique solution \( \left( {\varvec{u}},p \right) \) in \( X \times M \).

Let \(\Omega \) be a polygon in two dimensions which can be completely triangulated by triangles. Let \( {\mathcal {T}}_{1} \) be a triangulation of \( \Omega \), and the triangulations \( {\mathcal {T}}_{{ k }} \left( k = 2,3,\ldots \right) \) be obtained form \( {\mathcal {T}}_{1} \) via regular subdivision, i.e. edge midpoints in \( {\mathcal {T}}_{k-1} \) are connected by new edges to form \( {\mathcal {T}}_{{ k }} \). Therefore, \( {\mathcal {T}}_{k} \) is a family of conforming triangulations of \( {\overline{\Omega }} \),

$$\begin{aligned} {\overline{\Omega }} = \bigcup \limits _{T \in {\mathcal {T}}_{{ k }}} T\quad \text {for}\;k = 1,2,3, \ldots , \end{aligned}$$

The family \( {\mathcal {T}}_{{ k }} \) is shape regular in the sense of Ciarlet [13].

We discretize \({\varvec{u}}\) and p in different finite element spaces. The velocity \({\varvec{u}}\) is approximated in the following space:

$$\begin{aligned} {X_k} = \left\{ {{\varvec{v}} \in {L^2}{{\left( \Omega \right) }^2}:\forall T \in {\mathcal {T}}_{{ k }},{\varvec{v}}{\vert _{ T }} \in {\mathbb {P}}^{\mathrm {2}}_{\mathrm {0}}} \right\} , \end{aligned}$$
(3.4)

and the pressure p is approximated in the following space:

$$\begin{aligned} {M_k} = {Q_k} \cap L_0^2\left( \Omega \right) , \end{aligned}$$
(3.5)

where \({\mathbb {P}}_{ m }\) denotes the space of polynomials of degree m, and \( Q_k \) is the linear finite element space

$$\begin{aligned} {Q_k} = \left\{ {{q} \in {C^0}{{\left( {\bar{\Omega }} \right) }}:\forall T \in {\mathcal {T}}_{{ k }}, q {\vert _{ T }} \in {\mathbb {P}}_{\mathrm {1}}}\right\} . \end{aligned}$$

With these spaces, we can have the k-th level discrete formulation of the problem (3.1)–(3.2):

$$\begin{aligned}&\frac{\mu }{\rho }\int _\Omega {\left( {{{\varvec{K}}^{ - 1}}{\varvec{u}}_{k} } \right) \cdot {\varvec{\varphi }}_{k} \, \, \mathrm{d}{\varvec{x}}}+ \frac{\beta }{\rho }\int _\Omega {\left| {\varvec{u}}_{k} \right| \left( {{\varvec{u}}_{k} \cdot {\varvec{\varphi }}_{k} } \right) \, \mathrm{d}{\varvec{x}}}\nonumber \\&\quad + \sum \limits _{ T \in {\mathcal {T}}_{{ k }}} { \int _T {\nabla p_{k} \cdot {\varvec{\varphi }}_{k} \, \, \mathrm{d}{\varvec{x}}}} = \int _\Omega {{\varvec{f}} \cdot {\varvec{\varphi }}_{k} \,\, \mathrm{d}{\varvec{x}}} ,\quad \forall {\varvec{\varphi }}_{k} \in X_{k}, \end{aligned}$$
(3.6)
$$\begin{aligned}&\sum \limits _{ T \in {\mathcal {T}}_{{ k }}} {\int _T {\nabla q_{k} \cdot {\varvec{u}}_{k}\,\, \mathrm{d}{\varvec{x}}}} = - \int _\Omega {gq_{k} \,\, \mathrm{d}{\varvec{x}}} + \int _{\partial \Omega } {{g_N}q_{k}\, \, \mathrm{d}{\varvec{x}}} ,\quad \forall q_{k} \in M_{k}. \end{aligned}$$
(3.7)

By our construction,

$$\begin{aligned} h_{k - 1} = 2h_{k}, \quad \text {for} \; k = 2,3,\ldots . \end{aligned}$$

Note that \( {\mathcal {T}}_{{ k }} \) are nested meshes, and thus

$$\begin{aligned} X_{k - 1} \subset X_{k}, \quad M_{k - 1} \subset M_{k}. \end{aligned}$$

In [27], the authors demonstrated that the discrete problem (3.6)–(3.7) has a unique solution. Moreover, if \({\mathcal {T}}_{ h }\) is shape regular with mesh size h and the solution \({\varvec{u}}\) belongs to \({W^{1,4}}{\left( \Omega \right) }\) and p belongs to \({W^{2,\frac{3}{2}}}{\left( \Omega \right) }\), then the following error estimations are obtained in [27, Theorem 4.10]:

$$\begin{aligned} \left\| {{\varvec{u}} - {\varvec{u}}_{h}} \right\| _{{L^{2}}\left( \Omega \right) }\le & {} Ch{\left| {\varvec{u}} \right| _{{W^{1,4}}\left( \Omega \right) }}, \end{aligned}$$
(3.8)
$$\begin{aligned} {{\left\| {\nabla \left( {p - {p_h}} \right) } \right\| } _{{L^{\frac{3}{2}}}\left( T \right) }}\le & {} Ch\left( {{{\left| p \right| }_{{W^{2,\frac{3}{2}}}\left( \Omega \right) }} + {{\left\| {\varvec{u}} \right\| }_{{W^{1,4}}\left( \Omega \right) }}} \right) . \end{aligned}$$
(3.9)

4 A Nonlinear Iteration

In this section, we present the Peaceman–Rachford (PR) iterative method developed in [15] to decouple the nonlinearity and the constraint.

First, choose an initial guess \(\left( {{\varvec{u}}_{k}^{0},{p_k^0}} \right) \) by solving a linear Darcy system:

$$\begin{aligned}&\frac{\mu }{\rho }\int _\Omega {\left( {{{\varvec{K}}^{ - 1}}{\varvec{u}}_{k}^0 } \right) \cdot {\varvec{\varphi }}_{k} \, \, \mathrm{d}{\varvec{x}}} + \sum \limits _{ T \in {\mathcal {T}}_{{ k }}} { \int _T {\nabla p_{k}^0 \cdot {\varvec{\varphi }}_{k} \, \, \mathrm{d}{\varvec{x}}}} = \int _\Omega {{\varvec{f}} \cdot {\varvec{\varphi }}_{k} \,\, \mathrm{d}{\varvec{x}}} ,\,\,\forall {\varvec{\varphi }}_{k} \in X_{k}, \end{aligned}$$
(4.1)
$$\begin{aligned}&\sum \limits _{ T \in {\mathcal {T}}_{{ k }}} {\int _T {\nabla q_{k} \cdot {\varvec{u}}_{k}^0\,\, \mathrm{d}{\varvec{x}}}} = - \int _\Omega {gq_{k} \,\, \mathrm{d}{\varvec{x}}} + \int _{\partial \Omega } {{g_N}q_{k}\, \, \mathrm{d}{\varvec{x}}} ,\,\,\forall q_{k} \in M_{k}. \end{aligned}$$
(4.2)

The linear Darcy system (4.1)–(4.2) can be rewritten in the matrix form as

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} A&{}B\\ {{B^T}}&{}0 \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {\varvec{u}}\\ p \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} \varvec{f_{d}}\\ w \end{array}} \right] , \end{aligned}$$
(4.3)

where A is the symmetric and positive definite matrix associated to the term

$$\begin{aligned} \frac{\mu }{\rho }\int _\Omega {\left( {{{\varvec{K}}^{ - 1}}{\varvec{u}}_{k} } \right) \cdot {\varvec{\varphi }}_{k} \, \, \mathrm{d}{\varvec{x}}}, \end{aligned}$$

B is the matrix corresponding to

$$\begin{aligned} \sum \limits _{ T \in {\mathcal {T}}_{{ k }}} { \int _T {\nabla p_{k} \cdot {\varvec{\varphi }}_{k} \, \, \mathrm{d}{\varvec{x}}}}, \end{aligned}$$

and \(\varvec{f_{d}}\) and w represent the right hand side of (4.1) and (4.2), respectively.

Then, knowing \(\left( {{\varvec{u}}_{k}^{0},{p_k^0}} \right) \), construct a sequence \(\left( {{\varvec{u}}_{k}^{n + 1},{p_k^{n + 1}}} \right) \) for \(n\ge 0\) in two steps. Let \(\alpha \) be a positive parameter chosen to enhance the convergence.

1. A nonlinear step without constraint: knowing \(\left( {{\varvec{u}}_{k}^{n},{p_k^n}} \right) \) compute the intermediate velocity \({\varvec{u}}_k^{n+\frac{1}{2}}\) by solving the following equation:

$$\begin{aligned}&\frac{1}{\alpha }\int _\Omega {\left( {{\varvec{u}}_k^{n + \frac{1}{2}} - {\varvec{u}}_k^n} \right) } \cdot {{\varvec{\varphi }} _k}\, \mathrm{d}{\varvec{x}} + \frac{\beta }{\rho }\int _\Omega {\left| {{\varvec{u}}_k^{n + \frac{1}{2}}} \right| } \left( {{\varvec{u}}_k^{n + \frac{1}{2}} \cdot {{\varvec{\varphi }} _k}} \right) \, \mathrm{d}{\varvec{x}} = \int _\Omega {{\varvec{f}} \cdot {\varvec{\varphi }}_{k} \,\, \mathrm{d}{\varvec{x}}} \nonumber \\&\quad - \frac{\mu }{\rho }\int _\Omega {\left( {{{\varvec{K}}^{ - 1}}{\varvec{u}}_k^n} \right) } \cdot {{\varvec{\varphi }} _k}\, \mathrm{d}{\varvec{x}} - \sum \limits _{ T \in {\mathcal {T}}_{{ k }}} { \int _T {\nabla p_{k}^{n} \cdot {\varvec{\varphi }}_{k} \, \, \mathrm{d}{\varvec{x}}}} ,\quad \forall {\varvec{\varphi }}_{k} \in X_{k}. \end{aligned}$$
(4.4)

2. A linear step with constraint: compute \(\left( {{\varvec{u}}_{k}^{n+1},{p_k^{n+1}}} \right) \) with the known \({\varvec{u}}_k^{n+\frac{1}{2}}\)

$$\begin{aligned}&\frac{1}{\alpha }\int _\Omega {\left( {{\varvec{u}}_k^{n + 1} - {\varvec{u}}_k^{n + \frac{1}{2}}} \right) } \cdot {{\varvec{\varphi }} _k}\, \mathrm{d}{\varvec{x}} + \frac{\mu }{\rho }\int _\Omega {\left( {{{\varvec{K}}^{ - 1}}{\varvec{u}}_k^{n+1}} \right) } \cdot {{\varvec{\varphi }} _k}\, \mathrm{d}{\varvec{x}} + \sum \limits _{ T \in {\mathcal {T}}_{{ k }}} { \int _T {\nabla p_{k}^{n+1} \cdot {\varvec{\varphi }}_{k} \, \, \mathrm{d}{\varvec{x}}}} \nonumber \\&\quad = \int _\Omega {{\varvec{f}} \cdot {\varvec{\varphi }}_{k} \,\, \mathrm{d}{\varvec{x}}} - \frac{\beta }{\rho }\int _\Omega {\left| {{\varvec{u}}_k^{n + \frac{1}{2}}} \right| } \left( {{\varvec{u}}_k^{n + \frac{1}{2}} \cdot {{\varvec{\varphi }} _k}} \right) \, \mathrm{d}{\varvec{x}} ,\quad \forall {\varvec{\varphi }}_{k} \in X_{k}, \end{aligned}$$
(4.5)
$$\begin{aligned}&\sum \limits _{ T \in {\mathcal {T}}_{{ k }}} {\int _T {\nabla q_{k} \cdot {\varvec{u}}_{k}^{n+1}\,\, \mathrm{d}{\varvec{x}}}} = - \int _\Omega {gq_{k} \,\, \mathrm{d}{\varvec{x}}} + \int _{\partial \Omega } {{g_N}q_{k}\, \, \mathrm{d}{\varvec{x}}} ,\quad \forall q_{k} \in M_{k}. \end{aligned}$$
(4.6)

A key observation in [15] is that because the test functions \({{\varvec{\varphi }} _k}\), the solution \( {\varvec{u}}_k^{n + \frac{1}{2}} \), and \( \nabla p_{k}^{n} \) are constant in each element T, the nonlinear step (4.4) can be solved in a closed-form:

$$\begin{aligned} {\varvec{u}}_T^{n + \frac{1}{2}} = \frac{1}{\gamma } \varvec{F}_T^{n + \frac{1}{2}} \end{aligned}$$
(4.7)

where

$$\begin{aligned} \varvec{F}_T^{n + \frac{1}{2}}&= \frac{1}{\alpha }{\varvec{u}}_T^{n} - \frac{\mu }{\rho }{\varvec{K}}_T^{-1}{\varvec{u}}_T^{n} - \nabla _T p_{k}^{n} + {\varvec{f}}_T, \\ {\varvec{K}}_T^{-1}&= \frac{1}{\left| {T} \right| } \int _T {{\varvec{K}}^{-1}\left( {\varvec{x}}\right) \, \mathrm{d}{\varvec{x}}}, \\ \gamma&= \frac{1}{2\alpha } + \frac{1}{2}\sqrt{\frac{1}{\alpha ^2} + 4\frac{\beta }{\rho }\left| {\varvec{F}_T^{n + \frac{1}{2}}} \right| }. \end{aligned}$$

In the second step, the linear system (4.5)-(4.6) can be rewritten in the following matrix form:

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {A_{\alpha } }&{}B\\ {{B^T}}&{}0 \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {\varvec{u}}\\ p \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} \varvec{f_{n + \frac{1}{2}}}\\ w \end{array}} \right] , \end{aligned}$$
(4.8)

where \(A_{\alpha }\) is the matrix corresponding to the bilinear form

$$\begin{aligned} \frac{1}{\alpha }\int _\Omega {\left( {{\varvec{u}}_k^{n + 1}} \right) } \cdot {{\varvec{\varphi }} _k}\, \mathrm{d}{\varvec{x}} + \frac{\mu }{\rho }\int _\Omega {\left( {{{\varvec{K}}^{ - 1}}{\varvec{u}}_k^{n + 1}} \right) } \cdot {{\varvec{\varphi }} _k}\, \mathrm{d}{\varvec{x}}, \end{aligned}$$

and \(\varvec{f_{n + \frac{1}{2}}}\) is the vector corresponding to

$$\begin{aligned} \int _\Omega {{\varvec{f}} \cdot {\varvec{\varphi }}_{k} \,\, \mathrm{d}{\varvec{x}}} + \frac{1}{\alpha }\int _\Omega {\left( {{\varvec{u}}_k^{n + \frac{1}{2}}} \right) } \cdot {{\varvec{\varphi }} _k}\, \mathrm{d}{\varvec{x}} - \frac{\beta }{\rho }\int _\Omega {\left| {{\varvec{u}}_k^{n + \frac{1}{2}}} \right| } \left( {{\varvec{u}}_k^{n + \frac{1}{2}} \cdot {{\varvec{\varphi }} _k}} \right) \, \mathrm{d}{\varvec{x}}. \end{aligned}$$

In [15], the authors proved that (4.1)–(4.2) and (4.5)–(4.6) have a unique solution. The PR iterative method is convergent for an arbitrary choice of the initial guess \(\left( {{\varvec{u}}_{k}^{0},{p_k^0}} \right) \) and an arbitrary positive \( \alpha \). Numerically, different choices of \(\alpha \) will affect the convergence rate of the nonlinear iteration. We shall report a choice of \(\alpha \) in Sect. 6.

We can reduce the linear saddle point system into a SPD system when we implement the PR iteration. Because of A and \( A_{\alpha } \) are symmetric positive definite operators, without loss of generality, we take (4.8) as an example to expound an idea as follows.

Eliminate \( {\varvec{u}} \) from the first equation of (4.8), i.e.

$$\begin{aligned} {\varvec{u}} = A_{\alpha }^{-1}\left( \varvec{f_{n + \frac{1}{2}}} - Bp \right) , \end{aligned}$$
(4.9)

and then, substituting to the second equation of (4.8), we get

$$\begin{aligned} Sp = b, \end{aligned}$$
(4.10)

where \( S = B^{T}A_{\alpha }^{-1}B\), and \(b = B^{T}A_{\alpha }^{-1}\varvec{f_{n + \frac{1}{2}}} - w \). After solving (4.10), we can get \( {\varvec{u}} \) by solving (4.9).

Since \( A_{\alpha } \) is block-diagonal, \( A_{\alpha }^{-1} \) can be formed easily. Indeed Eq. (4.10) is the linear finite element discretization of an elliptic equation in the primary formulation. The equivalence between (4.9)–(4.10) and (4.8) is obvious. Solving the SPD system (4.10) is much easier than the saddle point system (4.8) and many fast solvers are available. In our numerical experiments, we use the direct solver built in \(\mathrm{MATLAB}^{\copyright }\) to solve (4.10). We could also use the multigrid solver, but due to the relative-small size of the linear SPD system we have tested, the direct solver is faster.

In the continuous level, the Darcy–Forchheimer equation can be rewritten into a nonlinear primary formulation. For simplicity, we assume that the permeability is a scalar. Taking the norm of Eq. (2.1), we obtain

$$\begin{aligned} \frac{\beta }{\rho }{\left| {\varvec{u}} \right| }^2 + \frac{\mu }{\rho K}\left| {\varvec{u}} \right| - \left| \nabla p - {\varvec{f}}\right| = 0, \end{aligned}$$

and can solve for \(\left| {\varvec{u}} \right| \)

$$\begin{aligned} \left| {\varvec{u}} \right| = \frac{-\frac{\mu }{\rho K} + \sqrt{\left( {\frac{\mu }{\rho K}}\right) ^2 + 4\frac{\beta }{\rho }\left| \nabla p - {\varvec{f}}\right| }}{2\frac{\beta }{\rho }}. \end{aligned}$$

and consequently \({\varvec{u}}\)

$$\begin{aligned} {\varvec{u}} = -\frac{\nabla p - {\varvec{f}}}{\frac{\mu }{\rho K} + \frac{\beta }{\rho }\left| {\varvec{u}} \right| } = -\frac{2\left( \nabla p - {\varvec{f}}\right) }{\frac{\mu }{\rho K} + \sqrt{\left( {\frac{\mu }{\rho K}}\right) ^2 + 4\frac{\beta }{\rho }\left| \nabla p - {\varvec{f}}\right| }}. \end{aligned}$$

Then substituting back to (2.2), we get the primary formulation of pressure p only

$$\begin{aligned} -\nabla \cdot \left( \frac{2\left( \nabla p - {\varvec{f}}\right) }{\frac{\mu }{\rho K} + \sqrt{\left( {\frac{\mu }{\rho K}}\right) ^2 + 4\frac{\beta }{\rho }\left| \nabla p - {\varvec{f}}\right| }}\right) = g. \end{aligned}$$
(4.11)

Its well-posedness can be found in [20].

In the discretization level, we could also eliminate the piecewise constant velocity and obtain an equivalent \(P_1\) discretization of (4.11). However, we only eliminate \( {\varvec{u}} \) of the linear system (4.8) in the PR iteration rather than that of the nonlinear equation (3.6) because we still need to solve the resulting nonlinear equation. The PR iteration corresponds to a variant of Picard iteration for solving (4.11). We stick to the mixed formulation as the convergence of the PR iteration has been rigorously proved in [15].

5 A Non-linear Multigrid Algorithm

In this section, we consider a generic system of nonlinear equations,

$$\begin{aligned} {\mathcal {L}}\left( {\varvec{z}} \right) = \varvec{s} \end{aligned}$$

where \( {\varvec{z}}, \varvec{s} \in \varvec{R^{n}}\). Suppose that \( {\varvec{v}} \) is an approximation to the exact solution \( {\varvec{z}} \). Define the error \({\varvec{e}} \) and the residual \( {\varvec{r}} \):

$$\begin{aligned} {\varvec{e}}&= {\varvec{z}} - {\varvec{v}}, \\ {\varvec{r}}&= \varvec{s} - {\mathcal {L}}\left( {\varvec{v}}\right) . \end{aligned}$$

Quantities in the k-th level will be denoted by a subscript k.

Because of the iterative nature, multigrid ideas should be effective on the nonlinear problem. The multigrid scheme here we used for this nonlinear problem is the most commonly used nonlinear version of multigrid. It is called the full approximation scheme (FAS) [9] because the problem in the coarse grid is solved for the full approximation \( {{\varvec{z}}_{k-1}} = I_{k}^{k-1}{\varvec{v}}_{k} + {\varvec{e}}_{k-1} \) rather than the error \({\varvec{e}}_{k-1}\). A two-level FAS is described as follows.

Full Approximation Scheme (FAS)

  1. 1.

    Pre-smoothing: For \( 1 \le j \le m, \) relax m times with an initial guess \( {\varvec{v}}^{0} \) by \( {{\varvec{v}}^j} = {R_k}{{\varvec{v}}^{j - 1}} \). The current approximation \( {{\varvec{v}}_k} = {{\varvec{v}}^m}\).

  2. 2.

    Restrict the current approximation and its fine grid residual to the coarse grid: \( {\varvec{r}}_{k-1} = I_{k}^{k-1}\left( \varvec{s}_{k} - {\mathcal {L}}_{k}\left( {\varvec{v}}_{k}\right) \right) \) and \( {\varvec{v}}_{k-1} = I_{k}^{k-1}{\varvec{v}}_{k} \).

  3. 3.

    Solve the coarse grid problem: \( {\mathcal {L}}_{k-1}\left( {\varvec{z}}_{k-1}\right) = {\mathcal {L}}_{k-1}\left( {\varvec{v}}_{k-1}\right) + {\varvec{r}}_{k-1} \).

  4. 4.

    Compute the coarse grid approximation to the error: \( {\varvec{e}}_{k-1} = {\varvec{z}}_{k-1} - {\varvec{v}}_{k-1} \).

  5. 5.

    Interpolate the error approximation up to the fine grid and correct the current fine grid approximation: \( {\varvec{v}}_{m + 1} \leftarrow {\varvec{v}}_{k} + I_{k-1}^{k}{\varvec{e}}_{k-1}\).

  6. 6.

    Post-smoothing: For \( {m + 2} \le j \le {2m + 1}, \) relax m times by \( {{\varvec{v}}^j} = {R_k}'{{\varvec{v}}^{j - 1}} \).

then we get the approximate solution \( {{\varvec{v}}^{2m + 1}} \). Here m denotes the number of pre-smoothing and post-smoothing steps, \( {R_k} \) denotes the chosen relaxation method, and \( I_{k}^{k-1} \) is an intergrid transfer operator from the fine grid to the coarse grid. As usual, the V-cycle will be obtained by applying the two-level FAS to solve the coarse grid equation in Step 3.

We choose the PR iteration (4.4)–(4.6) as the smoother \({R_k}\) and the nonlinear solver in the coarsest grid. We switch the ordering of the linear and nonlinear steps of the PR iteration in the post-smoothing step in order to keep the symmetry of the V-cycle. It is worth pointing out that although the chosen finite element spaces are nested, the constrained subspaces are non-nested when we interpolated the correction of the velocity, which was obtained in the coarser space, to the finer space. Namely, if we directly interpolated the correction obtained on the coarser grid to the finer grid, the approximation we got may not satisfy the divergence equation in this Darcy–Forchheimer model. Therefore we construct a weighted \( L^2 \) projection to map the correction obtained before into the constrained space in the fine grid which can be realized by solving a saddle point system:

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {A_{\varvec{\delta }} }&{}B\\ {{B^T}}&{}0 \end{array}} \right] \left[ {\begin{array}{*{20}{c}} \varvec{\delta }\\ \theta \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} \varvec{0}\\ B^T\varvec{e_{u}} \end{array}} \right] , \end{aligned}$$
(5.1)

where \( A_{\varvec{\delta }} \) is the matrix corresponding to

$$\begin{aligned} \frac{\mu }{\rho }\int _\Omega {\left( {{{\varvec{K}}^{ - 1}}\varvec{\delta } } \right) \cdot {\varvec{\varphi }}_{k} \, \, \mathrm{d}{\varvec{x}}} + \frac{\beta }{\rho }\int _\Omega {\left| \varvec{\delta } \right| \left( {\varvec{\delta } \cdot {\varvec{\varphi }}_{k} } \right) \, \mathrm{d}{\varvec{x}}}, \end{aligned}$$

\( \varvec{\delta }, \theta \) represent the error between the restriction of the approximation of velocity and pressure on the finer grid and their approximation obtained on the coarser grid, respectively, and \( \varvec{e_{u}} \) is the prolonged correction to the fine space. For non-nested constrained subspaces, an additional projector is usually needed to preserve the constraint [7].

Again, (5.1) can be reduced to a SPD system. We can get \( \varvec{\delta } = A_{\varvec{\delta }}^{-1}B\theta \) through the idea demonstrated in Sect. 4. Then we obtain a corrected approximation of velocity \( {\varvec{v}} = {\varvec{v}} - \varvec{\delta }\), which satisfies the divergence equation.

Remark 1

When RT or BDM element is used to discretize the velocity and the pressure is piecewise constant, we may use patch-wise smoothers designed for \(H(\mathrm{div})\) problems; see [2, 3]. The constraint can be preserved in these smoothers. A rigorous proof for the convergence of a multigrid method using constrained smoothers for linear saddle point systems can be found in [11, 12]. Note that in this paper, we consider continuous pressure discretization and nonlinear saddle point systems and thus neither the constrained smoother nor the convergence proof can be applied. \(\square \)

A convergence proof of a variant of FAS for a class of monotone nonlinear elliptic problems is given by Hackbusch in [16] and Reusken in [22]. They proved convergence by linearising the FAS iteration and used the convergence theory for linear two-grid methods for symmetric elliptic problems as in [6]. Their proof was rigorous but requiring restrictive assumptions (the initial guess is close enough to the solution). Tai and Xu in [28, 29] gave some uniform convergence estimates for a class of subspace correction methods applied to some nonlinear unconstrained and constraint convex optimization problems. But their methods is built upon nested finite element spaces and slightly expensive than FAS. Yavneh and Dardyk in [32] employed a simplified scalar analogy to provide an insight to the reason why FAS works but a rigorous proof is lacking. None of these theoretical work can be applied directly to our problem. We are investigating the convergence theory of FAS in different perspectives and will report our finding somewhere else.

6 Numerical Experiments

In this section, some numerical results are presented to illustrate the efficiency of our multigrid method for the Darcy–Forchheimer model (2.1)–(2.3). The following test problems are taken from [17]. All of our experiments are implemented based on the \(\mathrm {MATLAB}^{\copyright }\) software package iFEM [10]. They were run on a laptop with an Inter i7-4720HQ 2.60 GHz CPU and 16.0 GB RAM.

We choose \(\mu = 1\), \( \rho = 1\), \( {\varvec{K}} = \varvec{I}\), and \(\Omega \subset {\mathbb {R}}^{\mathrm {2}}\) as the square \((-1,1)^2\). We use the uniform triangulation of \(\Omega \).

  • Problem 1:

    $$\begin{aligned} {\varvec{u}}\left( {x,y}\right)= & {} \left[ {x+y,x-y}\right] ^{T},\\ p\left( {x,y}\right)= & {} x^3 + y^3,\\ {\varvec{f}}\left( {x,y} \right)= & {} \begin{bmatrix} {\left( {1 + \beta \sqrt{2{x^2} + 2{y^2}} } \right) \left( {x + y} \right) + 3{x^2}}\\ {\left( {1 + \beta \sqrt{2{x^2} + 2{y^2}} } \right) \left( {x - y} \right) + 3{y^2}} \end{bmatrix},\\ g_N\left( {x,y}\right)= & {} {\left\{ \begin{array}{ll} 1 + y, &{} \quad x = 1,\\ 1 - y, &{} \quad x = -1,\\ x - 1, &{} \quad y = 1, \\ -x - 1, &{} \quad y = -1.\\ \end{array}\right. } \end{aligned}$$
  • Problem 2:

    $$\begin{aligned} {\varvec{u}}\left( {x,y}\right)= & {} \left[ {\frac{\left( x + 1 \right) ^2}{4},-\frac{\left( x + 1 \right) \left( y + 1\right) }{2}}\right] ^{T},\\ p\left( {x,y}\right)= & {} x^3 + y^3,\\ {\varvec{f}}\left( {x,y} \right)= & {} \begin{bmatrix} {\quad \quad \,\,\frac{{{{\left( {x + 1} \right) }^2}}}{4}\left( {1 + \beta \frac{{\left( {x + 1} \right) }}{4}\sqrt{{{\left( {x + 1} \right) }^2} + 4{{\left( {y + 1} \right) }^2}} } \right) + 3{x^2}}\\ { - \frac{{\left( {x + 1} \right) \left( {y + 1} \right) }}{2}\left( {1 + \beta \frac{{\left( {x + 1} \right) }}{4}\sqrt{{{\left( {x + 1} \right) }^2} + 4{{\left( {y + 1} \right) }^2}} } \right) + 3{y^2}} \end{bmatrix},\\ g_N\left( {x,y}\right)= & {} {\left\{ \begin{array}{ll} 1, &{} \quad x = 1,\\ 0, &{} \quad x = -1,\\ -x - 1, &{} \quad y = 1, \\ 0, &{} \quad y = -1.\\ \end{array}\right. } \end{aligned}$$

Numerically Problem 2 is harder to solve. Probably it is due to the fact that the initial guess, which is obtained by solving a linear Darcy system, is further away from the true solution.

For all above test problems, \( g = 0 \). The chosen termination criterion is

$$\begin{aligned} r = r_{u} + r_{p} \le tol , \end{aligned}$$

where

$$\begin{aligned} r_{u}= & {} {\left\{ \begin{array}{ll} \left\| {\varvec{f}} - \frac{\mu }{\rho }{{\varvec{K}}^{ - 1}}{\varvec{u}}_h^n + \frac{\beta }{\rho }\left| {\varvec{u}}_h^n \right| {\varvec{u}}_h^n + \nabla p_h^n \right\| /\left\| {\varvec{f}} \right\| , &{}\quad \text { when } \left\| {\varvec{f}} \right\| \ne 0,\\ \left\| {\varvec{f}} - \frac{\mu }{\rho }{{\varvec{K}}^{ - 1}}{\varvec{u}}_h^n + \frac{\beta }{\rho }\left| {\varvec{u}}_h^n \right| {\varvec{u}}_h^n + \nabla p_h^n \right\| , &{}\quad \text { when } \left\| {\varvec{f}} \right\| = 0. \end{array}\right. }\\ r_{p}= & {} {\left\{ \begin{array}{ll} \left\| g - \text {div}{\varvec{u}}_h^n \right\| /\left\| g \right\| , &{}\quad \text { when } \left\| g \right\| \ne 0,\\ \left\| g - \text {div}{\varvec{u}}_h^n \right\| , &{}\quad \text { when } \left\| g \right\| = 0.\\ \end{array}\right. } \end{aligned}$$

We first use the accuracy test to confirm that our nonlinear multigrid iteration will convergent to an approximation of the problem of consideration. In the following experiments, the letter N stands for ‘Number of unknowns of p’, which is the same as ‘Numbers of vertices’, so \( h = \frac{2}{\sqrt{N} - 1 } \), which represents the mesh size in one direction. Numerical results, see Figs. 1a and 2a, confirmed the convergence order for \( {\left\| {{\varvec{u}} - {\varvec{u}}_{h}} \right\| _{{L^2}}} \) and \( {\left\| {p - {p_h}} \right\| _{{H^1}}} \) are \(O\left( h \right) = O(N^{1/2})\). The accuracy of the pressure approximations, however, is not as good as that of velocity. Meanwhile, in consideration of the computation cost, the sufficiently accurate results were achieved when \( tol = 10^{-6} \) for Problems 1 and 2. The stopping tolerance can be varying in different levels to further reduce the cost. A guide line is below the truncation error [8]. The authors in [17], however, use \( tol = 1.95h \), which is only enough for the \( L^{2} \)-norm approximation for velocity. We shall use \( tol = 10^{-6}\) in the remaining numerical experiments.

Fig. 1
figure 1

Convergence rate by using multigrid solver and time complexity by using different solvers for Problem 1 with \(\beta = 30\). a Convergence rate by using multigrid solver, b time complexity

Fig. 2
figure 2

Convergence rate by using multigrid solver and time complexity by using different solvers for Problem 2 with \(\beta = 30\). a Convergence rate by using multigrid solver, b time complexity

For all tests, the iteration steps and CPU time of each solver are listed in tables. We are aware that the CPU time depends on the implementation and testing environment: the programming language, optimization of codes, and the hardware (memory and cache), etc. Our code has been optimized using vectorization technique and all results were measured and compared in the same test environment so that the CPU time could be a good indicator of the efficiency. The CPU time will be also used to find the asymptotic time complexity of each method; see Figs. 1b and 2b.

As it has been proved in [15], the PR nonlinear iteration converges for any \(\alpha > 0 \). Its rate of convergence, however, is very sensitive to the choice of this parameter. From the convergence proof of the PR iteration in [15], we inferred that the choices of \( \alpha \) depends on the Forchheimer number \( \beta \) which controls the magnitude of the nonlinearity as \(\rho \) is fixed. We give an empirical choice of parameter \( \alpha = 1/\beta \) and compared with the choice \(\alpha = 1\) suggested in [17] in Tables 1 and 2. As shown in Tables 1 and 2, the choice of the parameter \( \alpha = 1/\beta \) is much better than the fixed selection for different values of \( \beta \). Therefore, this choice of \( \alpha \) will be used in the remaining numerical experiments.

Table 1 Comparison of different values of \(\alpha \) in PR iteration with \(h = \frac{1}{64}\) for \( \beta = 10, 20, 30\)
Table 2 Comparison of different values of \(\alpha \) in PR iteration with \(h = \frac{1}{64}\) for \( \beta = 40, 50, 60\)

We then compare the FAS multigrid method using PR as smoother with the PR iterative method for solving this nonlinear system. Here we choose \(m = 3\) for all the following tests. It means that we apply three PR iterations in the pre-smoothing step and post-smoothing step, respectively. Each V-cycle step is approximately 9 PR iterations (6 for the finest level and 3 for iterations in all coarser levels as the size of the system is reduced by 1 / 4) in terms of complexity. In order to keep the symmetry of the V-cycle, we switch the ordering of the linear and nonlinear steps of the PR iteration in the post-smoothing step. We set \( h =1/16 \) as the coarsest mesh and solve the nonlinear problem in the coarsest mesh using PR iteration.

The PR solver is denoted by pr, whereas the multigrid solver is denoted by mg. I - number of iterations, and CPU - CPU time. ‘s1’ represents that we solve these linear saddle point systems (4.8) directly in each step, ‘s2’ is that we solve the primal SPD system (4.10) mentioned in Sect. 4 rather than solving the saddle point system. ‘mg’ stands for our multigrid solver, in which the PR iteration is constructed based on ‘s2’. In all examples we achieve optimal order convergence of \( {\left\| {{\varvec{u}} - {\varvec{u}}_{h}} \right\| _{{L^2}}} \) and \( {\left\| {p - {p_h}} \right\| _{{H^1}}}\). Compared with the PR iteration, we can obtain the same accuracy by using our multigrid method with less iterations. We can get similar results for different values of the Forchheimer number \(\beta \).

Since our focus is on the efficiency of solvers, we mainly report the comparison of the number of iterations and CPU time by using different solvers. Numerical tests were performed for several cases of different values of the Forchheimer number \(\beta \) for Problems 1 and 2, and the behavior of these experiments is similar for all chosen cases. All problems are becoming harder to solve as the Forchheimer number \(\beta \) increases, mainly because \(\beta \) enhances the nonlinearity. Therefore, without loss of critical substance and clarity, here we only show the results for \( \beta = 30 \) to demonstrate the merits of our method.

Table 3 Comparison of number of iterations and CPU time of Problem 1 by using different solvers with \(\beta =30\)
Table 4 Comparison of iteration steps of multigrid solver according to different h and \(\beta \) for Problem 1 with \( \alpha = 1/\beta \)
Table 5 Comparison of number of iterations and CPU time of Problem 2 by using different solvers with \(\beta = 30\)
Table 6 Comparison of iteration steps of multigrid solver according to different h and \(\beta \) for Problem 2 with \( \alpha = 1/\beta \)

It can be observed that our multigrid solver required significantly fewer iterations and CPU time than the other two solvers in Tables 3 and 5. More importantly, iteration steps are uniformly stable with respect to h and the time complexity of our multigrid solver is nearly linear, i.e., O(N) , shown in Figs. 1 and 2. In contrast, for the PR methods, iteration steps increase as h decreases and the time complexity seems to be more than linear. For the largest size we have tested, our multigrid solver is more than 40 times faster than the original PR iteration. In Tables 4 and 6, the number of iterations are compared for different values of \(\beta \) and it is demonstrated that our multigrid method is also robust to both mesh size h and the Forchheimer number \(\beta \) while PR iteration is not, see Tables 1 and 2. It is worth noting that even for a linear Stokes type equation, construct a solver robust to a critical parameter is not easy [18, 19].

7 Conclusions

In this paper, we constructed a nonlinear multigrid method for a mixed finite element method of the two-dimensional Darcy–Forchheimer model. We presented a comparative study between the multigrid solver and the PR iterative solver, at the same time compared CPU time of the efficient solver of solving the SPD systems with that obtained by solving the linear saddle point systems directly. We took into account the pressure accuracy when we set the termination criterion, and chose a better value of the stopping criterion tol. In comparison with the authors in [17] always chose \( \alpha = 1\) for different values of the Forchheimer number \( \beta \), we reported a better choice and compared with the previous choice through comparing the number of iterations and CPU time. The results obtained from our tests indicate that the multigrid solver is very efficient for numerically solving this nonlinear elliptic equation. The number of iterations and CPU time for using multigrid solver are shown to be significantly less than that obtained by using the PR iteration alone.

In the future work, we shall extend our results to three directions. One is that we would like to find a better smoother, which is used in the pre-smoothing and post-smoothing step, to further reduce CPU time and make the multigrid solver more efficient. Another is that we intend to carry out some studies on the three-dimensional Darcy–Forchheimer problem and the real application in a porous medium. We shall also investigate the theoretical study of the convergence proof of FAS.