1 Introduction

Power penalty methods for the parabolic linear complementarity problem arising from American option pricing have attracted much attention; see, for example, [1, 7, 12, 14–16]. In these methods, the parabolic linear complementarity problem is approximated by a nonlinear parabolic partial differential equation (called a penalized equation) obtained by adding a penalty term. It has been shown that the solution of the penalized equation converges to that of the parabolic linear complementarity problem at an arbitrary order, which allows us to achieve the required accuracy of the solution with a relatively small penalty parameter [10, 12]. After the penalized equation is discretized in space and time, the following nonlinear discrete penalized equation has to be solved at each timestep: Find a vector \(W^k_\lambda \in \mathbb {R}^n\) such that

$$\begin{aligned} AW ^k_\lambda -\lambda \left[ g-W^k_\lambda \right] ^{1/k}_+=f, \end{aligned}$$
(1.1)

where \(A=(a_{ij})\in \mathbb {R}^{n\times n}\) is an M-matrix, \(f=(f_1,\ldots ,f_n)^T\in \mathbb {R}^n\), \(g=(g_1,\ldots ,g_n)^T\in \mathbb {R}^n\), \(\lambda >1\) is the penalty parameter, \(k\ge 1\) is the power exponent and \([u]_+=\max \{0,u\}\). The solution \(W^k_\lambda \) of the penalized equation depends closely on \(\lambda \) and \(k\). A matrix \(B\) is called an M-matrix if it has non-positive off-diagonal entries and \(B^{-1}\ge 0\) (i.e., all entries of \(B^{-1}\) are nonnegative) [8]. The penalized equation (1.1) can be regarded as the penalized formulation of the following discrete linear complementarity problem

$$\begin{aligned} W\ge g,\quad AW -f\ge 0\quad \mathrm{and}\quad (W-g)^T( AW -f)=0. \end{aligned}$$
(1.2)

In this paper, we shall study the relationships among the solutions of the penalized equations, and between these solutions and the solution of the discrete linear complementarity problem.

In order to price American options numerically, large-scale systems of nonlinear algebraic equations need to be solved at each discrete time point. When \(k=1\) in (1.1), the penalized equation is a semismooth equation and the semismooth Newton method can be used to solve it [2, 4, 9]. When \(k>1\), the penalized equation (1.1) is a nonsmooth equation and neither the classical nor the semismooth Newton method can be applied directly. In [12], Wang et al. proposed a smoothing technique for the penalty term, and with this technique a damped Newton method was applied to solve the penalized equation. Compared with the case \(k=1\), it requires much more computational time to solve the penalized equation by the damped Newton method [14].

In this paper, we shall apply the nonlinear Jacobi iteration (see, e.g., [8]) to the penalized equation (1.1). A nonlinear SOR iteration was used in [3] to solve the discrete penalized equation arising from obstacle problems when \(k=1\). The nonlinear SOR iteration typically converges faster than the nonlinear Jacobi iteration because it uses the most recently available approximations of the elements of the iteration vector. However, the method is inherently sequential and does not possess natural parallelism. Under proper conditions, we shall verify that the sequence generated by the nonlinear Jacobi iteration monotonically converges to the solution of the penalized equation.

The paper is organized as follows. In Sect. 2, we shall discuss convergence properties of solutions of penalized equations. In Sect. 3, we shall apply the nonlinear Jacobi iteration for solving the penalized equation and study the convergence property of the iteration. In Sect. 4, we shall discuss the solution of the subproblem. Finally, in Sect. 5, numerical experiments are presented to confirm the efficiency of the proposed method and to verify the theoretical results.

2 Convergence properties of solutions of penalized equations

We call \(W\) a lower-solution for the penalized equation (1.1) if

$$\begin{aligned} AW -\lambda [g-W]^{1/k}_+\le f; \end{aligned}$$

we also call \(W\) a lower-solution for the discrete linear complementarity problem (1.2) if

$$\begin{aligned} \min \{W-g, AW -f\}\le 0. \end{aligned}$$

For two index subsets \(I, J\subset \{1,\ldots ,n\}\), let \(A_{IJ}\) be the submatrix of \(A\) with rows in \(I\) and columns in \(J\), and \(W_I\) be the subvector of \(W\) with entries indexed by \(I\). A matrix \(B\) (or a vector \(W\)) is nonnegative, denoted by \(B\ge 0\) (or \(W\ge 0\)), if all its entries are nonnegative. With these notations, we present some important relationships among the solutions of penalized equations and the discrete linear complementarity problem in the following lemmas.

Lemma 2.1

Let \(\lambda >1, k>0\) and \(W_\lambda ^k\) be the solution of the penalized equation (1.1). If \(\widetilde{W}\) is a lower-solution of the penalized equation (1.1), then \(\widetilde{W}\le W_{\lambda }^k\).

Proof

Since \(\widetilde{W}\) is a lower-solution of the penalized equation (1.1), it holds

$$\begin{aligned} A\widetilde{W}-\lambda [g-\widetilde{W}]^{1/k}_+\le f= AW _\lambda ^k-\lambda \left[ g-W_\lambda ^k\right] ^{1/k}_+. \end{aligned}$$
(2.1)

This means that

$$\begin{aligned} A\left( \widetilde{W}-W_\lambda ^k\right) \le \lambda \left( [g-\widetilde{W}]^{1/k}_ +-\left[ g-W_\lambda ^k\right] ^{1/k}_+\right) . \end{aligned}$$

Assume, by contradiction, that there exist two disjoint index subsets \(I_1\) and \(I_2\) of \(\{1,\ldots ,n\}\), with \(I_1\cup I_2=\{1,\ldots ,n\}\) and \(I_2\) nonempty, such that

$$\begin{aligned} \widetilde{W}_i\le \left( W_\lambda ^k\right) _i,\quad \forall i\in I_1\quad \mathrm{and}\quad \widetilde{W}_i> \left( W_\lambda ^k\right) _i,\quad \forall i\in I_2. \end{aligned}$$
(2.2)

Restricting this inequality to the rows in \(I_2\), it follows that

$$\begin{aligned} A_{I_2I_2}\left( \widetilde{W}-W_\lambda ^k\right) _{I_2}\le \lambda \left( [g-\widetilde{W}]^{1/k}_+-\left[ g-W_\lambda ^k\right] ^{1/k}_+\right) _{I_2} -A_{I_2I_1}\left( \widetilde{W}-W_\lambda ^k\right) _{I_1}. \end{aligned}$$

Noting that \(A\) is an M-matrix, we obtain that \(A_{I_2I_1}\le 0\) and that \(A_{I_2I_2}\) is also an M-matrix, so \((A_{I_2I_2})^{-1}\ge 0\). By (2.2), \((\widetilde{W}-W_\lambda ^k)_{I_1}\le 0\) and, since \(t\mapsto [t]^{1/k}_+\) is nondecreasing, \(\left( [g-\widetilde{W}]^{1/k}_+-[g-W_\lambda ^k]^{1/k}_+\right) _{I_2}\le 0\); hence the right-hand side above is nonpositive. Multiplying by \((A_{I_2I_2})^{-1}\ge 0\), we deduce

$$\begin{aligned} \left( \widetilde{W}-W_\lambda ^k\right) _{I_2}\le 0. \end{aligned}$$

This contradicts (2.2), which implies that \(I_2=\emptyset \). Therefore,

$$\begin{aligned} \widetilde{W}\le W_{\lambda }^k. \end{aligned}$$

The proof is complete.\(\square \)

Lemma 2.2

Let \(1<\lambda _1<\lambda _2\) and \(W_{\lambda _1}^k\) be the solution of the penalized equation

$$\begin{aligned} AW -\lambda _1[g-W]^{1/k}_+=f. \end{aligned}$$

Then \(W_{\lambda _1}^k\) is a lower-solution of the penalized equation

$$\begin{aligned} AW -\lambda _2[g-W]^{1/k}_+=f. \end{aligned}$$
(2.3)

Moreover, \(W_{\lambda _1}^k\le W_{\lambda _2}^k\), where \(W_{\lambda _2}^k\) is the solution of (2.3).

Proof

Since \(1<\lambda _1<\lambda _2, [g-W_{\lambda _1}^k]_+\ge 0\) and \(W_{\lambda _1}^k\) is the solution of the penalized equation \( AW -\lambda _1[g-W]^{1/k}_+=f\), it must hold

$$\begin{aligned} AW _{\lambda _1}^k-\lambda _2\left[ g-W_{\lambda _1}^k\right] ^{1/k}_+\le AW _{\lambda _1}^k-\lambda _1\left[ g-W_{\lambda _1}^k\right] ^{1/k}_+=f. \end{aligned}$$

This means that \(W_{\lambda _1}^k\) is a lower-solution of the penalized equation (2.3). Since \(W_{\lambda _2}^k\) is the solution of (2.3), by Lemma 2.1 we get \(W_{\lambda _1}^k\le W_{\lambda _2}^k\). The proof is complete.\(\square \)

Lemma 2.3

Let \(0<k_1<k_2\) and \(W_\lambda ^{k_1}\) be the solution of the penalized equation

$$\begin{aligned} AW -\lambda [g-W]^{1/k_1}_+=f. \end{aligned}$$

Assume that for each \(i\), it holds \((W_\lambda ^{k_1})_i\ge g_i-1\). Then \(W_\lambda ^{k_1}\) is a lower-solution of the penalized equation

$$\begin{aligned} AW -\lambda [g-W]^{1/k_2}_+=f. \end{aligned}$$
(2.4)

Moreover, \(W_\lambda ^{k_1}\le W_\lambda ^{k_2}\), where \(W_\lambda ^{k_2}\) is the solution of (2.4).

Proof

Since \(0<k_1<k_2\) and \(0\le [g_i-(W_\lambda ^{k_1})_i]_+\le 1\) for each \(i\) (by assumption), we have \([g_i-(W_\lambda ^{k_1})_i]^{1/k_2}_+\ge [g_i-(W_\lambda ^{k_1})_i]^{1/k_1}_+\), because \(t^{1/k_2}\ge t^{1/k_1}\) for \(t\in [0,1]\). Since \(W_\lambda ^{k_1}\) is the solution of the penalized equation \( AW -\lambda [g-W]^{1/k_1}_+=f\), it must hold

$$\begin{aligned} AW _\lambda ^{k_1}-\lambda \left[ g-W_\lambda ^{k_1}\right] ^{1/k_2}_+\le AW _\lambda ^{k_1}-\lambda \left[ g-W_\lambda ^{k_1}\right] ^{1/k_1}_+=f. \end{aligned}$$

This means that \(W_\lambda ^{k_1}\) is a lower-solution of the penalized equation (2.4). Noting that \(W_\lambda ^{k_2}\) is the solution of (2.4), from Lemma 2.1 we deduce \(W_\lambda ^{k_1}\le W_\lambda ^{k_2}\).\(\square \)

Lemma 2.4

Let \(\lambda >1\) and \(k>0\). Assume that \(W_\lambda ^k\) is the solution of the penalized equation (1.1). Then \(W_\lambda ^k\) is a lower-solution of the discrete linear complementarity problem (1.2). Moreover, \(W_\lambda ^k\le W^*\), where \(W^*\) denotes the solution of (1.2).

Proof

Define two index sets \(I_1\) and \(I_2\) as follows

$$\begin{aligned} I_1\triangleq \left\{ i\ |\ \left( W_\lambda ^k\right) _i\ge g_i \right\} \quad \mathrm{and}\quad I_2\triangleq \left\{ i\ |\ \left( W_\lambda ^k\right) _i< g_i\right\} . \end{aligned}$$
(2.5)

For each \(i\in I_1\), by (1.1) we have \(( AW _\lambda ^k)_i=f_i\), which means that \(\min \{(W_\lambda ^k-g)_i,\ ( AW _\lambda ^k-f)_i\}=0\); and for each \(i\in I_2\), we have \((W_\lambda ^k-g)_i<0\), which means that \(\min \{(W_\lambda ^k-g)_i,\ ( AW _\lambda ^k-f)_i\}<0\). This implies that for each \(i\), it holds

$$\begin{aligned} \min \left\{ \left( W_\lambda ^k-g\right) _i,\ \left( AW _\lambda ^k-f\right) _i\right\} \le 0. \end{aligned}$$

Hence \(W_\lambda ^k\) is a lower-solution of the discrete linear complementarity problem (1.2).

Noting that \(W^*\ge g\), by (2.5) we have \(W^*_i>\left( W_\lambda ^k\right) _i\) for each \(i\in I_2\). On the other hand, we have

$$\begin{aligned} \left( AW _\lambda ^k\right) _{I_1}=f_{I_1}\le ( AW ^*)_{I_1}. \end{aligned}$$

This means that

$$\begin{aligned} A_{I_1I_1}\left( W^*-W_\lambda ^k\right) _{I_1}\ge -A_{I_1I_2} \left( W^*-W_\lambda ^k\right) _{I_2}\ge 0. \end{aligned}$$

Since \(A_{I_1I_1}\) is an M-matrix, we have \((W^*-W_\lambda ^k)_{I_1}\ge 0\), and hence \(W_\lambda ^k\le W^*\).\(\square \)

The following theorem shows that the sequence of solutions \(\{W_{\lambda _m}^k\}\) of the penalized equations is monotonically increasing and convergent to the solution of the discrete linear complementarity problem (1.2) as \(\lambda _m\rightarrow +\infty \).

Theorem 2.1

Let \(k\) be fixed and \(\{\lambda _m\}\) be a monotonically increasing sequence tending to positive infinity as \(m\rightarrow +\infty \). Assume that \(W_{\lambda _m}^k\) is the solution of the penalized equation

$$\begin{aligned} AW -\lambda _m[g-W]_+^{1/k}=f. \end{aligned}$$

Then the sequence \(\{W_{\lambda _m}^k\}\) is monotonically increasing and convergent to the solution of the discrete linear complementarity problem (1.2). Moreover, when \(A\) is positive definite, there exists a constant \(C>0\), independent of \(n, W^*, W_{\lambda _m}^k\) and \(\lambda _m\), such that

$$\begin{aligned} \left\| W^*-W_{\lambda _m}^k\right\| _2\le \frac{C}{(\lambda _m)^{k/2}}, \end{aligned}$$
(2.6)

where \(W^*\) is the solution of the discrete linear complementarity problem (1.2).

Proof

It follows from Lemmas 2.2 and 2.4 that

$$\begin{aligned} W_{\lambda _{m_1}}^k\le W_{\lambda _{m_2}}^k\le W^*,\quad \forall m_1<m_2. \end{aligned}$$

This implies that there exists some \(W^{**}\) such that \(\lim \nolimits _{m\rightarrow +\infty }W_{\lambda _m}^k=W^{**}\). Note that

$$\begin{aligned} AW _{\lambda _m}^k-f=\lambda _m\left[ g-W_{\lambda _m}^k\right] _+^{1/k}\ge 0. \end{aligned}$$
(2.7)

Letting \(m\rightarrow +\infty \) in (2.7), we get \( AW ^{**}-f\ge 0\). On the other hand, we have

$$\begin{aligned} \left[ g-W_{\lambda _m}^k\right] _+=\left( \left( AW _{\lambda _m}^k-f\right) /\lambda _m\right) ^k. \end{aligned}$$
(2.8)

Letting \(m\rightarrow +\infty \) in (2.8), we get \([g-W^{**}]_+=0\). This means that \(W^{**}\ge g\). Hence it holds

$$\begin{aligned} \left( AW ^{**}-f\right) ^T(W^{**}-g)\ge 0. \end{aligned}$$
(2.9)

By (2.7), we get

$$\begin{aligned} \left( AW _{\lambda _m}^k-f\right) ^T\left( W_{\lambda _m}^k-g\right) =\lambda _m\left( W_{\lambda _m}^k-g\right) ^T\left[ g-W_{\lambda _m}^k\right] _+^{1/k}\le 0. \end{aligned}$$

Letting \(m\rightarrow +\infty \), we obtain \(( AW ^{**}-f)^T(W^{**}-g)\le 0.\) This together with (2.9) implies that

$$\begin{aligned} ( AW ^{**}-f)^T(W^{**}-g)=0. \end{aligned}$$

Thus the above discussion shows that \(W^{**}\) is a solution of (1.2). Since \(A\) is an M-matrix, the linear complementarity problem (1.2) has a unique solution. Therefore, \(W^{**}=W^*\). The upper error bound (2.6) follows directly from Theorem 2.1 of [13]. The proof is complete.\(\square \)

Remark 2.1

  1. (i)

    The constant \(C\) in (2.6) is independent of the dimension \(n\) [13]. The upper error bound (2.6) can be further improved using the fact that all norms in a finite dimensional space are equivalent (see [5, 13] for details), i.e.,

    $$\begin{aligned} \left\| W^*-W_{\lambda _m}^k\right\| _2\le \frac{C}{(\lambda _m)^k}. \end{aligned}$$

    But the constant \(C\) here depends on the dimension \(n\).

  2. (ii)

    In general, the sequence of solutions \(\{W_\lambda ^{k_m}\}\) of the penalized equations may not converge to the solution of the discrete linear complementarity problem if we let \(\lambda \) be fixed and \(k_m\rightarrow +\infty \). The following example illustrates this. Consider the linear complementarity problem

    $$\begin{aligned} W^*-3\ge 0,\quad 2W^*-2\ge 0\quad \mathrm{and}\quad (W^*-3)(2W^*-2)=0 \end{aligned}$$

    and the corresponding penalized equation

    $$\begin{aligned} 2W_2^k-2\left[ 3-W_2^k\right] _+^{1/k}=2, \end{aligned}$$

    where \(\lambda =2\) and \(k>0\). By a simple calculation, we get \(W_2^k=2\) for every \(k>0\), while \(W^*=3\). Therefore, the sequence of solutions \(\{W_2^{k_m}\}\) cannot converge to \(W^*\) as \(k_m\rightarrow +\infty \); a small numerical check follows below.
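To make the example above concrete, here is a minimal Python sketch that solves the scalar penalized equation for several values of \(k\); the `residual` and `bisect` helpers are ours, introduced only for this illustration:

```python
def residual(w, k, lam=2.0):
    """Residual of the scalar penalized equation 2*W - lam*[3 - W]_+^(1/k) = 2."""
    return 2.0 * w - lam * max(3.0 - w, 0.0) ** (1.0 / k) - 2.0

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; the residual above is strictly increasing in W."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

for k in (1, 2, 8, 64):
    w = bisect(lambda t: residual(t, k), -10.0, 10.0)
    print(f"k = {k:2d}:  W_2^k = {w:.10f}   (W* = 3)")
```

Every value of \(k\) returns \(W_2^k=2\), so the gap to \(W^*=3\) never closes as \(k\) grows.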

Theorem 2.2

Let \(A\) be positive definite, \(\lambda >\Vert Ag-f\Vert _1\) be fixed and \(\{k_m\}\) be a monotonically increasing sequence tending to positive infinity as \(m\rightarrow +\infty \). Assume that \(W_\lambda ^{k_m}\) is the solution of the penalized equation

$$\begin{aligned} AW -\lambda [g-W]_+^{1/{k_m}}=f. \end{aligned}$$

If \(\lambda \) is chosen such that \(g_i\le (W_\lambda ^{k_1})_i+1\) for each \(i\), then the sequence \(\{W_\lambda ^{k_m}\}\) is monotonically increasing and convergent to the solution of the discrete linear complementarity problem (1.2).

Proof

From Lemmas 2.3 and 2.4 it follows that the sequence \(\{W_\lambda ^{k_m}\}\) is monotonically increasing and bounded. As a result, there exists some \(W^{**}\) such that \(\lim \nolimits _{m\rightarrow +\infty }W_\lambda ^{k_m}=W^{**}\). Note that

$$\begin{aligned} AW _\lambda ^{k_m}-f=\lambda \left[ g-W_\lambda ^{k_m}\right] _+^{1/{k_m}}\ge 0. \end{aligned}$$
(2.10)

Letting \(m\rightarrow +\infty \) in (2.10), we get \( AW ^{**}-f\ge 0\). On the other hand, we have

$$\begin{aligned} 0\le \lambda \left\| \left[ g-W_\lambda ^{k_m}\right] _ +\right\| ^p_p=-\left( g-W_\lambda ^{k_m}\right) ^TA\left[ g-W_\lambda ^{k_m}\right] _+ +(Ag-f)^T\left[ g-W_\lambda ^{k_m}\right] _+, \end{aligned}$$
(2.11)

where \(p=1+1/{k_m}\). After a suitable permutation of indices, we decompose \(g-W_\lambda ^{k_m}\) into \(g-W_\lambda ^{k_m}=[u_1^T,u_2^T]^T\) with \(u_1>0\) and \(u_2\le 0\). Then the matrix \(A\) can be decomposed conformally into \(A=\left( \begin{array}{ll}A_{11}&{}A_{12}\\ A_{21}&{}A_{22}\end{array}\right) \) and (2.11) can be rewritten as

$$\begin{aligned} 0\le \lambda \left\| \left[ g-W_\lambda ^{k_m}\right] _+\right\| ^p_p=-u_1^TA_{11}u_1 -u_1^TA_{12}u_2+(Ag-f)^T\left[ g-W_\lambda ^{k_m}\right] _+. \end{aligned}$$

Since \(A\) is positive definite, \(u_1^TA_{11}u_1\ge 0\); and since \(A_{12}\le 0\), \(u_1>0\) and \(u_2\le 0\), we have \(u_1^TA_{12}u_2\ge 0\). It follows that

$$\begin{aligned} 0\le \lambda \left\| \left[ g-W_\lambda ^{k_m}\right] _ +\right\| ^p_p\le (Ag-f)^T\left[ g-W_\lambda ^{k_m}\right] _+. \end{aligned}$$

Using Hölder's inequality, \((Ag-f)^T[g-W_\lambda ^{k_m}]_+\le \Vert Ag-f\Vert _1\Vert [g-W_\lambda ^{k_m}]_+\Vert _\infty \), together with \(\Vert \cdot \Vert _\infty ^p\le \Vert \cdot \Vert _p^p\) and \(p-1=1/k_m\), we get

$$\begin{aligned} \left\| \left[ g-W_\lambda ^{k_m}\right] _+\right\| _\infty \le \left( \Vert Ag-f\Vert _1/\lambda \right) ^{k_m}. \end{aligned}$$
(2.12)

Note that \(\lambda >\Vert Ag-f\Vert _1\). Letting \(m\rightarrow +\infty \) in (2.12), we get \([g-W^{**}]_+=0\). This means that \(W^{**}\ge g\). Since \( AW ^{**}-f\ge 0\), it holds

$$\begin{aligned} ( AW ^{**}-f)^T(W^{**}-g)\ge 0. \end{aligned}$$
(2.13)

By (2.10), we get

$$\begin{aligned} \left( AW _\lambda ^{k_m}-f\right) ^T\left( W_\lambda ^{k_m}-g\right) =\lambda \left( W_\lambda ^{k_m}-g\right) ^T\left[ g-W_\lambda ^{k_m}\right] _+^{1/{k_m}}\le 0. \end{aligned}$$

Letting \(m\rightarrow +\infty \), we obtain \(( AW ^{**}-f)^T(W^{**}-g)\le 0\), which together with (2.13) implies that

$$\begin{aligned} ( AW ^{**}-f)^T(W^{**}-g)=0. \end{aligned}$$

Thus, \(W^{**}\) is a solution of (1.2). Noting that the linear complementarity problem (1.2) has a unique solution, we get \(W^{**}=W^*\). The proof is complete.\(\square \)

Theorem 2.3

Let \(W^*\) be the solution of the discrete linear complementarity problem (1.2). Assume that \(W_\lambda ^k\) is the solution of the penalized equation (1.1). Define the following two index sets

$$\begin{aligned} I_1=\left\{ i\ |\ \left( W_\lambda ^k\right) _i\le g_i\right\} \quad \mathrm{and}\quad I_2=\left\{ i\ |\ \left( W_\lambda ^k\right) _i> g_i\right\} . \end{aligned}$$

Then for sufficiently large \(\lambda \), we have

$$\begin{aligned} \left\| \left( g-W_\lambda ^k\right) _{I_1}\right\| _\infty \le \frac{C_1}{\lambda ^k},\quad \left\| \left( W_\lambda ^k-W^*\right) _{I_2}\right\| _\infty \le \frac{C_2}{\lambda ^k}, \end{aligned}$$

and

$$\begin{aligned} \left\| W^*-W_\lambda ^k\right\| _\infty \le \frac{C_3}{\lambda ^k}. \end{aligned}$$

Here, the constants \(C_1, C_2\) and \(C_3\) depend only on \(k\), \(\Vert AW ^*-f\Vert _\infty \) and \(A\).

Proof

For any \(i\in I_1\), by (1.1) we obtain

$$\begin{aligned} \left( g-W_\lambda ^k\right) _i&=\frac{\left[ \left( AW _\lambda ^k-f\right) _i\right] ^k}{\lambda ^k} =\frac{\left[ \left( AW ^*-f+A\left( W_\lambda ^k-W^*\right) \right) _i\right] ^k}{\lambda ^k}\\ &\le \frac{\left[ \left( AW ^*-f\right) _i+\left| \left( A\left( W_\lambda ^k-W^*\right) \right) _i\right| \right] ^k}{\lambda ^k}. \end{aligned}$$

Noting that \(W_\lambda ^k-W^*\rightarrow 0\) as \(\lambda \rightarrow \infty \) (Theorem 2.1), we have \(|(A(W_\lambda ^k-W^*))_i|<\max \{( AW ^*-f)_i,1\}\) for \(\lambda \) sufficiently large. This implies that

$$\begin{aligned} \left( g-W_\lambda ^k\right) _i\le \frac{\left[ 2\max \{( AW ^*-f)_i,1\}\right] ^k}{\lambda ^k}\le \frac{2^k\left[ \max \{\Vert AW ^*-f\Vert _\infty ,1\}\right] ^k}{\lambda ^k},\quad \forall i\in I_1. \end{aligned}$$

Let \(C_1=2^k[\max \{\Vert AW ^*-f\Vert _\infty ,1\}]^k\). Then it holds

$$\begin{aligned} \left\| \left( g-W_\lambda ^k\right) _{I_1}\right\| _\infty \le \frac{C_1}{\lambda ^k}. \end{aligned}$$
(2.14)

For any \(i\in I_2\), by (1.1) we get \(( AW ^k_\lambda -f)_i=0\). Since \(W^*_i\ge (W^k_\lambda )_i\), it holds \(W^*_i>g_i\) and hence \(( AW ^*-f)_i=0\). Therefore, we have

$$\begin{aligned} 0=\left[ A\left( W^*-W^k_\lambda \right) \right] _{I_2}=A_{I_2I_2} \left( W^*-W^k_\lambda \right) _{I_2}+A_{I_2I_1}\left( W^*-W^k_\lambda \right) _{I_1}, \end{aligned}$$

which implies that

$$\begin{aligned} \left( W^*-W^k_\lambda \right) _{I_2}=-(A_{I_2I_2})^{-1}A_{I_2I_1}\left( W^*-W^k_\lambda \right) _{I_1}. \end{aligned}$$

Note that \(W_\lambda ^k\rightarrow W^*\) as \(\lambda \rightarrow \infty \). For sufficiently large \(\lambda \), we have \((W^k_\lambda )_i>g_i\) as long as \(W_i^*>g_i\). This means that \(W^*_{I_1}=g_{I_1}\) and hence

$$\begin{aligned} \left\| \left( W^*-W^k_\lambda \right) _{I_2}\right\| _\infty =\left\| (A_{I_2I_2})^{-1}A_{I_2I_1} \left( g-W^k_\lambda \right) _{I_1}\right\| _\infty \le \left\| (A_{I_2I_2})^{-1} A_{I_2I_1}\right\| _\infty \left\| \left( g-W^k_\lambda \right) _{I_1}\right\| _\infty . \end{aligned}$$

Let \(C_2=C_1\Vert (A_{I_2I_2})^{-1}A_{I_2I_1}\Vert _\infty \). Then it holds

$$\begin{aligned} \left\| \left( W_\lambda ^k-W^*\right) _{I_2}\right\| _\infty \le \frac{C_2}{\lambda ^k}. \end{aligned}$$
(2.15)

Let \(C_3=C_1\max \{\Vert (A_{I_2I_2})^{-1}A_{I_2I_1}\Vert _\infty ,1\}\). From (2.14) and (2.15), it follows

$$\begin{aligned} \left\| W_\lambda ^k-W^*\right\| _\infty \le \frac{C_3}{\lambda ^k}. \end{aligned}$$

The proof is complete.\(\square \)

Using Theorem 2.3, we get the following corollary.

Corollary 2.1

Let \(\lambda _1<\lambda _2\), and \(W_{\lambda _1}^k\) and \(W_{\lambda _2}^k\) be the corresponding solutions of the penalized equation (1.1). Then there exists some constant \(C\) such that

$$\begin{aligned} \left\| W_{\lambda _1}^k-W_{\lambda _2}^k\right\| _\infty \le \frac{C}{\lambda _1^k}. \end{aligned}$$

3 Nonlinear Jacobi iteration for solving the penalized equation

In this section, we shall apply the nonlinear Jacobi iteration for the solution of the discrete penalized equation (1.1) and analyze its convergence. For this purpose, we introduce some notations. We denote by \(D=\mathrm{diag}(a_{11},a_{22},\ldots ,a_{nn})\) the diagonal matrix composed of the diagonal elements of \(A\). Then, \(B\triangleq A-D\) denotes the matrix composed of the off-diagonal elements of \(A\). With these notations, we give the steps of the algorithm as follows.

Algorithm 3.1.

  • Step 1. Choose an initial point \(W^1\in \mathbb {R}^n\) and \(\epsilon >0\). Let \(\ell =1\).

  • Step 2. Compute \(W^{\ell +1}\) by solving the system of nonlinear equations

    $$\begin{aligned} DW^{\ell +1}-\lambda \left[ g-W^{\ell +1}\right] _+^{1/k}=f- BW ^\ell . \end{aligned}$$
    (3.1)
  • Step 3. Stop if \(\Vert W^{\ell +1}-W^\ell \Vert _\infty <\epsilon \). Otherwise, let \(\ell :=\ell +1\) and go to Step 2.

Remark 3.1

Let \(b^\ell = f- BW ^\ell \). Noting that \(D\) is a diagonal matrix, we can compute \(W^{\ell +1}\) in (3.1) by solving in parallel the following \(n\) one-dimensional nonlinear equations

$$\begin{aligned} a_{ii}W^{\ell +1}_i-\lambda \left[ g_i-W^{\ell +1}_i\right] _+^{1/k}=b^\ell _i,\quad i=1,2,\ldots ,n. \end{aligned}$$
(3.2)
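To make the algorithm concrete, the following is a minimal NumPy sketch of Algorithm 3.1, with a componentwise bisection standing in for the scalar solves of Remark 3.1 (closed-form solutions for \(k=1,2\) are derived in Sect. 4); the names `jacobi_penalized` and `solve_scalar_bisect` are ours:

```python
import numpy as np

def solve_scalar_bisect(alpha, beta, gamma, lam, k, iters=100):
    """Solve alpha*U - lam*[beta - U]_+^(1/k) = gamma by bisection.
    The left-hand side is strictly increasing in U, and by Lemma 3.1
    the root satisfies U >= gamma/alpha."""
    res = lambda u: alpha * u - lam * max(beta - u, 0.0) ** (1.0 / k) - gamma
    lo = gamma / alpha            # res(lo) <= 0
    hi = max(beta, lo) + 1.0      # penalty term vanishes there, so res(hi) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if res(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

def jacobi_penalized(A, g, f, lam, k, W1, eps, max_iter=10_000):
    """Algorithm 3.1: nonlinear Jacobi iteration for A W - lam*[g - W]_+^(1/k) = f.
    Each sweep solves the n decoupled scalar equations (3.2)."""
    A = np.asarray(A, dtype=float)
    d = np.diag(A).copy()         # diagonal of A (the matrix D)
    B = A - np.diag(d)            # off-diagonal part (the matrix B)
    W = np.asarray(W1, dtype=float).copy()
    for _ in range(max_iter):
        b = f - B @ W             # right-hand side b^l = f - B W^l
        W_new = np.array([solve_scalar_bisect(d[i], g[i], b[i], lam, k)
                          for i in range(W.size)])
        if np.max(np.abs(W_new - W)) < eps:   # Step 3 stopping test
            return W_new
        W = W_new
    return W
```

By Theorem 3.1 below, starting from a lower-solution \(W^1\) the iterates increase monotonically to \(W^k_\lambda \); the inner loop over the \(n\) scalar equations is trivially parallelizable, as noted in Remark 3.1.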

The following lemma shows that the system (3.1) has a unique solution.

Lemma 3.1

For any \(\ell \ge 1\), the system of nonlinear equations (3.1) has a unique solution. Moreover,

$$\begin{aligned} W^{\ell +1}\ge D^{-1}(f- BW ^\ell ),\quad \ell =1,2,\ldots . \end{aligned}$$
(3.3)

Proof

By Remark 3.1, it suffices to prove that the one-dimensional nonlinear equation (3.2) has a unique solution and that \(W^{\ell +1}_i\ge b^\ell _i/a_{ii}\).

Consider the one-dimensional mapping \(h: \mathbb {R}\rightarrow \mathbb {R}\) defined by

$$\begin{aligned} h(t)=a_{ii}t-\lambda [g_i-t]_+^{1/k}-b^\ell _i. \end{aligned}$$

Since the matrix \(A\) is an M-matrix, we have \(a_{ii}>0\). Thus, it is not difficult to verify that the mapping \(h\) is strictly increasing on \(\mathbb {R}\) and

$$\begin{aligned} \lim \limits _{t\rightarrow -\infty }h(t)=-\infty ,\quad \lim \limits _{t\rightarrow +\infty }h(t)=+\infty . \end{aligned}$$

Hence the equation \(h(t)=0\) has a unique solution. This implies that (3.2) has a unique solution. On the other hand, by (3.2) we have

$$\begin{aligned} W^{\ell +1}_i=\frac{b^\ell _i}{a_{ii}}+\frac{\lambda }{a_{ii}}\left[ g_i-W^{\ell +1}_i\right] _+^{1/k}\ge \frac{b^\ell _i}{a_{ii}}. \end{aligned}$$

This means that (3.3) holds. The proof is complete.\(\square \)

We now give a convergence theorem of Algorithm 3.1.

Theorem 3.1

Let \(W^1\) be a lower-solution of the penalized equation (1.1). Then the sequence \(\{W^\ell \}\) generated by Algorithm 3.1 is monotonically increasing and convergent to the solution of the penalized equation (1.1).

Proof

The proof is divided into two parts.

We first verify that for each \(\ell =1,2,\ldots \), it holds

$$\begin{aligned} W^\ell \le W^{\ell +1}\le W^k_\lambda , \end{aligned}$$
(3.4)

where \(W^k_\lambda \) is the solution of the penalized equation (1.1). Assume that \(W^\ell \) is a lower-solution. We shall prove that (3.4) holds for this \(W^\ell \). By Lemma 2.1 we only need to verify that \(W^{\ell +1}\) is also a lower-solution and \(W^\ell \le W^{\ell +1}\). Since \(W^\ell \) is a lower-solution, we have

$$\begin{aligned} AW ^\ell -\lambda [g-W^\ell ]^{1/k}_+\le f. \end{aligned}$$

Using the previous notations \(D, B\) and \(b^\ell \), this can be rewritten as

$$\begin{aligned} DW^\ell -\lambda [g-W^\ell ]^{1/k}_+\le f- BW ^\ell =b^\ell . \end{aligned}$$

This implies that for each \(i=1,2,\ldots ,n\), it holds

$$\begin{aligned} a_{ii}W^\ell _i-\lambda \left[ g_i-W^\ell _i\right] ^{1/k}_+\le b^\ell _i. \end{aligned}$$

Combining it with (3.2), we get

$$\begin{aligned} a_{ii}W^\ell _i-\lambda \left[ g_i-W^\ell _i\right] ^{1/k}_+\le a_{ii}W^{\ell +1}_i-\lambda \left[ g_i-W^{\ell +1}_i\right] ^{1/k}_+. \end{aligned}$$

Since the mapping \(t\mapsto a_{ii}t-\lambda [g_i-t]^{1/k}_+\) is strictly increasing (see the proof of Lemma 3.1), it is easy to see that \(W^\ell _i\le W^{\ell +1}_i\). Thus, \(W^\ell \le W^{\ell +1}\). On the other hand, we have

$$\begin{aligned} AW ^{\ell +1}-\lambda [g-W^{\ell +1}]^{1/k}_+=DW^{\ell +1} -\lambda [g-W^{\ell +1}]^{1/k}_++ BW ^{\ell +1}. \end{aligned}$$

Noting that \(B\le 0\) and \(W^\ell \le W^{\ell +1}\), it follows from (3.1) that

$$\begin{aligned} AW ^{\ell +1}-\lambda [g-W^{\ell +1}]^{1/k}_+\le DW^{\ell +1}-\lambda [g-W^{\ell +1}]^{1/k}_++ BW ^\ell =f. \end{aligned}$$

This shows that \(W^{\ell +1}\) is also a lower-solution of the penalized equation (1.1). Since \(W^1\) is a lower-solution of the penalized equation (1.1), by induction we obtain (3.4) for each \(\ell \).

We now verify that the sequence \(\{W^\ell \}\) is convergent to the solution of the penalized equation (1.1). By (3.4), there exists some \(\bar{W}_\lambda ^k\) such that \(\lim \nolimits _{\ell \rightarrow +\infty }W^\ell =\bar{W}_\lambda ^k\). Letting \(\ell \rightarrow +\infty \) in (3.1), we get

$$\begin{aligned} D\bar{W}_\lambda ^k-\lambda \left[ g-\bar{W}_\lambda ^k\right] _{+}^{1/k} =f-B\bar{W}_\lambda ^k \end{aligned}$$

and hence

$$\begin{aligned} A\bar{W}_\lambda ^k-\lambda \left[ g-\bar{W}_\lambda ^k\right] _+^{1/k}=f. \end{aligned}$$

This means that the sequence \(\{W^\ell \}\) is convergent to the solution of the penalized equation (1.1). The proof is complete.\(\square \)

4 The implementation of Algorithm 3.1

In Sect. 3, it has been established that the sequence generated by Algorithm 3.1 converges monotonically to the solution of the penalized equation if the initial iterate is a lower-solution. One question is how to choose an initial iterate that is a lower-solution of (1.1). From Lemma 2.3, we know that the solution of the \(l_1\) penalized equation (i.e., the case \(k=1\)) is a lower-solution of the corresponding penalized equation with \(k>1\). Thus, we can obtain the initial iterate by solving the \(l_1\) penalized equation. The other question is how to solve the one-dimensional nonlinear equation (3.2) efficiently, which we discuss in detail in the following.

We consider the following one-dimensional nonlinear equation: find \(U\in \mathbb {R}\) such that

$$\begin{aligned} \alpha U-\lambda [\beta -U]_+^{1/k}=\gamma , \end{aligned}$$
(4.1)

where \(\alpha >0, \beta \in \mathbb {R}\) and \(\gamma \in \mathbb {R}\). Obviously, Eq. (4.1) is nonsmooth. Since \(\alpha >0\), by the discussion in Sect. 3 we know that (4.1) has a unique solution. In addition, \(U\ge \gamma /\alpha \) and the solution \(U\) satisfies the following equation

$$\begin{aligned} \left( U-\frac{\gamma }{\alpha }\right) ^k=\frac{\lambda ^k}{\alpha ^k}[\beta -U]_+. \end{aligned}$$

We shall study the analytical solution of Eq. (4.1) for the cases \(k=1\) and \(k=2\).

If \(\gamma /\alpha \ge \beta \), then \(U=\gamma /\alpha \). Otherwise, it holds

$$\begin{aligned} \left( U-\frac{\gamma }{\alpha }\right) ^k =\frac{\lambda ^k}{\alpha ^k}(\beta -U), \end{aligned}$$

which yields that

$$\begin{aligned} \left( U-\frac{\gamma }{\alpha }\right) ^k+\frac{\lambda ^k}{\alpha ^k}\left( U-\frac{\gamma }{\alpha }\right) +\frac{\lambda ^k}{\alpha ^k}\left( \frac{\gamma }{\alpha }-\beta \right) =0. \end{aligned}$$
(4.2)

Case I: \(k=1\). By a direct calculation, we get

$$\begin{aligned} U=\frac{\gamma +\beta \lambda }{\alpha +\lambda }. \end{aligned}$$

Since \(\gamma /\alpha <\beta \), it holds that \(U>\gamma /\alpha \), and this \(U\) satisfies (4.1).

Case II: \(k=2\). Since \(\gamma /\alpha <\beta \), we have \(\varDelta \triangleq \lambda ^2+4\alpha ^2\beta -4\alpha \gamma >\lambda ^2\) and Eq. (4.2) has the following two solutions:

$$\begin{aligned} U_1=\frac{\gamma }{\alpha }+\frac{\lambda \sqrt{\varDelta }-\lambda ^2}{2\alpha ^2},\quad U_2=\frac{\gamma }{\alpha }-\frac{\lambda \sqrt{\varDelta }+\lambda ^2}{2\alpha ^2}. \end{aligned}$$

Noting that \(U\ge \gamma /\alpha \) and \(U_2<\gamma /\alpha \), we have \(U=U_1\) or equivalently

$$\begin{aligned} U=\frac{\gamma }{\alpha }+\frac{2\lambda (\alpha \beta -\gamma )}{\alpha (\sqrt{\varDelta }+\lambda )} =\frac{\gamma }{\alpha }+\frac{2(\alpha \beta -\gamma )}{\alpha (\sqrt{\varDelta /\lambda ^2}+1)} =\frac{\gamma }{\alpha }+\frac{2(\alpha \beta -\gamma )}{\alpha (\sqrt{1+4 (\alpha ^2\beta -\alpha \gamma )/\lambda ^2}+1)}. \end{aligned}$$

From the above discussion, we know that the one-dimensional nonlinear Eq. (3.2) has a unique analytical solution when \(k=1\) and \(k=2\). In fact, when \(k=3\) and \(k=4\), Eq. (3.2) also has a unique analytical solution [6], but its expression is more complicated and the details are omitted here. When \(k\ge 5\), we may apply a Newton-like iteration to solve Eq. (4.1).
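Collecting the cases above into code, here is a hedged sketch of a scalar solver for (4.1): the \(k=1\) and \(k=2\) branches implement Cases I and II exactly, while the safeguarded Newton branch for larger \(k\) is our own illustration of the Newton-like iteration mentioned above:

```python
import math

def solve_scalar(alpha, beta, gamma, lam, k):
    """Solve (4.1): alpha*U - lam*[beta - U]_+^(1/k) = gamma, with alpha > 0."""
    if gamma / alpha >= beta:                 # penalty term inactive
        return gamma / alpha
    if k == 1:                                # Case I
        return (gamma + beta * lam) / (alpha + lam)
    if k == 2:                                # Case II, in the stable form above
        delta = lam**2 + 4 * alpha**2 * beta - 4 * alpha * gamma
        return gamma / alpha + 2 * (alpha * beta - gamma) / (
            alpha * (math.sqrt(delta / lam**2) + 1.0))
    # General k: the residual is strictly increasing with its root in
    # (gamma/alpha, beta), so we keep the Newton iterates inside that bracket.
    lo, hi = gamma / alpha, beta
    u = 0.5 * (lo + hi)
    for _ in range(200):
        r = alpha * u - lam * (beta - u) ** (1.0 / k) - gamma
        if r < 0.0:
            lo = u                            # root lies to the right of u
        else:
            hi = u                            # root lies to the left of u
        dr = alpha + (lam / k) * (beta - u) ** (1.0 / k - 1.0)
        u_new = u - r / dr
        if not (lo < u_new < hi):             # safeguard: bisect the bracket
            u_new = 0.5 * (lo + hi)
        if abs(u_new - u) < 1e-14:
            return u_new
        u = u_new
    return u
```

For instance, with \(\alpha =a_{ii}\), \(\beta =g_i\) and \(\gamma =b^\ell _i\), `solve_scalar` carries out the componentwise update (3.2) inside Algorithm 3.1.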

Note that the sequence \(\{W^\ell \}\) generated by Algorithm 3.1 is monotonically increasing and convergent to the solution of the penalized equation (1.1). It follows from Theorem 2.3 that we can choose the tolerance \(\epsilon =1/\lambda ^k\) in Step 3 of Algorithm 3.1.

5 Applications to American option pricing problems

In this section, we shall do some preliminary numerical experiments to verify the theoretical results and compare the performance of the proposed method with that of the method proposed in [12].

Consider an American put option with strike price \(K>0\) and maturity time \(T>0\). If the option is exercised when the underlying asset price is \(S\), the option holder receives the payoff

$$\begin{aligned} \varPsi (S)=(K-S)^+=\max (K-S, 0). \end{aligned}$$

Let \(V(S,t)\) be the option value at time \(t\in [0, T]\) when the asset price is \(S\). Introducing the time-reversal \(\tau =T-t\) and setting \(V(S,\tau )\triangleq V(S,T-t)\), the function \(V(S,\tau )\) satisfies the following form of the parabolic linear complementarity problem:

$$\begin{aligned}&V(S,\tau )\ge \varPsi (S),\nonumber \\&\displaystyle \frac{\partial V(S,\tau )}{\partial \tau }-{\fancyscript{L}}V(S,\tau )\ge 0,\nonumber \\&\displaystyle \left( \frac{\partial V(S,\tau )}{\partial \tau }-\fancyscript{L}V(S,\tau )\right) \cdot (V(S,\tau )-\varPsi (S))=0 \end{aligned}$$
(5.1)

a.e. in \(\varOmega \triangleq (0,+\infty )\times [0, T)\), with the initial condition

$$\begin{aligned} V(S,0)=\varPsi (S),\quad S\in (0,+\infty ) \end{aligned}$$

and the differential operator

$$\begin{aligned} \displaystyle \fancyscript{L}V(S,\tau )\triangleq \frac{1}{2}\sigma ^2S^2\frac{\partial ^2V(S,\tau )}{\partial S^2}+rS\frac{\partial V(S,\tau )}{\partial S}-rV(S,\tau ), \end{aligned}$$

where \(\sigma \) is the volatility of the underlying asset and \(r>0\) is the risk free interest rate. For computational purposes, problem (5.1) will be posed on the localized domain

$$\begin{aligned} (S, \tau )\in [0, S_\mathrm{max}]\times [0,T], \end{aligned}$$

where \(S_\mathrm{max}\) denotes a sufficiently large number to ensure the accuracy of the solution, and we impose the following boundary conditions \(V(0,\tau )=\varPsi (0)\) and \(V(S_\mathrm{max},\tau )=\varPsi (S_\mathrm{max})\) for \(\tau \in [0, T]\). From [12], the penalized form of (5.1) is

$$\begin{aligned} \frac{\partial V(S,\tau )}{\partial \tau }-\fancyscript{L}V(S,\tau )-\lambda [\varPsi (S)-V(S,\tau )]_+^{1/k}=0, \end{aligned}$$
(5.2)

where \(\lambda >0\) and \(k>0\).

5.1 Discretization

Following [1, 11, 12], we shall apply the finite volume discretization in space to (5.2). To this end, we let the interval \(I = (0, S_\mathrm{max})\) be divided into \(N\) subintervals,

$$\begin{aligned} I_i\triangleq [S_i,\ S_{i+1}],\quad i=0,1,\ldots ,N-1 \end{aligned}$$

with \(S_i=ih\) and \(h=S_\mathrm{max}/N\). We also let

$$\begin{aligned} S_{i-1/2}=(S_{i-1}+S_i)/2,\quad S_{i+1/2}=(S_i+S_{i+1})/2,\quad i=1,2,\ldots ,N-1. \end{aligned}$$

These midpoints form a second partition of \((0, S_\mathrm{max})\) if we define \(S_{-1/2}=S_0\) and \(S_{N+1/2}=S_N\). Integrating both sides of (5.2) over \((S_{i-1/2}, S_{i+1/2})\), we get

$$\begin{aligned}&\int _{S_{i-1/2}}^{S_{i+1/2}}\frac{\partial V(S,\tau )}{\partial \tau }dS-\left[ S\left( aS\frac{\partial V(S,\tau )}{\partial S}+bV(S,\tau )\right) \right] _{S_{i-1/2}}^{S_{i+1/2}}\\&\quad \displaystyle +\int _{S_{i-1/2}}^{S_{i+1/2}}cV(S,\tau )dS -\lambda \int _{S_{i-1/2}}^{S_{i+1/2}}[\varPsi (S)-V(S,\tau )]_+^{1/k}dS=0 \end{aligned}$$

for \(i=1,2,\ldots ,N-1\), where

$$\begin{aligned} a=\frac{1}{2}\sigma ^2,\quad b=r-\sigma ^2,\quad c=2r-\sigma ^2. \end{aligned}$$

Applying the midpoint quadrature rule to the first, third, and last terms, we obtain

$$\begin{aligned} h\frac{\partial V_i(\tau )}{\partial \tau }-\left[ S_{i+1/2}\rho (V)|_{S_{i+1/2}}-S_{i-1/2}\rho (V) |_{S_{i-1/2}}\right] +h\left[ cV_i(\tau )-\lambda [\varPsi _i -V_i(\tau )]_{+}^{1/k}\right] =0, \end{aligned}$$
(5.3)

where \(\varPsi _i=\varPsi (S_i), V_i(\tau )\) is the nodal approximation to \(V(S_i, \tau )\) to be determined, and \(\rho (V)\) is a flux associated with \(V\) defined by

$$\begin{aligned} \rho (V)\triangleq aS\frac{\partial V(S, \tau )}{\partial S}+bV(S, \tau ). \end{aligned}$$

To derive an approximation to the flux at the two endpoints \(S_{i-1/2}\) and \({S_{i+1/2}}\), let us consider the following two-point boundary-value problem

$$\begin{aligned} (aSu'+bu)'=0,\quad S\in I_i \end{aligned}$$

with

$$\begin{aligned} u(S_i)=V_i(\tau ),\quad u(S_{i+1})=V_{i+1}(\tau ). \end{aligned}$$

Denote \(\alpha \triangleq b/a=2(r/\sigma ^2-1)\). Solving the two-point boundary-value problem exactly, we obtain

$$\begin{aligned} \rho _i(V)=b\left[ S_{i+1}^{\alpha }V_{i+1}(\tau )-S_i^{\alpha } V_i(\tau )\right] /\left( S_{i+1}^{\alpha }-S_i^{\alpha }\right) ,\quad S\in I_i \end{aligned}$$
(5.4)

which provides an approximation to the flux \(\rho (V)\) at \({S_{i+1/2}}\). Similarly, we can define an approximation of the flux at \(S_{i-1/2}\).

The above analysis does not apply to the approximation of the flux on \((0, S_1)\) since the two-point boundary-value problem is degenerate there. To overcome this difficulty, let us reconsider it with an extra degree of freedom in the following form

$$\begin{aligned} (aSu'+bu)'=C,\quad S\in (0, S_1) \end{aligned}$$

with

$$\begin{aligned} u(0)=V_0,\quad u(S_1)=V_1, \end{aligned}$$

where \(C\) is a constant to be determined. This local problem has the following solution [12]

$$\begin{aligned} u(S)=V_0+(V_1-V_0)S/S_1,\quad S\in I_0, \end{aligned}$$

which yields the following approximation to the flux \(\rho (V)\) in \(I_0\)

$$\begin{aligned} \rho _0(V)=\frac{1}{2}\left[ (a+b)V_1(\tau )-(a-b)V_0(\tau )\right] . \end{aligned}$$
(5.5)

Substituting (5.4) and (5.5) into (5.3), we have

$$\begin{aligned} h\frac{\partial V_i(\tau )}{\partial \tau }+e_{i,i-1}V_{i-1}(\tau )+e_{i,i}V_{i}(\tau ) +e_{i,i+1}V_{i+1}(\tau )-\lambda h[\varPsi _i-V_i(\tau )]_+^{1/k}=0, \end{aligned}$$
(5.6)

where

$$\begin{aligned} e_{1,0}&=-S_1(a-b)/4,\\ e_{1,1}&=S_1(a+b)/4+bS_{1+1/2}S_1^{\alpha }/\left( S_2^{\alpha }-S_1^{\alpha }\right) +ch,\\ e_{1,2}&=-bS_{1+1/2}S_2^{\alpha }/\left( S_2^{\alpha }-S_1^{\alpha }\right) ,\\ e_{i,i-1}&=-bS_{i-1/2}S_{i-1}^{\alpha }/\left( S_{i}^{\alpha }-S_{i-1}^{\alpha }\right) ,\\ e_{i,i}&=bS_{i-1/2}S_i^{\alpha }/\left( S_{i}^{\alpha }-S_{i-1}^{\alpha }\right) +bS_{i+1/2}S_i^{\alpha }/\left( S_{i+1}^{\alpha }-S_i^{\alpha }\right) +ch,\\ e_{i,i+1}&=-bS_{i+1/2}S_{i+1}^{\alpha }/\left( S_{i+1}^{\alpha }-S_i^{\alpha }\right) \end{aligned}$$

for \(i=2,3,\ldots ,N-1\). These equations form an \((N-1)\)-dimensional nonlinear ODE system for \(V(\tau )\triangleq (V_1(\tau ),\ldots , V_{N-1}(\tau ))^T\) with the boundary conditions \(V_0(\tau )=\varPsi (0)\) and \(V_N(\tau )=\varPsi (S_\mathrm{max})\). Using \(e_{i,j}\), we define an \((N-1)\times (N-1)\) tridiagonal matrix \(E\triangleq (e_{i,j})\) and an \((N-1)\)-dimensional vector

$$\begin{aligned} f(\tau )=(-e_{1,0}V_0(\tau ),0,\ldots ,0,-e_{N-1,N}V_N(\tau ))^T. \end{aligned}$$

Then (5.6) can be rewritten as

$$\begin{aligned} h\frac{\partial V(\tau )}{\partial \tau }+EV(\tau )-\lambda h[\varPsi -V(\tau )]_+^{1/k}=f(\tau ), \end{aligned}$$
(5.7)

where \(\varPsi =(\varPsi _1,\ldots ,\varPsi _{N-1})^T\). Equation (5.7) is a first-order nonlinear ODE system. To discretize it, we let \(\tau _i, i=0, 1, \ldots , M\), be a set of partition points in \([0, T]\) satisfying \(0=\tau _0<\tau _1<\cdots <\tau _M=T\). Then, the application of the implicit timestepping scheme to (5.7) yields

$$\begin{aligned} h\frac{V^{m+1}-V^m}{\varDelta \tau _m}+EV^{m+1}-\lambda h[\varPsi -V^{m+1}]_+^{1/k}=f^{m+1}, \end{aligned}$$
(5.8)

where \(\varDelta \tau _m=\tau _{m+1}-\tau _m>0, V^{m+1}=V(\tau _{m+1})\) and \(f^{m+1}=f(\tau _{m+1})\). Letting

$$\begin{aligned} A=I+\frac{\varDelta \tau _m}{h}E,\quad b=\frac{\varDelta \tau _m}{h}f^{m+1}+V^m,\quad \lambda =\lambda \varDelta \tau _m, \end{aligned}$$

where \(I\) is an identity matrix, (5.8) reduces to (1.1). By [12], the matrix \(A\) is an M-matrix.
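Putting the pieces together, the following is a minimal sketch of the whole pricing procedure under the assumptions of this section (uniform time grid, \(r\ne \sigma ^2\) so that \(b\ne 0\), and \(S_\mathrm{max}>K\) so that \(\varPsi (S_\mathrm{max})=0\)). It reuses `jacobi_penalized` from the sketch in Sect. 3 (in practice the bisection inside it would be replaced by the closed-form `solve_scalar` of Sect. 4); all function names are ours:

```python
import numpy as np

def assemble_E(N, S_max, r, sigma):
    """Tridiagonal finite-volume matrix E = (e_ij) from Sect. 5.1 on the
    uniform grid S_i = i*h, h = S_max/N (assumes r != sigma**2, i.e. b != 0)."""
    a, b, c = 0.5 * sigma**2, r - sigma**2, 2.0 * r - sigma**2
    alp = b / a                                # alpha = 2*(r/sigma**2 - 1)
    h = S_max / N
    S = h * np.arange(N + 1)
    Sm = 0.5 * (S[:-1] + S[1:])                # Sm[i] = S_{i+1/2}
    E = np.zeros((N - 1, N - 1))
    E[0, 0] = S[1] * (a + b) / 4 + b * Sm[1] * S[1]**alp / (S[2]**alp - S[1]**alp) + c * h
    E[0, 1] = -b * Sm[1] * S[2]**alp / (S[2]**alp - S[1]**alp)
    for i in range(2, N):                      # rows i = 2, ..., N-1
        lo = b * Sm[i - 1] / (S[i]**alp - S[i - 1]**alp)
        up = b * Sm[i] / (S[i + 1]**alp - S[i]**alp)
        E[i - 1, i - 2] = -lo * S[i - 1]**alp
        E[i - 1, i - 1] = (lo + up) * S[i]**alp + c * h
        if i < N - 1:
            E[i - 1, i] = -up * S[i + 1]**alp  # e_{N-1,N} goes into f instead
    return E, h, S

def price_american_put(K, T, r, sigma, S_max, N, M, lam, k):
    """Implicit timestepping (5.8): each step reduces to (1.1) with
    A = I + (dt/h)E, right-hand side (dt/h)f + V^m and penalty lam*dt,
    and is solved by Algorithm 3.1."""
    E, h, S = assemble_E(N, S_max, r, sigma)
    a, b = 0.5 * sigma**2, r - sigma**2
    g = np.maximum(K - S[1:N], 0.0)            # payoff Psi at interior nodes
    fvec = np.zeros(N - 1)
    fvec[0] = S[1] * (a - b) / 4.0 * K         # -e_{1,0} * V_0 with V_0 = Psi(0) = K
    # the term -e_{N-1,N} * V_N vanishes since V_N = Psi(S_max) = 0 here
    dt = T / M
    A = np.eye(N - 1) + (dt / h) * E
    V = g.copy()                               # initial condition V(S, 0) = Psi
    for _ in range(M):
        rhs = (dt / h) * fvec + V
        # l_1 stage: its solution is a lower-solution of the power penalized
        # equation and serves as the initial iterate, as suggested in Sect. 4
        W1 = jacobi_penalized(A, g, rhs, lam * dt, 1, V, 1e-10)
        V = jacobi_penalized(A, g, rhs, lam * dt, k, W1, 1e-10)
    return S[1:N], V
```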

5.2 Numerical experiments

In this subsection, we set the parameters as follows:

$$\begin{aligned} K=100,\quad T=0.25,\quad r=0.1,\quad \sigma =0.2,\quad S_\mathrm{max}=1000. \end{aligned}$$

We first verify the monotone convergence of the power penalty method and compare the orders of convergence for \(k=1\) and \(k=2\). When \(k=1\), we call the method the \(l_1\) penalty method; when \(k=2\), we call it the \(l_{1/2}\) penalty method. In this experiment, we choose the grid step \(h=0.3125\); the corresponding number of nodes is \(3201\), that is, \(N=3200\). To determine numerically the computational orders of convergence of the penalty method with respect to \(\lambda \), we choose a sequence of \(\lambda _p\) with \(\lambda _p=2\lambda _{p-1}\) (\(p=2, 3, 4, 5, 6\)) and some given \(\lambda _1>1\). Table 1 (resp. Table 2) lists the value of the American put option at \(S=K\), \(t=0\) (i.e., \(\tau =T\)) [denoted by \(V_p(100,0)\)] and \(\Vert [g-V^n]_+\Vert _\infty \) at \(t=0\) [denoted by \(\mathrm{norm}_p([g-V]_+)\)] for each \(p\) and \(k=1\) (resp. \(k=2\)). We denote \(\varDelta _p V=V_p(100,0)-V_{p-1}(100,0)\), and

$$\begin{aligned} \mathrm{Ratio1}_p=\frac{\varDelta _{p-1}V}{\varDelta _pV},\quad \mathrm{Ratio2}_p=\frac{\mathrm{norm}_{p-1}([g-V]_+)}{\mathrm{norm}_p([g-V]_+)}. \end{aligned}$$
Table 1 Computed results by the \(l_1\) penalty method
Table 2 Computed results by the \(l_{1/2}\) penalty method

From Tables 1 and 2, we see that the computed rates of convergence are about 1 for the \(l_1\) penalty method and about 2 for the \(l_{1/2}\) penalty method. That is,

$$\begin{aligned} \mathrm{Ratio1}_p\approx \left( \frac{\lambda _p}{\lambda _{p-1}}\right) ^k\quad \mathrm{and}\quad \mathrm{Ratio2}_p\approx \left( \frac{\lambda _p}{\lambda _{p-1}}\right) ^k,\quad p=2,3,\ldots ,7. \end{aligned}$$

Moreover, the \(l_{1/2}\) penalty method generates a more accurate approximation to the true solution than the \(l_1\) penalty method does, which agrees well with the theoretical results obtained in Sect. 2. From Tables 1 and 2, we also see that the value of the American option at \(S=K\), \(t=0\) increases monotonically as \(\lambda \) increases, which shows the monotone convergence of the methods.
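For reference, the relation between the tabulated ratios and the convergence order \(k\) can be made explicit with a small helper (our own, purely illustrative):

```python
import math

def observed_order(ratio, lam_ratio=2.0):
    """Recover the order k from Ratio1_p or Ratio2_p via the relation
    Ratio_p ~ (lambda_p / lambda_{p-1})**k with lambda_p = lam_ratio * lambda_{p-1}:
    a ratio near 2 under doubling of lambda gives order ~1 (the l_1 method),
    and a ratio near 4 gives order ~2 (the l_{1/2} method)."""
    return math.log(ratio) / math.log(lam_ratio)
```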

In the second experiment, we compare the computational performance of the smoothing Newton method proposed in [12] with that of Algorithm 3.1 when applied to the \(l_{1/2}\) penalized equation. Note that the solution of the \(l_1\) penalized equation with the same penalty parameter \(\lambda \) is a lower-solution of the \(l_{1/2}\) penalized equation; we choose it as the initial iterate for both algorithms. We consider the \(l_{1/2}\) penalized equation with \(N=200, 400, 800, 1600\) and set the penalty parameter \(\lambda =4\) for these penalized equations. Table 3 lists the average number of iterations per timestep (denoted by Iter) and the total CPU time (denoted by CPU) for the two algorithms. In addition, the value of the American put option at \(S=K\), \(t=0\) [denoted by \(V(100,0)\)] is listed in Table 3.

Table 3 Comparison of computational cost for smoothing Newton method and Algorithm 3.1

Table 3 shows that the values of the American put option at \(S=K\), \(t=0\) computed by the two algorithms are almost the same, but the smoothing Newton method requires much more CPU time than Algorithm 3.1 does. Therefore, the proposed method is much more efficient than the smoothing Newton method.