1 Introduction

Variational inequalities, as a class of important nonlinear problems, are frequently encountered in various applied sciences, including physical, engineering, financial, and management sciences. In the past four decades, their mathematical theories have been established extensively and thoroughly (cf. [7, 25, 27, 34]). Moreover, many numerical methods have been developed for such nonlinear variational problems, for instance, the conforming element method (cf. [6, 16, 26, 28, 29]), the nonconforming element method (cf. [33]), the mixed element method (cf. [13, 17, 18, 31]), the finite volume method (cf. [42]), and the discontinuous Galerkin method (cf. [38]).

A new family of numerical methods, known as virtual element methods (VEMs), is gaining popularity in the area of numerical solution of differential equations. VEMs are advantageous in handling complex geometries and in solving problems with high-order differential equations. The methods were introduced in [2, 8, 10]. More recent theoretical developments of VEMs can be found in [11, 14, 19]. VEMs have been applied to solve a variety of problems in solid mechanics (e.g., [4, 5, 9]), fluid mechanics (e.g., [3, 12]), and so on. In [39], VEMs are applied to solve the obstacle problem, which is an elliptic variational inequality (EVI) of the first kind. In [40], VEMs are applied to solve a simplified friction problem, which is an EVI of the second kind. In this paper, we study VEMs to solve EVIs of the second kind. Unlike [40], the VEMs developed in this paper are directly implementable.

Motivated by the construction of VEMs [2, 8, 10], we first propose in this paper a general framework for the numerical solution of EVIs of the second kind. Under some reasonable assumptions, we provide error analysis for the numerical solution, including the influence of numerical integration of the nonsmooth term which is very important in actual computation. As examples of applications of this general framework, we construct and analyze VEMs for solving the simplified friction problem and the frictional contact problem.

Furthermore, we discuss how to apply a recently developed algorithm (cf. [41]), which is superlinearly convergent, to solve discrete problems arising from the VEM discretization of the previous two problems. A series of numerical results are also reported to illustrate computational performance of the VEMs proposed in this paper.

Finally, it should be emphasized that our general framework can be used for the numerical solution and error analysis of any EVI of the second kind, whenever the VEM related to the linear problem of the standard form

$$\begin{aligned} u\in V, \quad a(u,v)=f(v)\quad \forall \,v\in V \end{aligned}$$

is available.

The rest of this paper is organized as follows. An abstract method and its error analysis are given in Sect. 2. The two virtual element methods and their error estimates are developed in Sect. 3 for the simplified friction problem and the frictional contact problem, respectively. In Sect. 4, a detailed discussion is presented on solving the discrete problems in Sect. 3, based on a regularized semi-smooth Newton method with projection steps mentioned in [41]. Finally, several numerical results are presented in Sect. 5 to illustrate the performance of the VEMs proposed in this paper.

2 A General Framework for Numerical Solution of EVIs of the Second Kind

Let \(\Omega \subset \mathbb {R}^{2}\) be a bounded polygon with the Lipschitz boundary \(\partial \Omega \). Let V be a Hilbert space consisting of certain functions defined over \(\Omega \), which is equipped with the norm \(\Vert \cdot \Vert _V\). We use \(V'\) to denote the dual space of V and write \(\langle \cdot ,\cdot \rangle \) for the duality pairing between \(V'\) and V. Let \(a(\cdot ,\cdot ): V\times V\rightarrow \mathbb {R}\) be a continuous, coercive and symmetric bilinear form, \(j(\cdot ): V\rightarrow \mathbb {\overline{R}}=\mathbb {R}\cup \{\pm \infty \}\) a proper, convex and lower semi-continuous functional, and \(f(\cdot ): V\rightarrow \mathbb {R}\) a continuous functional. For \(v\in V\), we also write \(f(v)=\langle f, v\rangle \). An elliptic variational inequality (EVI) of the second kind can be described as follows ( [28]).

Problem 2.1

Find \(u\in V\) such that

$$\begin{aligned} a(u,v-u)+j(v)-j(u)\ge \langle f,v-u\rangle \quad \forall \, v \in V. \end{aligned}$$
(2.1)

This problem has a unique solution under the stated conditions on the problem data (cf. [28]). Let us introduce a general framework on the numerical solution of Problem 2.1. Let \({\mathcal T}_h:=\{K\}_{K\in {\mathcal T}_h}\) be a mesh of \(\Omega \) into polygons, with each generic element denoted by K; \(h:=\max _{K\in {\mathcal T}_h} h_K\) and \(h_K:=\text{ diam }(K)\). With this mesh, we associate a finite dimensional subspace \(V_h\) of V. For a non-negative integer m and an element \(K\in {\mathcal T}_h\), denote by \(\mathbb {P}_m(K)\) the set of all polynomials on K with the total degree no more than m. Moreover, we assume that the bilinear form \(a(\cdot ,\cdot )\) can be decomposed as

$$\begin{aligned} a(v,w)=\sum _{K\in {\mathcal T}_h} a^K(v,w),\quad v,w\in V, \end{aligned}$$

where \(a^K(\cdot ,\cdot )\) is a bilinear form over \(V_K:=V_{|K}\). For a function in V, we naturally view it as a function in \(V_K\) by its restriction to K. We equip the Hilbert space \(V_{|K}\) with a norm or semi-norm \(\Vert \cdot \Vert _{V,K}\) such that

$$\begin{aligned} \Vert v\Vert _V^2=\sum _{K\in {\mathcal T}_h} \Vert v\Vert _{V,K}^2\quad \forall \, v\in V, \end{aligned}$$
(2.2)

and for all \(K\in {\mathcal T}_h\) there holds

$$\begin{aligned} a^K(v,v)\lesssim \Vert v\Vert _{V,K}^2\quad \forall \, v\in V_K, \end{aligned}$$
(2.3)

where and in what follows, for any two quantities a and b, “\(a\lesssim b\)” stands for “\(a\le C b\)”, with C being a generic constant independent of \(h_K\) or h, which may take different values at different occurrences. Assume \(f_h\) is an approximation of f, and the bilinear form \(a_h(\cdot ,\cdot )\) is obtained by

$$\begin{aligned} a_h(u,v):=\sum _{K\in {\mathcal T}_h}a_h^{K}(u,v), \end{aligned}$$

where the symmetric bilinear form \(a_h^{K}(\cdot ,\cdot )\) has the following properties:

  • k-Consistency: For all \(p\in \mathbb {P}_k(K)\) and for all \(v_h \in V_{h|K},\)

    $$\begin{aligned} a_h^K(p,v_h)=a^K(p,v_h), \end{aligned}$$
    (2.4)

    where k is a given natural number.

  • Stability: There exist two positive constants \(\alpha _{\star }\) and \(\alpha ^{\star }\), independent of \(h_K\) and K, such that

    $$\begin{aligned} \alpha _{\star }a^K(v_h,v_h)\le a_h^K(v_h,v_h)\le \alpha ^{\star } a^K(v_h,v_h)\quad \forall \,v_h\in V_{h|K}. \end{aligned}$$
    (2.5)

Our general numerical method for Problem 2.1 is as follows.

Problem 2.2

Find \(u_h\in V_h\) such that

$$\begin{aligned} a_h(u_h,v_h-u_h)+j(v_h)-j(u_h)\ge \langle f_h,v_h-u_h\rangle \quad \forall \, v_h \in V_h. \end{aligned}$$
(2.6)

Remark 2.1

The construction of \(a_h(\cdot ,\cdot )\) is motivated by the ideas of the VEM introduced in [2, 8, 10]. Here, we first propose a general framework for the numerical solution of Problem 2.1. Based on this general framework, we can then easily devise and analyze the virtual element method for solving several concrete EVIs of the second kind in a unified way. For details, see Sect. 3.

Remark 2.2

Throughout this paper, we use the standard notation and symbols for Sobolev spaces and their norms and seminorms. For details, see [1]. As a typical example of the above general framework, let \(V:=H^1(\Omega )\) with the norm \(\Vert \cdot \Vert _V:=\Vert \cdot \Vert _{1,\Omega }\). Choose

$$\begin{aligned} a(v,w):=\int _{\Omega }\left( \nabla v\cdot \nabla w+vw\right) dx. \end{aligned}$$

Then \(V_K=H^1(K)\) equipped with the norm \(\Vert \cdot \Vert _{V,K}=\Vert \cdot \Vert _{1,K}\), and

$$\begin{aligned} a^K(v,w):=\int _{K}\left( \nabla v\cdot \nabla w+vw\right) dx. \end{aligned}$$

In order to derive error estimates for the method (2.6), we make two assumptions as follows.

Assumption A1. For each \(K\in \mathcal{T}_h\), for every \(v\in H^{k+1}(K)\), there exists a function \(v_\pi \in \mathbb P_k(K)\) such that

$$\begin{aligned} \Vert v-v_\pi \Vert _{0,K}+h_K\Vert v-v_\pi \Vert _{V,K}\lesssim h_K^{k + 1}|v|_{k+1,K},\quad v\in H^{k+1}(K). \end{aligned}$$
(2.7)

Assumption A2. For each \(K\in \mathcal{T}_h\), there exists an interpolation operator \(I_K: H^{k+1}(K)\rightarrow V_{h|K}\) satisfying the following estimate:

$$\begin{aligned} \Vert v-I_Kv\Vert _{0,K}+h_K\Vert v- I_Kv\Vert _{V,K}\lesssim h_K^{k + 1}|v|_{k+1,K},\quad v\in H^{k+1}(K). \end{aligned}$$
(2.8)

Moreover, \(v_I\in V_h\) if v is additionally in V, where \(v_I\) is defined piecewise by \(v_I=I_K v\) on K, for each \(K\in \mathcal{T}_h\).

For later uses, we always write \(v_I\) as the global interpolant of v whenever \(v\in H^{k+1}(\Omega )\).

The following result can be viewed as Céa’s lemma for the numerical solution defined by Problem 2.2 for solving 2.1 (cf. [15, 22]), which is useful in our further error estimates for the virtual element method of Problem 2.1.

Theorem 2.1

If Assumptions A1A2, (2.4) and (2.5) hold and the true solution u of Problem 2.1 belongs to \(H^{k+1}(\Omega )\) for some natural number k, then there holds

$$\begin{aligned} \Vert u-u_h\Vert _{V}&\le \Vert u-u_{I}\Vert _{V}+\Vert u_{I}-u_h\Vert _{V}\\&\lesssim h^k|u|_{k+1,\Omega }+\Vert f-f_h\Vert _{V'_h}+|R_1(u,u_{I})|^{1/2}, \end{aligned}$$

where

$$\begin{aligned} \Vert f-f_h\Vert _{V'_h}&:=\sup _{v_h\in V_h}\frac{\langle f-f_h,v_h\rangle }{\Vert v_h\Vert _V},\\ R_1(u,u_{I})&: = a(u,u_{I}- u) + j(u_I) - j(u) - \left\langle f,u_I- u \right\rangle . \end{aligned}$$

Proof

Denote \(w=u_{I}-u_h\). By the coerciveness of \(a(\cdot ,\cdot )\) and the stability condition for \(a_h^K(\cdot ,\cdot )\), there exists a positive constant \(\alpha \) independent of h such that

$$\begin{aligned} \alpha \Vert w\Vert _V^2\le a_h(u_I,w)- a_h(u_h,w). \end{aligned}$$
(2.9)

Because of the k-consistency (2.4),

$$\begin{aligned} a_h^K(u_\pi ,v_h)=a^K(u_\pi ,v_h) \quad \forall \, v_h\in V_{h|K}, \end{aligned}$$

and we can write

$$\begin{aligned} a_h(u_{I},w)=\sum _{K\in {\mathcal T}_h}\big (a_h^K(I_Ku-u_\pi ,w)+a^K(u_\pi - u,w) \big ) +a(u,w). \end{aligned}$$
(2.10)

Observe that

$$\begin{aligned} a(u,w)= & {} a(u,u_{I}-u)+a(u,u-u_h)\nonumber \\\le & {} a(u,u_{I}-u)+j(u_h)-j(u)-\langle f,u_h-u\rangle . \end{aligned}$$
(2.11)

From the discrete variational inequality (2.6),

$$\begin{aligned} a_h(u_h,w)\ge j(u_h)-j(u_{I})+\langle f_h,u_I-u_h\rangle . \end{aligned}$$
(2.12)

Using (2.10), (2.11) and (2.12) in (2.9), we obtain

$$\begin{aligned} \alpha \Vert w\Vert _V^2\le \sum _{K\in {\mathcal T}_h}\big (a_h^K(I_Ku-u_\pi ,w)+a^K(u_\pi - u,w) \big ) +\langle f-f_h,w\rangle +R_1(u,u_{I}).\nonumber \\ \end{aligned}$$
(2.13)

It follows from the Cauchy–Schwarz inequality, (2.5) and (2.3) that

$$\begin{aligned}&\sum _{K\in {\mathcal T}_h}\big (a_h^K(I_Ku-u_\pi ,w)+a^K(u_\pi - u,w) \big )\\&\quad \lesssim \left( \sum _{K\in {\mathcal T}_h}\Vert I_Ku-u_\pi \Vert _{V,K}^2 +\sum _{K\in {\mathcal T}_h}\Vert u-u_\pi \Vert _{V,K}^2\right) ^{1/2}\Vert w\Vert _{V}. \end{aligned}$$

So from (2.13), we have

$$\begin{aligned} \Vert w\Vert _V^2\lesssim & {} \left[ \left( \sum _{K\in {\mathcal T}_h}\Vert I_Ku-u_\pi \Vert _{V,K}^2 +\sum _{K\in {\mathcal T}_h}\Vert u-u_\pi \Vert _{V,K}^2\right) ^{1/2}+\Vert f-f_h\Vert _{V'_h}\right] \,\Vert w\Vert _V\nonumber \\&+ |R_1(u,u_{I})|. \end{aligned}$$
(2.14)

Applying the elementary result:

$$\begin{aligned} a,b,x\ge 0\ \mathrm{and}\ x^2\le a\,x+b\quad \Longrightarrow \quad x^2\le a^2+2\,b, \end{aligned}$$

we find from (2.14) that

$$\begin{aligned} \Vert w \Vert _V^2\lesssim & {} \sum _{K\in {\mathcal T}_h}\Vert I_Ku-u_\pi \Vert _{V,K}^2 +\sum _{K\in {\mathcal T}_h}\Vert u-u_\pi \Vert _{V,K}^2\nonumber \\&+ \,\Vert f-f_h\Vert _{V'_h}^2 + |R_1(u,u_{I})|. \end{aligned}$$
(2.15)

Furthermore, we have by the estimates (2.7), (2.8) and the relation (2.2) that

$$\begin{aligned}&\left( \sum _{K\in {\mathcal T}_h}\Vert u-u_\pi \Vert _{V,K}^2\right) ^{1/2} \lesssim h^{k}|u|_{k+1,\Omega },\\&\Vert u-u_{I}\Vert _{V}=\left( \sum _{K\in {\mathcal T}_h}\Vert u-I_Ku\Vert _{V,K}^2\right) ^{1/2} \lesssim h^{k}|u|_{k+1,\Omega }. \end{aligned}$$

Noting that

$$\begin{aligned} \Vert u-u_h\Vert _{V}\le \Vert u-u_{I}\Vert _{V}+\Vert u_{I}-u_h\Vert _{V}, \end{aligned}$$

we can obtain readily the desired result by using the last two inequalities and the estimate (2.15). \(\square \)

In actual computation, we often require to carry out numerical integration for the quantity \(j(v_h)\). In other words, \(j(v_h)\) is replaced by another quantity \(j_h(v_h)\). So instead of Problem 2.2, we actually solve the following problem.

Problem 2.3

Find \(u_h\in V_h\) such that

$$\begin{aligned} a_h(u_h,v_h-u_h)+j_h(v_h)-j_h(u_h)\ge \langle f_h,v_h-u_h\rangle \quad \forall \, v_h \in V_h. \end{aligned}$$
(2.16)

In this case, using the similar argument for deriving the above theorem, we can achieve the related error estimate under the following condition:

$$\begin{aligned} j_h(v_h)\ge j(v_h),\quad v_h\in V_h. \end{aligned}$$
(2.17)

Theorem 2.2

Assume A1A2, (2.4), (2.5) and (2.17) hold true, and the true solution u of (2.1) belongs to \(H^{k+1}(\Omega )\) for some natural number k. Let \(u_h\) be the solution of Problem 2.3. Then

$$\begin{aligned} \Vert u-u_h\Vert _{V}\lesssim h^k|u|_{k+1,\Omega }+\Vert f-f_h\Vert _{V'_h}+|R(u,u_{I})|^{1/2}, \end{aligned}$$
(2.18)

where

$$\begin{aligned} R(u,u_{I}): = a(u,u_{I}- u) + j_h(u_{I}) - j(u) - \left\langle f,u_{I}- u \right\rangle . \end{aligned}$$
(2.19)

Proof

Take \(v_h=u_I\) in (2.16),

$$\begin{aligned} a_h(u_h,u_I-u_h)+j_h(u_I)-j_h(u_h)\ge \langle f_h,u_I-u_h\rangle , \end{aligned}$$
(2.20)

which implies

$$\begin{aligned} a_h(u_h,w)\ge j_h(u_h)-j_h(u_{I})+\langle f_h,u_I-u_h\rangle , \end{aligned}$$

where \(w:=u_{I}-u_h\). Arguing as in the derivation of (2.15), we have

$$\begin{aligned} \Vert w\Vert _V^2\lesssim & {} \sum _{K\in {\mathcal T}_h}\Vert I_Ku-u_\pi \Vert _{V,K}^2 +\sum _{K\in {\mathcal T}_h}\Vert u-u_\pi \Vert _{V,K}^2\nonumber \\&+\, \Vert f-f_h\Vert _{V'_h}^2 + |R(u,u_{I})|+j(u_h)-j_h(u_h). \end{aligned}$$
(2.21)

Finally, noting the assumption (2.17) and following the same arguments after (2.15) in the proof of Theorem 2.1, we are able to derive the required result. \(\square \)

3 Virtual Element Methods for EVIs of Second Kind

In this section, we intend to design and analyze some virtual element methods for solving two EVI problems, based on the framework in the last section. The partition \(\mathcal{T}_h=\left\{ K\right\} _{K\in \mathcal{T}_h}\) of the domain \(\Omega \) has been introduced in Sect. 2. Following [19], we make the following assumption for the mesh \({\mathcal T}_h\).

Assumption A. For each \(K\in {\mathcal T}_h\), there exists a “virtual triangulation” \({\mathcal T}_K\) of K such that \({\mathcal T}_K\) is uniformly shape regular and quasi-uniform. The corresponding mesh size of \({\mathcal T}_K\) is bounded below by a constant multiple of \(h_K\). Each edge of K is a side of a triangle in \({\mathcal T}_K\).

It is evident to check that the above assumption covers the usual conditions satisfied by \(K\in {\mathcal T}_h\), given as follows (cf. [2, 8, 10]).

C1 :

There exists a real number \(\gamma >0\) such that for each element \(K\in \mathcal {T}_h\), it is star-shaped with respect to a disk of radius \(\rho _K\ge \gamma h_K\).

C2 :

There exists a real number \(\gamma _1>0\) such that for each element \(K\in \mathcal {T}_h\), the distance between any two vertices of K is \(\ge \gamma _1 h_K\).

We comment that under the Assumption A, it is easy to derive the estimate (2.7) using the classical Scott–Dupont theory in the case \(V=H^1(\Omega )\) (cf. [15]), and if the virtual element space \(V_h\) is taken as those given in [2, 8, 10], the estimate (2.8) always holds by choosing \(I_K\) as the related nodal interpolation operator (cf. [14, 19]).

Note that in general we can not achieve optimal error estimates for variational inequalities with high order elements due to the lack of sufficient solution regularity and the inequality feature of the problem. Hence, we restrict our attention to the lowest order virtual method introduced in [8]. From now on, we always assume the mesh \({\mathcal T}_h\) satisfies the Assumption A.

3.1 The Simplified Friction Problem

Let \(\Omega \) be a bounded polygonal domain of \(\mathbb {R}^2, f\in L^2(\Omega )\) and \(g>0\) a constant. Decompose the boundary \(\partial \Omega \) into two subsets \(\Gamma _D\) and \(\Gamma _C\), which is aligned with the mesh \({\mathcal T}_h\). Then, the boundary value problem related to a simplified friction problem can be expressed as (cf. [28])

$$\begin{aligned} \left\{ \begin{aligned}&-\Delta u+u=f\quad {\mathrm{in}}\ \Omega ,\\&|\partial _{\varvec{\nu }}u|\le g,~u\partial _{\varvec{\nu }}u+g|u|=0\quad {\mathrm{on}}\ \Gamma _C,\\&u=0\quad {\mathrm{on}}\ \partial \Omega \backslash \Gamma _C . \\ \end{aligned} \right. \end{aligned}$$
(3.1)

Define \(V=\{v\in H^1(\Omega )\mid v=0\ {\mathrm{on}}\ \Gamma _D\}\) equipped with the norm \(\Vert \cdot \Vert _V=\Vert \cdot \Vert _{1,\Omega }\). Then, the variational inequality related to (3.1) reads

$$\begin{aligned} u \in V,\quad a(u,v-u)+j(v)-j(u)\ge \langle f,v-u\rangle \quad \forall \, v \in V, \end{aligned}$$
(3.2)

where

$$\begin{aligned} a(u,v):= \int _\Omega \left( \nabla u \cdot \nabla v + uv\right) dx,\quad f(v):= \int _\Omega fv\,dx, \quad j(v) := \int _{\Gamma _C} g\left| v \right| ds. \end{aligned}$$

Now, let us define a virtual element space of a lowest order (\(k=1\)) as follows. For all \(K\in {\mathcal T}_h\), as shown in [8],

$$\begin{aligned} V_1(K) :=\left\{ v\in H^1(K)\mid \Delta v=0\ \mathrm{in}\ K,\ v|_{\partial K}\in C(\partial K), \,v|_e\in P_1(e) ~{\mathrm{for~ each}}~ e\in \partial K\right\} ,\nonumber \\ \end{aligned}$$
(3.3)

with the function values at the vertices of K as a set of degrees of freedom. Then the virtual element space \(V_h\) is defined by

$$\begin{aligned} V_h=\{v\in C(\overline{\Omega })\mid v|_K\in V_1(K) \ \forall \, K \in {\mathcal T}_h\}\cap V. \end{aligned}$$

Next, introduce a local \(H^1\) projection \(\Pi _1^\nabla :{H^1}(K)\rightarrow \mathbb {P}_1(K)\) as follows. For \(v\in H^1(K)\),

$$\begin{aligned} \left\{ \begin{aligned}&(\nabla \Pi _1^\nabla v,\nabla p)_K=(\nabla v,\nabla p)_K\quad \forall \, p\in \mathbb {P}_1(K),\\&\overline{\Pi _1^\nabla v}=\overline{v}, \end{aligned} \right. \end{aligned}$$

where \((\cdot ,\cdot )_K\) is the \(L^2(K)\) inner product, and \(\overline{v}\) is the integral average of v on the boundary \(\partial K\) of K. To simplify the presentation, we also use \(\Pi _1^\nabla \) to represent the related element-wise defined global operator.

In addition, set

$$\begin{aligned} a_h(u,v):&= (\nabla \Pi _1^\nabla u,\nabla \Pi _1^\nabla v)+(\Pi _1^\nabla u,\Pi _1^\nabla v) + \sum \limits _{K \in {\mathcal T}_h} {S_K}(u,v), \end{aligned}$$
(3.4)
$$\begin{aligned} S_K(u,v):&=\sum _{i=1}^{n^K}(1+h^2_K)\chi _i^K((I-\Pi _1^\nabla )u)\cdot \chi _i^K((I-\Pi _1^\nabla )v), \end{aligned}$$
(3.5)

where \((\cdot ,\cdot )\) denotes the usual \(L^2(\Omega )\) inner product, \(n^K\) denotes the number of vertices of K, while \(\{\chi _i^K(\cdot )\}\) are the function values at the vertices of K. It is routine to examine that the third term in (3.4) can be assembled as a diagonal bilinear form in terms of the degrees of freedom \(\{\chi _i(\cdot )\}\) of \(V_h\).

On the other hand, we choose the approximation \(f_h\) of f such that (cf. [19])

$$\begin{aligned} \langle {f_h},{v_h}\rangle := \sum _{K\in \mathcal T_h}\int _Kf\cdot \Pi _1^\nabla v_hdx,\quad v_h\in V_h. \end{aligned}$$
(3.6)

It is easy to see that this quantity can be expressed in terms of the degrees of freedom of \(V_h\). Moreover, we have by the error estimate for the operator \(\Pi _1^\nabla \) (cf. [19]) and the Cauchy–Schwarz inequality that

$$\begin{aligned} |\langle f-f_h,v_h\rangle |&=\Big |\sum _{K\in \mathcal T_h}\int _K f\cdot (v_h-\Pi _1^\nabla v_h)dx\Big | \le \sum _{K\in \mathcal T_h} \Vert f\Vert _{0,K}\Vert v_h-\Pi _1^\nabla v_h\Vert _{0,K}\\&\lesssim \sum _{K\in \mathcal T_h} \Vert f\Vert _{0,K}h_K|v_h|_{1,K}\lesssim h \Vert f\Vert _{0,\Omega } \Vert v_h\Vert _V, \end{aligned}$$

i.e.,

$$\begin{aligned} \Vert f-f_h\Vert _{V'_h} \lesssim h. \end{aligned}$$
(3.7)

As for the nonlinear functional \(j(\cdot )\), we use the trapezoidal rule to carry out numerical integration at each edge of \({\mathcal T}_h\) on \(\Gamma _C\), to get its approximation \(j_h(\cdot )\). In detail, let \(\{P_i\}_{i=1}^{m_0}\) be the nodes of \(\mathcal T_h\) on \(\Gamma _C\), which are numbered such that each line segment \(e_i:=\overline{P_iP_{i+1}}\), with length \(h_i\), is an edge of a polygon \(K\in \mathcal T_h\), lying on \(\Gamma _C\). Then, for all \(v_h\in V_h\),

$$\begin{aligned} j_h(v_h):=g\sum _{i=1}^{m_0-1}\frac{h_i}{2} (|v_h(P_i)|+|v_h(P_{i+1})|). \end{aligned}$$
(3.8)

Now, we are ready to present a VEM for solving (3.2), described as follows.

$$\begin{aligned} u_h \in V_h,\quad a_h(u_h,v_h-u_h)+j_h(v_h)-j_h(u_h)\ge \langle f_h,v_h-u_h\rangle \quad \forall \,v_h\in V_h. \end{aligned}$$
(3.9)

Theorem 3.1

Assume that \(f\in L^2(\Omega )\), and the exact solution u of problem (3.2) belong to \(H^2(\Omega )\) and \({u}_{|\Gamma _i}\in H^2(\Gamma _i)\), with \(\{\Gamma _i\}_{i=1}^{n_0}\) being the edges of \(\partial \Omega \) on \(\Gamma _C\). Moreover, u changes its sign only finite many times on \(\Gamma _C\). Let \(a_h(\cdot ,\cdot )\), \(f_h(\cdot )\) and \(j_h(\cdot )\) be given by (3.4), (3.6) and (3.8), respectively. Then

$$\begin{aligned} \Vert u-u_h\Vert _{V}\lesssim h, \end{aligned}$$

where \(u_h\) is the solution of the method (3.9).

Proof

It is shown in [2, 8, 19] that the above bilinear form \(a_h(\cdot ,\cdot )\) satisfies conditions (2.4) and (2.5). Since the mesh satisfies the Assumption A, Assumptions A1 and \(\mathbf{A2}\) naturally hold (cf. [14, 15, 19]). On the other hand, it is easy to check that

$$\begin{aligned} j_h(v_h)\ge j(v_h)\quad \forall \, v_h\in V_h. \end{aligned}$$

Hence, all the assumptions in Theorem 2.2 hold, and we are able to bound the error of \(u_h\) by means of estimates (2.18)–(2.19).

On the other hand, recalling the construction of the nodal interpolation operator \(I_K\) (cf. [14, 19]), \(u_I\) is equal to the usual continuous \(\mathbb {P}_1\) interpolant on \(\Gamma _C\). Then, under the assumptions of u, following the similar argument for deriving [30, Theorem 3.2] and using the error estimate for the interpolation operator for Lagrange elements (cf. [15, 22]), we find

$$\begin{aligned} |j_h(u_I)-j(u)|\lesssim h^2\sum _{i=1}^{n_0}\Vert u\Vert _{W^{1,\infty }(\Gamma _i)} +\Vert u-u_I\Vert _{L^1(\Gamma _C)}\lesssim h^2 \sum _{i=1}^{n_0} \Vert u\Vert _{2,\Gamma _i}. \end{aligned}$$
(3.10)

By integration by parts, using the first equation in (3.1), the property \(u_I-u=0\) on \(\partial \Omega \backslash \Gamma _C\) and the Cauchy–Schwarz inequality, we have

$$\begin{aligned} |a(u,u_I-u)-\langle f,u_I-u\rangle |\le \Vert \partial _{\varvec{\nu }} u\Vert _{0,\Gamma _C}\Vert u-u_I\Vert _{0,\Gamma _C}. \end{aligned}$$

Furthermore, invoking the error estimates for the nodal interpolation operator of Lagrange elements on \(\Gamma _C\), we find

$$\begin{aligned} |a(u,u_I-u)-\langle f,u_I-u\rangle |\lesssim h^2. \end{aligned}$$
(3.11)

Inserting the estimates (3.7), (3.10) and (3.11) into (2.18) and (2.19), we obtain the desired result. \(\square \)

Remark 3.1

It deserves to point out some difference between our method (3.9) and the method studied in [40] to solve the simplified friction problem. For our method, the discrete right-hand side is computable with respect to the degrees of freedom of the VEM space. Moreover, the nonsmooth term \(j(\cdot )\) is approximated through numerical integration so that the discrete problem can be solved efficiently by a fast algorithm developed in Sect. 4 while the proposed numerical method maintains the optimal convergence order.

3.2 The Frictional Contact Problem

In this subsection, we consider the deformation of an elastic body occupying a bounded domain \(\Omega \) with a Lipschitz boundary \(\Gamma \), with the physical setting depicted as in Fig. 1.

Fig. 1
figure 1

Physical setting of the frictional contact problem

In this model, \(\mathbb {C}:\mathbb {S}^2 \rightarrow \mathbb {S}^2\) is the fourth order elasticity tensor, which is bounded, symmetric and positive definite in \(\Omega \). Here, \(\mathbb {S}^2\) denotes the set of all symmetric second order tensors in \(\mathbb {R}^2\). The boundary is made up of \(\Gamma _D\), \(\Gamma _N\) and \(\Gamma _C\), where \(\mathrm{meas}(\Gamma _D)>0\). We assume that the body is clamped on \(\Gamma _D\). The surface traction of density \({\varvec{f}}_2\in (L^2(\Gamma _N))^2\) is applied to \(\Gamma _N\). \(\Gamma _C\) is the contact surface with a rigid foundation. Volume forces of density \({\varvec{f}}_1\in (L^2(\Omega ))^2\) act in \(\Omega \).

The mathematical model to this physical problem can be described as follows (cf. [25]).

$$\begin{aligned} \left\{ \begin{aligned}&-{\mathrm{div}}{\varvec{\sigma }}={\varvec{f}}_1\quad {\mathrm{in}}\ \Omega ,\\&{\varvec{\sigma }}=\mathbb {C}{\varvec{\varepsilon }}\quad {\mathrm{in}}\ \Omega ,\\&{\varvec{u}}={\varvec{0}}\quad {\mathrm{on}}\ \Gamma _D,\\&{\varvec{\sigma {\nu }}}={\varvec{f}}_2\quad {\mathrm{on}}\ \Gamma _N,\\&u_\nu =0\quad {\mathrm{on}}\ \Gamma _C,\\&|{\varvec{\sigma _{\tau }}}|\le g\quad {\mathrm{on}}\ \Gamma _C,\\&|{\varvec{\sigma _{\tau }}}|<g\ \Rightarrow \ {\varvec{u_\tau }}={\varvec{0}}\quad {\mathrm{on}}\ \Gamma _C,\\&|{\varvec{\sigma _{\tau }}}|=g\ \Rightarrow \ {\varvec{u_\tau }}=-\lambda {\varvec{\sigma _{\tau }}}\quad {\mathrm{on}}\ \Gamma _C. \end{aligned} \right. \end{aligned}$$
(3.12)

Here, for a vector \({\varvec{v}}\), denote on the boundary \(\partial \Omega \) by \(v_{\nu }={\varvec{v\cdot \nu }}\) its normal component and \({\varvec{v_{\tau }}}={\varvec{v}}-v_{\nu }{\varvec{v}}\) the tangential component, respectively. Similarly, for a tensor \({\varvec{\sigma }}={\mathbb {C}}{\varvec{\varepsilon (u)}}\), \({\varvec{\sigma }}\in \mathbb {S}^2\), define its normal component as \(\sigma _{\nu }={\varvec{\sigma \nu \cdot \nu }}\) and tangential component as \({\varvec{\sigma _{\tau }}}={\varvec{\sigma \nu }}-\sigma _{\nu }{\varvec{\nu }}\).

Let

$$\begin{aligned} {\varvec{V}}:=\{{\varvec{v}}\in (H^1(\Omega ))^2\mid {\varvec{v}}|_{\Gamma _D}={\varvec{0}},\, v_{\nu }|_{\Gamma _C}=0\}, \end{aligned}$$

equipped with the inner product and the induced norm given by

$$\begin{aligned} ({\varvec{u}},{\varvec{v}})_{{\varvec{V}}}:=\int _{\Omega }{\varvec{\varepsilon }}({\varvec{u}}):{\varvec{\varepsilon }}({\varvec{v}})dx, \quad \Vert {\varvec{v}}\Vert _{{\varvec{V}}}:=({\varvec{v}},{\varvec{v}})_{{\varvec{V}}}^{1/2}. \end{aligned}$$

Since meas(\(\Gamma _D)>0\), Korn’s inequality holds ([35]) so that \(\Vert {\varvec{v}}\Vert _{V}\) is a norm over \({\varvec{V}}\), equivalent to the standard norm \(\Vert {\varvec{v}}\Vert _{1,\Omega }\). Over the space \({\varvec{V}}\), let

$$\begin{aligned} a({\varvec{u}}, {\varvec{v}})&:=\int _{\Omega }\mathbb {C}{\varvec{\varepsilon }}({\varvec{u}}):{\varvec{\varepsilon }}({\varvec{v}})\,dx, \end{aligned}$$
(3.13)
$$\begin{aligned} f({\varvec{v}})&:=\int _{\Omega }{\varvec{f}}_1 \cdot {\varvec{v}}\,dx+\int _{\Gamma _N}{\varvec{f}}_2\cdot {\varvec{v}}\,ds, \end{aligned}$$
(3.14)
$$\begin{aligned} j({\varvec{v}})&:=\int _{\Gamma _C}g\,|{\varvec{v}}_{{\varvec{\tau }}}|\,ds. \end{aligned}$$
(3.15)

Then the variational formulation of (3.12) is to find \({\varvec{u}}\in {\varvec{V}}\) such that

$$\begin{aligned} a({\varvec{u}},{\varvec{v}} - {\varvec{u}})+j({\varvec{v}})-j({\varvec{u}})\ge f ({\varvec{v}}-{\varvec{u}})\quad \forall \,{\varvec{v}} \in {\varvec{V}}. \end{aligned}$$
(3.16)

Next, we choose

$$\begin{aligned} W_h:=\{v\in C(\overline{\Omega })\mid v|_K\in V_1(K)\ \forall \, K \in {\mathcal T}_h\}, \end{aligned}$$

and define

$$\begin{aligned} \varvec{V}_h=(W_h)^2\cap \varvec{V}. \end{aligned}$$

This is the virtual element space we want to use for numerically solving problem (3.16).

Furthermore, we briefly describe how to construct the bilinear form \(a_h^K(\cdot ,\cdot )\). Let \({\varvec{V}}_h(K):=(V_1(K))^2\). Let \({\varvec{\varPi }}_K\) be a projection operator from \({\varvec{V}}_{h}(K)\) into \(\mathbb {P}_{0}(K)^{2\times 2}_{sym}\) such that for any given \({\varvec{v}}_h\in {\varvec{V}}_h(K)\),

$$\begin{aligned} \int _{K}{\varvec{\varPi }}_K({\varvec{v}}_{h}):{\varvec{\varepsilon }}^P dx=\int _{K}{\varvec{\varepsilon }} ({\varvec{v}}_{h}):{\varvec{\varepsilon }}^{P}dx \quad \forall \, {\varvec{\varepsilon }}^{P}\in \mathbb {P}_{0}(K)^{2\times 2}_{sym}, \end{aligned}$$

where \(\mathbb {P}_{0}(K)^{2\times 2}_{sym}\) stands for the set of all second order symmetric tensor fields with each entry being constant. Intuitively, \({\varvec{\varPi }}_K({\varvec{v}}_{h})\) is a constant projection of the strain field \({\varvec{\varepsilon }}({\varvec{v}}_h)\) over K. Then, following [4, Eq. (12)], define

$$\begin{aligned} a_h^{K}({\varvec{v}}_h, {\varvec{w}}_h)=\int _{K}\mathbb {C}{\varvec{\varPi }}_{K}({\varvec{v}}_h):{\varvec{\varPi }}_{K}({\varvec{w}}_h)\,dx+b_h^K({\varvec{v}}_h,{\varvec{w}}_h) \quad \forall \,{\varvec{v}}_h,\, {\varvec{w}}_h \in {\varvec{V}}_h(K),\nonumber \\ \end{aligned}$$
(3.17)

where the second term is a stabilization term. We mention that the first term on the right of (3.17) is essentially equivalent to the first term given in equation (4.1) of the paper [9]. However, the construction of \(b^K_h(\cdot ,\cdot )\) is rather involved, requiring that \( b_h^K(\cdot ,\cdot )\) is a symmetric and positive semidefinite bilinear form whose kernel is exactly \((\mathbb {P}_1(K))^2\) (cf. [9, pp. 808–809]). To simplify the presentation, we refer to [4, 9] for details along this line.

Moreover, similar to the simplified friction problem, we also use the trapezoidal rule to produce its approximation \(j_h(\cdot )\). In other words, for all \({\varvec{v}}\in {\varvec{V}}_h\),

$$\begin{aligned} j_h({\varvec{v}}):=g\sum _{i=1}^{m_0-1}\frac{h_i}{2} (|{\varvec{v}}_{{\varvec{\tau }}}(P_i)|+|{\varvec{v}}_{{\varvec{\tau }}}(P_{i+1})|). \end{aligned}$$
(3.18)

For the right-hand side f, similar to the treatment for the above problem (cf. (3.6)), we define the approximation \(f_h\) such that

$$\begin{aligned} f_h({\varvec{v}} _h):= \sum _{K\in \mathcal T_h}\int _K {\varvec{f}} _1 \cdot {\varvec{\Pi }}_1^\nabla {\varvec{v}}_hdx +\int _{\Gamma _N}{\varvec{f}}_2\cdot {\varvec{v}}_h dx,\quad {\varvec{v}}_h\in {\varvec{V}}_h, \end{aligned}$$
(3.19)

where \( {\varvec{\Pi }}_1^\nabla \) is the vectorized analog of \(\Pi _1^\nabla \), i.e., for all \({\varvec{v}}=(v_1,v_2)^T\),

$$\begin{aligned} {\varvec{\Pi }}_1^\nabla {\varvec{v}}:=(\Pi _1^\nabla v_1, \Pi _1^\nabla v_2)^T. \end{aligned}$$

We remark that the same notation applies to Sobolev spaces as well, e.g., \({\varvec{H}}^2(\Omega ):=(H^2(\Omega ))^2\), etc.

It is easy to see that \(f_h({\varvec{v}}_h)\) is computable because \({\varvec{\Pi }}_1^\nabla \) is computable and \({\varvec{v}}_h\) is piecewise linear function on \(\Gamma _N\). Moreover, using the similar argument for getting (3.7) we know

$$\begin{aligned} \Vert f-f_h\Vert _{{{\varvec{V}}}'_h} \lesssim h. \end{aligned}$$
(3.20)

Now, we are ready to define a VEM for solving problem (3.16) as follows.

$$\begin{aligned} {\varvec{u}}_h \in {\varvec{V}}_h,\quad a_h({\varvec{u}}_h,{\varvec{v}}_h-{\varvec{u}}_h)+j_h({\varvec{v}}_h)-j_h({\varvec{u}}_h) \ge \langle {\varvec{f}}_h,{\varvec{v}}_h-{\varvec{u}}_h\rangle \quad \forall \, {\varvec{v}}_h \in {\varvec{V}}_h. \end{aligned}$$
(3.21)

In the error estimation for the above numerical method, we will assume the solution regularity \({\varvec{u}}\in {\varvec{H}}^2(\Omega )\). Then for \({\varvec{\sigma }}=\mathbb {C}{\varvec{\varepsilon }}\), we have \({\varvec{\sigma }}\in H^1(\Omega )^{2\times 2}\) and it follows that \({\varvec{\sigma }}{\varvec{\nu }}\in {\varvec{L}}^2(\partial \Omega )\). Following the arguments in [32, Section 8.1], we know that the solution \({\varvec{u}}\in {\varvec{V}}\) of (3.16) satisfies the pointwise relations

$$\begin{aligned}&-{\mathrm{div}}{\varvec{\sigma }}={\varvec{f}}_1\quad \mathrm{a.e.\ in}\ \Omega , \end{aligned}$$
(3.22)
$$\begin{aligned}&{\varvec{\sigma {\nu }}}={\varvec{f}}_2\quad \mathrm{a.e.\ on}\ \Gamma _N. \end{aligned}$$
(3.23)

Theorem 3.2

Assume that \({\varvec{f}}\in {\varvec{L}}^2(\Omega )\), and the exact solution \({\varvec{u}}\) of problem (3.16) belong to \({\varvec{H}}^2(\Omega )\) and \(\varvec{u}_{|\Gamma _i}\in \varvec{H}^2(\Gamma _i)\), with \(\{\Gamma _i\}_{i=1}^{n_0}\) being the edges of \(\partial \Omega \) on \(\Gamma _C\). Moreover, the tangential component \(\varvec{u}_{\varvec{\tau }}\) of \({\varvec{u}}\) changes its sign only finite many times on \(\Gamma _C\). Let \(a_h(\cdot ,\cdot )\) be given in [4], \(f_h(\cdot )\) by (3.19) and \(j_h(\cdot )\) by (3.18), respectively. Then

$$\begin{aligned} \Vert {\varvec{u}}-{\varvec{u}}_h\Vert _{{\varvec{V}}}\lesssim h, \end{aligned}$$

where \({\varvec{u}}_h\) is the solution of the method (3.21).

Proof

It is shown in [4, 9, 15, 19] that the above bilinear form \(a_h(\cdot ,\cdot )\) satisfies conditions (2.4) and (2.5). Since the mesh satisfies the Assumption A, Assumption A1 holds by the classical Scott–Dupont theory (cf. [14]). Moreover, there exists a nodal interpolation operator \(I_K\): \(H^2(K)\rightarrow V_1(K)\) such that Assumption A2 holds (cf. [14, 19]). We further write \({\varvec{I}} _K\) as its vectorized analog, and for \({\varvec{v}}\in {\varvec{H}}^2(\Omega )\) write its global interpolant as \({\varvec{v}}_I\). It is easy to check that if \({\varvec{v}}\in {\varvec{V}}\), \({\varvec{v}}_I\in {\varvec{V}}_h\). Thus Assumption A2 also holds. Moreover, it is easy to show that

$$\begin{aligned} j_h({\varvec{v}})\ge j({\varvec{v}})\quad \forall \, {\varvec{v}}\in {\varvec{V}}_h. \end{aligned}$$

Hence, we can bound the error of \({\varvec{u}}_h\) using Theorem 2.2.

Using an argument similar to that in [30] (see also the proof of Theorem 3.1), we have

$$\begin{aligned} |j_h({\varvec{u}}_I)-j({\varvec{u}})|\lesssim h^2. \end{aligned}$$
(3.24)

Using the definitions (3.13) and (3.14), the boundary conditions \({\varvec{u}}_I={\varvec{u}}={\varvec{0}}\) on \(\Gamma _D\), \(u_{I\nu }=u_\nu =0\) on \(\Gamma _C\), and the pointwise relations (3.22)–(3.23), we have by integration by parts that

$$\begin{aligned} a({\varvec{u}},{\varvec{u}}_I-{\varvec{u}})-f({\varvec{u}}_I-{\varvec{u}})=\int _{\Gamma _C}{\varvec{\sigma }}_{{\varvec{\tau }}}\cdot (({\varvec{u}}_{\varvec{\tau }})_I-{\varvec{u}}_{\varvec{\tau }})\,ds. \end{aligned}$$

Hence, by the error estimates for the nodal interpolation operator of Lagrange elements, we immediately know

$$\begin{aligned} |a({\varvec{u}},{\varvec{u}}_I - {\varvec{u}})-f({\varvec{u}}_I - {\varvec{u}})|\lesssim h^2. \end{aligned}$$
(3.25)

Finally, inserting the estimates (3.20), (3.24) and (3.25) into (2.18) and (2.19), we obtain the desired estimate. \(\square \)

4 Numerical Solution of the Discrete Problem

We now discuss efficient algorithms for solving the discrete Problem 2.3, in particular, the problems (3.9) and (3.21). As far as we know, there are two classes of algorithms in the literature for Problem 2.3. The first class of algorithms is based on the primal-dual (or the dual) formulation of Problem 2.3, with the dual problem solved by fixed-point algorithms (cf. [21, 28, 29] and the references therein). It is noted that such algorithms converge only linearly. The second class of algorithms depends on reformulating the original problem as a minimization problem which is solved by feasible optimization methods (cf. [20, 24, 28, 29, 41]). Due to the rapid development of non-smooth convex programming in the past two decades, the latter one seems more attractive. In particular, a regularized semi-smooth Newton method with projection steps is proposed in [41] for solving a class of non-smooth convex problems. The method enjoys a remarkable advantage that it may be superlinearly convergent if a convex problem satisfies some reasonable conditions. Here, we will discuss how to use the method for solving the discrete problems (3.9) and (3.21).

Define

$$\begin{aligned} E(v_h)=\frac{1}{2}a_h(v_h,v_h)-f_h(v_h)+j_h(v_h). \end{aligned}$$

It is well known that Problem 2.3 is equivalent to the following minimization problem.

Find \(u_h\in V_h\) such that

$$\begin{aligned} E(u_h)=\inf \left\{ E(v_h)\mid v_h\in V_h\right\} . \end{aligned}$$

With the above reformulation in mind, after expressing the numerical solution in terms of shape basis functions of a given virtual element space, we can find after a direct manipulation that the minimization problem to either (3.9) or (3.21) can be expressed in the following form:

$$\begin{aligned} \mathop {\min } \limits _{{\varvec{v}} \in \mathbb {R}^{N}} \frac{1}{2}{\varvec{{v}}}^T{\varvec{Av}} - {{\varvec{b}}}^T{{\varvec{v}}}+ {{\varvec{w}}^T}|{\varvec{v}}|, \end{aligned}$$
(4.1)

where \(|{\varvec{v}}|\) denotes a new vector formed by taking the absolute value for each entry of \({\varvec{v}}\), \({\varvec{A}}=(a_{ij})_{N\times N}\), with \({a_{ij}} = a_h(\phi _i,\phi _j)\), is the stiffness matrix, and \({\varvec{b}}=(b_i)_{N\times 1}\), with \(b_i=f_h(\phi _i)\), is the load vector. Here, \(\{\phi _i\}_{i=1}^N\) are the shape basis functions of \(V_h\) or \({\varvec{V}}_h\). Moreover, we number these functions such that the first \(N_1\) ones correspond to the nodes on \(\Gamma _C\). We remark that when dealing with the numerical solution of problem (3.21), we assume each edge of \(\Omega \) on \(\Gamma _C\) is parallel to the coordinate axis.

Next, write the vector \({\varvec{v}}\) in block form as \({\varvec{v}}=(({\varvec{v}}_1)^T,({\varvec{v}}_2)^T)^T\) with \({\varvec{v}}_1\in \mathbb {R}^{N_1}\). Similarly,

$$\begin{aligned} {\varvec{A}}:=\begin{pmatrix} {\varvec{A}}_{11}&{}{\varvec{A}}_{12}\\ {\varvec{A}}_{12}^{T}&{}{\varvec{A}}_{22} \end{pmatrix}, \quad {\varvec{b}}:=\begin{pmatrix} {\varvec{b}}_1\\ {\varvec{b}}_2 \end{pmatrix}, \quad {\varvec{w}}:=\begin{pmatrix} {\varvec{w}}_1\\ {\varvec{0}} \end{pmatrix}, \end{aligned}$$
(4.2)

Then the problem (4.1) can be recast as follows: Find \({\varvec{v}}_1^*\in \mathbb {R}^{N_1}\) and \({\varvec{v}}_2^ *\in \mathbb {R}^{N_2}\) such that

$$\begin{aligned} {\varvec{F}}({\varvec{v}}_1^*,{\varvec{v}}_2^*)=\mathrm{inf}\left\{ {\varvec{F}}({\varvec{v}}_1,{\varvec{v}}_2) \mid {\varvec{v}}_1 \in \mathbb {R}^{N_1}, {\varvec{v}}_2 \in \mathbb {R}^{N_2}\right\} , \end{aligned}$$
(4.3)

where \(N_2=N-N_1\), and

$$\begin{aligned} {\varvec{F}} ({{\varvec{v}}_1},{{\varvec{v}}_2}) = \frac{1}{2}({\varvec{v}}_1^T{\varvec{A}}_{11}{{\varvec{v}}_1} + 2{\varvec{v}}_1^T{{\varvec{A}}_{12}}{{\varvec{v}}_2} + {\varvec{v}}_2^T{{\varvec{A}}_{22}}{{\varvec{v}}_2}) - {\varvec{b}}_1^T{\varvec{v}}_1 -{\varvec{b}}_2^T{{\varvec{v}}_2} + {\varvec{w}}_1^T|{\varvec{v}}_1|. \end{aligned}$$

For the above optimization problem, from the first order condition with respect to \({\varvec{v}}_2\) it follows that

$$\begin{aligned} {\varvec{0}}={\varvec{A}}_{12}^{T}\varvec{v}_1^{*}+{\varvec{A}}_{22}{\varvec{v}}_2^{*}-{\varvec{b}}_2, \end{aligned}$$

which implies

$$\begin{aligned} {\varvec{v}}_2^{*}={\varvec{A}}_{22}^{-1}({\varvec{b}}_2-{\varvec{A}}_{12}^T{\varvec{v}}_1^{*}) . \end{aligned}$$
(4.4)

Inserting this into (4.3), we find the above optimization problem can be expressed as follows.

Find \({\varvec{v}}_1^* \in \mathbb {R}^{N_1}\) such that

$$\begin{aligned} \widetilde{{\varvec{F}}}({\varvec{v}}_1^*)=\inf \left\{ \widetilde{{\varvec{F}}}({\varvec{v}}_1)\mid {\varvec{v}}_1\in \mathbb {R}^{N_1}\right\} , \end{aligned}$$
(4.5)

where

$$\begin{aligned} \widetilde{{\varvec{F}}}({\varvec{v}}_1)=\frac{1}{2}{\varvec{v}}_1^T\widetilde{{\varvec{A}}_1}{{\varvec{v}}_1} -\widetilde{{\varvec{b}}_1}^T{\varvec{v}}_1 +{{\varvec{w}}_1^T}|{\varvec{v}}_1|, \end{aligned}$$

with

$$\begin{aligned} \widetilde{{\varvec{A}}_1} = {\varvec{A}}_{11}-{\varvec{A}}_{12}{\varvec{A}}_{22}^{-1}{\varvec{A}}_{12}^T, \quad \widetilde{{\varvec{b}}_1}= {{\varvec{b}}_1} - {{\varvec{A}}_{12}}{\varvec{A}}_{22}^{-1}{{\varvec{b}}_2}. \end{aligned}$$

The problem (4.5) can be described as the following separable convex optimization problem:

$$\begin{aligned} \min _{{\varvec{x}}} f({\varvec{x}})+g({\varvec{x}}), \end{aligned}$$
(4.6)

where

$$\begin{aligned} f({\varvec{x}}):={\varvec{w}}_1^{T}|{\varvec{x}}|,\quad g({\varvec{x}}):=\frac{1}{2}{\varvec{x}}^T\widetilde{{\varvec{A}}_1}{{\varvec{x}}} -\widetilde{{\varvec{b}}_1}^T{\varvec{x}}. \end{aligned}$$

We next show how to use the algorithm in [41] for working out the solution of (4.6).

First, it is evident that the minimizer of (4.5) satisfies the following fixed-point formulation (cf. [24]):

$$\begin{aligned} {\varvec{x}}={\varvec{T}}({\varvec{x}}):={{{\,\mathrm{\mathbf{prox}}\,}}}_{tf}({\varvec{x}}-t(\widetilde{{\varvec{A}}_1}{{\varvec{x}}}-\widetilde{{\varvec{b}}_1})), \end{aligned}$$

where \(t>0\) is a constant, and \({{\,\mathrm{\mathbf{prox}}\,}}\) is the usual proximal operator (cf. [24]). The proximal operator corresponding to \(f({\varvec{x}})\) is the shrinkage operator defined as

$$\begin{aligned} ({{\,\mathrm{\mathbf{prox}}\,}}_{tf}({\varvec{x}}))_i={\mathrm{sign}}(x_i)\max (|x_i|-w_it,0). \end{aligned}$$

Then we solve the equation

$$\begin{aligned} {\varvec{F}}({\varvec{x}})={\varvec{x}}-{\varvec{T}} ({\varvec{x}})={\varvec{0}} \end{aligned}$$
(4.7)

by using the algorithms in [41]. The generalized Jacobian matrix of \({\varvec{F}}({\varvec{x}})\) is

$$\begin{aligned} {\varvec{J}}({\varvec{x}})={\varvec{I}}-{\varvec{M}}({\varvec{x}})({\varvec{I}}-t\widetilde{{\varvec{A}}_1}), \end{aligned}$$

where \({\varvec{M}}({\varvec{x}})\in \partial {{\,\mathrm{\mathbf{prox}}\,}}_{tf}({\varvec{x}}-t\,(\widetilde{{\varvec{A}}_1}{{\varvec{x}}}-\widetilde{{\varvec{b}}_1}))\) with \(\partial \) denoting Clarke’s generalized Jacobian (cf. [23, 41]), and \(\widetilde{{\varvec{A}}_1}\) is the Hessian matrix of \(g({\varvec{x}})\). Since \(\widetilde{{\varvec{A}}_1}\) is symmetric positive definite, we can choose \({\varvec{M}}({\varvec{x}})\) as a diagonal matrix with the i-th diagonal entry given by

$$\begin{aligned} ( {\varvec{M}}({\varvec{x}}))_{ii}=\left\{ \begin{aligned} 1&\quad \mathrm{if}\ |({\varvec{x}}-t(\widetilde{{\varvec{A}}_1}{{\varvec{x}}}-\widetilde{{\varvec{b}}_1}))_i|> w_i t,\\ 0&\quad \mathrm{otherwise}. \end{aligned} \right. \end{aligned}$$

As in [41], introduce index sets

$$\begin{aligned}&\mathcal {I}:=\{1\le i\le N_{1}\mid |({\varvec{x}}-t(\widetilde{{\varvec{A}}_1}{{\varvec{x}}}-\widetilde{{\varvec{b}}_1}))_i|>t w_i\} =\{1\le i\le N_{1}\mid ({\varvec{M}}({\varvec{x}}))_{ii}=1\},\\&\mathcal {J}:=\{1\le i\le N_{1}\mid |({\varvec{x}}-t(\widetilde{{\varvec{A}}_1}{{\varvec{x}}}-\widetilde{{\varvec{b}}_1}))_i|\le t w_i\} =\{1\le i\le N_{1}\mid ({\varvec{M}}({\varvec{x}}))_{ii}=0\}. \end{aligned}$$

Then partition the matrix \(\widetilde{{\varvec{A}}_1}\) as a 2-by-2 block matrix in terms of the above partition of the index set \(\{1,2,\cdots ,N_1\}\). Then we have

$$\begin{aligned} {\varvec{J}}({\varvec{x}})=\begin{pmatrix} t\,(\widetilde{{\varvec{A}}_1})_{\mathcal {I}\mathcal {I}}&{}t\,(\widetilde{{\varvec{A}}_1})_{\mathcal {I}\mathcal {J}}\\ {\varvec{0}}&{}I \end{pmatrix}. \end{aligned}$$

For an iterate \({\varvec{x}}^k\), by choosing an element \({\varvec{J}}_k\), a regularized Newton method for solving (4.7) is given as follows. Seek the incremental \({\varvec{d}}_k\) by solving the regularized equations

$$\begin{aligned} ({\varvec{J}}_k+\mu _k {\varvec{I}}){\varvec{d}}_k=-{\varvec{F}}^k, \end{aligned}$$

where \({\varvec{F}}^k={\varvec{F}}({\varvec{x}}^k)\), \(\mu _k=\lambda _k\Vert {\varvec{F}}^k\Vert _2\) and \(\lambda _k>0\) is a regularization parameter. For our problem under discussion, the above equations are equivalent to

$$\begin{aligned} \begin{aligned}&{\varvec{d}}_{\mathcal {J}}=-\frac{1}{1+\mu _k} {\varvec{F}}_{\mathcal {J}},\\&\left( t\,(\widetilde{{\varvec{A}}_1})_{\mathcal {I}\mathcal {I}}+\mu _k {\varvec{I}}\right) {\varvec{d}}_{\mathcal {I}} =-{\varvec{F}}_{\mathcal {I}}-t\,(\widetilde{{\varvec{A}}_1})_{\mathcal {I}\mathcal {J}}, \end{aligned} \end{aligned}$$
(4.8)

from which we derive the incremental \({\varvec{d}}_k=({\varvec{d}}_{\mathcal {I}}^T, {\varvec{d}}_\mathcal {J}^T)^T\) and the update \({\varvec{u}}^k={\varvec{x}}^k+{\varvec{d}}_k\). Then \({\varvec{x}}^{k+1}\) is obtained from (3.5)–(3.9) in [41].

Now, we can use the ASSN algorithm ( [41, p. 372]) to solve (4.5). Since \({\varvec{F}}\) is monotone and \({\varvec{T}}\) is 1-averaged, the method is always convergent by Theorem 3.4 and the comment at the end of Section 3.2 in [41]. Moreover, since \(\widetilde{{\varvec{A}}_1}\) is symmetric and positive definite, \({\varvec{F}}\) is semi-smooth and BD-regular, and the method may be superlinearly convergent ( [41, Theorem 3.6]).

Remark 4.1

One main cost of the ASSN algorithm is to solve the linear system in (4.8). Since the coefficient matrix is symmetric and positive definite, we can use the usual conjugate gradient method, which involves the multiplication of \(t\,(\widetilde{{\varvec{A}}_1})_{\mathcal {I}\mathcal {I}}+\mu _k {\varvec{I}}\) by a vector. Hence, we do not need to form the matrix \(\widetilde{{\varvec{A}}_1}\) explicitly, and we only need to solve a linear system with the coefficient matrix \({\varvec{A}}_{22}\).

5 Numerical Experiments

In this section, we report several numerical examples to illustrate the performance of the two methods (3.9) and (3.21). In our numerical simulation, the polygonal meshes are produced by an algorithm discussed in [37] and the codes are written based on the program described in [36]. The meshes with the element numbers \(N=64\) and \(N=256\) for the unit square are displayed in Fig. 2.

Fig. 2
figure 2

The left is a polygon mesh for \(N=64\), and the right one corresponds to \(N=256\)

5.1 Two numerical Examples on the Simplified Friction Problem

Example 1

The related data for the problem (3.2) is \(\Omega =(0,1)\times (0,1)\), \(\mu =1\), \(g=1\), \(\Gamma _C=\{1\}\times [0,1]\). We take the exact solution as

$$\begin{aligned} u(x,y)=(\sin x-x \sin 1)\sin (2\pi y),\quad (x,y) \in \Omega . \end{aligned}$$

The corresponding source term is

$$\begin{aligned} f(x,y) = \left[ (2 + 4{\pi ^2})\,\sin x- (1 + 4{\pi ^2})\,x\,\sin 1\right] \,\sin (2\pi y). \end{aligned}$$

It is easy to check that

$$\begin{aligned} |u-u_h|_{1,\Omega }= & {} \left( \sum _{K\in \mathcal T_h}|u-u_h|^2_{1,K}\right) ^{1/2} \lesssim \left( \sum _{K\in \mathcal T_h}|u-\Pi _1^\nabla u|_{1,K}^2\right) ^{1/2}\\&+ \left( \sum _{K\in \mathcal T_h}|\Pi _1^\nabla u-u_h|_{1,K}^2\right) ^{1/2}. \end{aligned}$$

Moreover, by the norm equivalence in [19], for all \(u\in H^2(\Omega )\), \(|\Pi _1^\nabla u-u_h|_{1,K}\) is equivalent to

$$\begin{aligned} S_K(\Pi _1^\nabla u-u_h) =\left( \sum \limits _{i=1}^{n^K}(\chi _i^K(\Pi _1^\nabla u-u_h))^2\right) ^{1/2}, \end{aligned}$$

where \(\{\chi _i^K(\cdot )\}\) denote the function values of a function at the vertices of K, which is a computational quantity with respect to the degrees of freedom of \(V_h\). Hence, we define a quantity to measure the error of \(u_h\) by

$$\begin{aligned} \Vert u-u_h\Vert _{1*,\Omega }= \left( \sum _{K\in \mathcal T_h}|u-\Pi _1^\nabla u|_{1,K}^2\right) ^{1/2} + \left( \sum _{K\in \mathcal T_h}S_K(\Pi _1^\nabla u-u_h)^2\right) ^{1/2}, \end{aligned}$$
(5.1)

which is again a computable term with respect to the degrees of freedom of \(V_h\). By the above argument, it is clear that

$$\begin{aligned} |u-u_h|_{1,\Omega }\lesssim \Vert u-u_h\Vert _{1*,\Omega }. \end{aligned}$$
(5.2)

Moreover, we define the maximum error of \(u_h\) by

$$\begin{aligned} \Vert u-u_h\Vert _{\infty }=\mathop {\max } \limits _{i}|u(P_i)-u_h(P_i)|, \end{aligned}$$

where \(P_i\) is the i-th node of the mesh \(\mathcal {T}_h\).

We apply the virtual element method (3.9) for discretization, and obtain the discrete solution by solving the related optimization problem using the ASSN algorithm in [41], with the related parameters taken by

$$\begin{aligned} \tau =\eta _1=1/2,\ \nu =\eta _2=3/4,\ \gamma _1=2,\ \gamma _2=3,\ t=1/\lambda _{\mathrm{max}}(\mathrm{eigs}(\widetilde{{\varvec{A}}_1})),\ \alpha =1,\ \beta =1/2,\nonumber \\ \end{aligned}$$
(5.3)

respectively.

Let

$$\begin{aligned} Err:=\Vert u-u_h\Vert _{1*,\Omega },\quad err:=\frac{\Vert u-u_{h}\Vert _{1*,\Omega }}{h}. \end{aligned}$$

Recall that h is the mesh size of a mesh \(\mathcal T_h\). Generally speaking, h is proportional to \(N^{-1/2}\), where N is the total number of elements in the mesh. The errors of the numerical solutions for different meshes in the sense of (5.1) and \(L^{\infty }\)-norm are given in Tables 1 and 2, respectively.

Table 1 The errors for different meshes in the sense of (5.1)
Table 2 \(L^{\infty }\)-norm error for different meshes
Fig. 3
figure 3

The numerical solution related to different numbers of elements: \(N=64\) (upper left), \(N=256\) (upper right), \(N=1024\) (bottom left) and \(N=4096\) (bottom right)

From these numerical results, we may conclude that this virtual element method performs well and there holds \(|u-u_{I}|_{1,\Omega }\lesssim h\), matching with the theoretical result in Theorem 3.1.

Example 2

In this example, the problem is to find \(u\in H^1(\Omega )\) such that

$$\begin{aligned}&\int _\Omega \left[ \nabla u \cdot \nabla (v - u) + u(v - u)\right] dx + g \int _\Gamma |v|\,ds - g\int _\Gamma |u|\,ds \\&\quad \ge \int _\Omega f\,(v - u)\,dx\quad \forall \,v \in H^1(\Omega ). \end{aligned}$$

Here, \(\Omega =(0,1) \times (0,1)\), \(\Gamma _C=\partial \Omega =\Gamma \), \(g = 1\) and \(f = -\Delta w + w\) with

$$\begin{aligned} w(x,y) = \sin (\pi x)\,\sin (\pi y). \end{aligned}$$

Similar to the previous example, we solve the discrete problem by using the ASSN algorithm with the parameters given in (5.3).

The numerical solutions corresponding to several meshes with \(N=64\), \(N=256\), \(N=1024\), \(N=4096\) are displayed in Fig. 3, respectively. A convergence trend is evident for the numerical solutions as N increases.

Since we do not have a closed-form solution for this problem, we show in Fig. 4 the numerical solutions on the boundary \(\{1\}\times [0,1]\) for further numerical evidence of convergence.

Fig. 4
figure 4

Numerical solution on \(\{1\}\times [0,1]\) for several meshes

5.2 An Example of Frictional Contact Problem

Now, we show the performance of the numerical method (3.21) for solving the frictional contact problem (3.12) through an example taken from [13].

Example 3

The domain \(\Omega =(0,4)\times (0,4)\) is the cross section of a three-dimensional linearly elastic body and the plane stress condition is imposed. The boundary \(\partial \Omega \) is decomposed into three parts: \(\Gamma _D= \{4\} \times (0, 4)\) where the body is clamped, \(\Gamma _C= (0,4) \times \{0\}\) where frictional contact takes place, and the remaining part \(\Gamma _N = (\{0\} \times (0, 4))\cup ((0, 4) \times \{4\})\) for traction boundary condition. The elasticity tensor \(\mathbb {C}\) is given by

$$\begin{aligned} ({\mathbb {C}{\varvec{\varepsilon }}})_{ij}=\frac{E\nu }{1-\nu ^2}(\varepsilon _{11}+\varepsilon _{22})\delta _{ij} +\frac{E}{1+\nu }\varepsilon _{ij},\quad 1\le i,j\le 2, \end{aligned}$$

where E is the Young modulus, \(\nu \) is the Poisson ratio of the material and \(\delta _{ij}\) is the Kronecker delta. We use the following data:

$$\begin{aligned}&E=2000\,\mathrm{daN/mm^2},\quad \nu =0.4,\quad g=450\; \mathrm{daN/mm^2},\\&{\varvec{f}}_1=(0,0)^T\,\mathrm{daN/mm^2},\quad {\varvec{f}}_2(x,y)=\left\{ \begin{array}{ll} \left( 200(5-y),-200\right) ^T\,\mathrm{daN/mm^2}\quad \mathrm{on}\ \{0\} \times (0, 4),\\ {\varvec{0}}\quad \mathrm{on}\ (0, 4) \times \{4\}. \end{array} \right. \end{aligned}$$

We apply the ASSN algorithm to solve the discrete problem (3.21) with the parameters specified in (5.3). The numerical solution with several values of N is shown in Fig. 5; a convergence trend is evident as N increases.

Fig. 5
figure 5

The numerical solution for several values of N

In Fig. 6, we plot the numerical solution along tangential direction on the boundary \([0,4]\times \{0\}\); a similar convergence trend is clearly observed.

Fig. 6
figure 6

The numerical solution along tangential direction on \([0,4]\times \{0\}\) for several values of N