1 Introduction

Let \({\varOmega }\) be a bounded polyhedral domain in \(\mathbb {R}^n\) and \({\fancyscript{T}}\) a fixed simplicial triangulation of \({\varOmega }\). That is, \({\fancyscript{T}}\) consists of \(n\)-simplexes whose union is the closure of \({\varOmega }\), and the intersection of any two simplexes is either empty or a common subsimplex of each. The purpose of this paper is to construct a decomposition of scalar functions on \({\varOmega }\) into a sum of functions with local support with respect to the triangulation \({\fancyscript{T}}\). The decomposition is defined by a linear map \({\fancyscript{B}}= {\fancyscript{B}}_{{\fancyscript{T}}}\), referred to as the bubble transform, which maps the Sobolev space \(H^1({\varOmega })\) boundedly into a sum of local spaces of the form \(\mathring{H}^1({\varOmega }_f)\), where \(f\) runs over all the subsimplexes of \({\fancyscript{T}}\) and \({\varOmega }_f\) denotes appropriate macroelements associated with \(f\). Here, the space \(\mathring{H}^1({\varOmega }_f)\) consists of all functions in \(H^1({\varOmega }_f)\) that vanish on the part of the boundary of \({\varOmega }_f\) that lies in the interior of \({\varOmega }\). The map \({\fancyscript{B}}\) is composed of local maps \(B_f\) such that any \(u \in H^1({\varOmega })\) admits the decomposition

$$\begin{aligned} u = \sum _{f} B_f u. \end{aligned}$$

The maps \(B_f : H^1({\varOmega }) \rightarrow \mathring{H}^1({\varOmega }_f)\) are local and bounded linear maps with the property that for all values of \(r\ge 1\), if \(u\) is a continuous piecewise polynomial of degree at most \(r\) with respect to the triangulation \({\fancyscript{T}}\), then \(B_f u\) is a continuous piecewise polynomial of degree at most \(r\) with respect to the restriction of the triangulation to \({\varOmega }_f\). Thus, the map \({\fancyscript{B}}\) is independent of a particular polynomial degree \(r\) and so does not depend on a particular finite element space.

To motivate the construction of the bubble transform, let us recall that the construction of projection operators is a key tool for deriving stability results and convergence estimates for various finite element methods. In particular, for the analysis of mixed finite element methods, projection operators which commute with differential operators have been a central feature since the beginning of such analysis, cf. [7, 8]. Another setting where such operators would potentially be very useful, but are hard to construct, is the analysis of the so-called \(p\)-version of the finite element method, i.e., the setting where we are interested in convergence properties as the polynomial degree of the finite element spaces increases. For such investigations, the construction of projection operators which admit uniform bounds with respect to polynomial degree represents a main challenge. In fact, so far, such constructions have appeared to be substantially more difficult than the more standard analysis of the finite element method, where the focus is on convergence with respect to mesh refinement.

Pioneering results on the convergence of the \(p\)-method applied to second-order elliptic problems in two space dimensions were derived by Babuška and Suri [4]. An important ingredient in their analysis was the construction of a polynomial preserving extension operator. A generalization of the construction to three space dimensions in the tetrahedral case can be found in [20], while \(hp\)-stable quasi-interpolation in the case of low regularity is studied in [19]. The importance of polynomial preserving extension operators for the Maxwell equations was argued in [10]. Further developments of commuting extension operators for the de Rham complex in three space dimensions are, for example, presented in [11–14]. These constructions have been used to establish a number of convergence results for the \(p\)-method, not only for boundary value problems, but also for eigenvalue problems [6]. A crucial step in this analysis is the use of so-called projection-based interpolation operators, cf. [5, Chapter 3] and [10, 11, 17]. However, this development has not led to local projection operators which are uniformly bounded in the appropriate Sobolev norms. Some extra regularity seems to be necessary, cf. [6, Section 6] or [17, Section 4], and as a consequence, the theory for the \(p\)-method is far more technical than the corresponding theory for the \(h\)-method. This complexity represents a main obstacle for generalizing the theory for the \(p\)-method in various directions. The bubble transform introduced in this paper represents a new tool, which will be useful to overcome some of these difficulties. In particular, the construction of projection operators onto the spaces of continuous piecewise polynomials, which are uniformly bounded in \(H^1\) with respect to the polynomial degree, is an immediate consequence.

In practical computations, improved accuracy is often achieved by combining increased polynomial degree and mesh refinement, an approach frequently referred to as the \(hp\)-finite element method. However, throughout this paper, we consider the triangulation \({\fancyscript{T}}\) to be fixed. We let \({\varDelta }_j({\fancyscript{T}})\) denote the set of subsimplexes of dimension \(j\) of the triangulation \({\fancyscript{T}}\), while

$$\begin{aligned} {\varDelta }({\fancyscript{T}}) = \bigcup _{j=0}^n {\varDelta }_j({\fancyscript{T}}) \end{aligned}$$

is the set of all subsimplexes. Correspondingly, if \(f \in {\varDelta }({\fancyscript{T}})\), then \({\varDelta }(f)\) denotes the set of subsimplexes of \(f\). We denote by \(W_r({\fancyscript{T}}) \subset H^1({\varOmega })\) the space of continuous piecewise polynomials of degree \(r\) with respect to the triangulation \({\fancyscript{T}}\) and recall that the spaces \(W_r({\fancyscript{T}})\) admit degrees of freedom of the form

$$\begin{aligned} \int _f u \, \eta , \quad \eta \in {\fancyscript{P}}_{r-1-\dim f}(f),\, f \in {\varDelta }({\fancyscript{T}}), \end{aligned} \tag{1.1}$$

where \({\fancyscript{P}}_j(f)\) denotes the set of polynomials of degree \(j\) on \(f\). These degrees of freedom uniquely determine an element in \(W_r({\fancyscript{T}})\). In fact, the degrees of freedom associated with a given simplex \(f \in {\varDelta }({\fancyscript{T}})\) uniquely determine elements in \(\mathring{\fancyscript{P}}_r(f)\), the space of polynomials of degree \(r\) on \(f\) which vanish on the boundary \(\partial f\).
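As a sanity check on the degrees of freedom (1.1), the following sketch counts them on a single \(n\)-simplex and compares the total with the dimension of the full polynomial space of degree \(r\). It assumes that \({\fancyscript{P}}_j(f)\) is read as polynomials of degree at most \(j\) (empty for \(j<0\)), and the function names `dim_poly` and `dof_count_simplex` are illustrative only.

```python
from math import comb

def dim_poly(degree, d):
    """Dimension of the polynomials of degree <= degree in d variables."""
    return comb(degree + d, d) if degree >= 0 else 0

def dof_count_simplex(r, n):
    """Number of functionals (1.1) summed over all subsimplexes of one n-simplex.

    An n-simplex has comb(n + 1, j + 1) subsimplexes of dimension j, and each
    carries dim P_{r-1-j} functionals.
    """
    return sum(comb(n + 1, j + 1) * dim_poly(r - 1 - j, j) for j in range(n + 1))

# Cubic polynomials on a triangle: 3 vertex, 6 edge and 1 interior functional.
print(dof_count_simplex(3, 2), dim_poly(3, 2))
```

The two counts agree for every \(r \ge 1\), which is consistent with the statement that the functionals determine an element uniquely.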

For each \(f \in {\varDelta }({\fancyscript{T}})\), we let \({\varOmega }_f\) be the macroelement consisting of the union of the elements of \({\fancyscript{T}}\) containing \(f\), i.e.,

$$\begin{aligned} {\varOmega }_f = \bigcup \{T \, | \, T \in {\fancyscript{T}}, \, f \in {\varDelta }(T) \, \}, \end{aligned}$$

while \({\fancyscript{T}}_f\) is the restriction of the triangulation \({\fancyscript{T}}\) to \({\varOmega }_f\). Two such macroelements in the case of two space dimensions are illustrated below in Fig. 1.

Fig. 1 (a) Vertex macroelement. (b) Edge macroelement

It is a consequence of the properties of the degrees of freedom that for each \(f \in {\varDelta }({\fancyscript{T}})\), there exists an extension operator \(E_f : \mathring{\fancyscript{P}}_r(f) \rightarrow \mathring{W}_r({\fancyscript{T}}_f)\). Here, \(\mathring{W}_r({\fancyscript{T}}_f)\) consists of all functions in \(W_r({\fancyscript{T}}_f)\) which are identically zero on \({\varOmega }\setminus {\varOmega }_f\). Furthermore, the space \(W_r({\fancyscript{T}})\) can be represented by a direct sum,

$$\begin{aligned} W_r({\fancyscript{T}}) = \bigoplus _{f \in {\varDelta }({\fancyscript{T}})}E_f(\mathring{\fancyscript{P}}_r(f)). \end{aligned} \tag{1.2}$$

Here, the symbol \(\bigoplus \) has the interpretation of internal direct sum. However, in the rest of this paper, we will find it convenient to use this symbol to denote the external direct sum, which can be identified with the direct product. As a consequence,

$$\begin{aligned} \bigoplus _{f \in {\varDelta }({\fancyscript{T}})}E_f(\mathring{\fancyscript{P}}_r(f)) \subset \bigoplus _{f \in {\varDelta }({\fancyscript{T}})} \mathring{W}_r({\fancyscript{T}}_f) \subset \bigoplus _{f \in {\varDelta }({\fancyscript{T}})} \mathring{H}^1({\varOmega }_f). \end{aligned}$$

The extension operators \(E_f\) introduced above, defined from the degrees of freedom, will depend on the space \(W_r({\fancyscript{T}})\). In particular, they depend on the polynomial degree \(r\). However, it is a key observation that the macroelements \({\varOmega }_f\) only depend on the triangulation \({\fancyscript{T}}\), and not on \(r\). So for all \(r\), there exists a decomposition of the space \(W_r({\fancyscript{T}})\) of the form (1.2), i.e., into a sum of subspaces of \(\mathring{W}_r({\fancyscript{T}}_f)\). Furthermore, the geometric structure of these decompositions, represented by the simplexes \(f \in {\varDelta }({\fancyscript{T}})\) and the associated macroelements \({\varOmega }_f\), is independent of \(r\), and this indicates that a corresponding decomposition may also exist for the space \(H^1({\varOmega })\) itself. More precisely, the ansatz is a decomposition of any \(u \in H^1({\varOmega })\) of the form \(u = \sum _f u_f\), where \(u_f \in \mathring{H}^1({\varOmega }_f)\). The bubble transform, \({\fancyscript{B}}= {\fancyscript{B}}_{{\fancyscript{T}}}\), which we will introduce below, produces such a decomposition. As noted above, the transform is a bounded linear operator

$$\begin{aligned} {\fancyscript{B}}: H^1({\varOmega }) \rightarrow \bigoplus _{f \in {\varDelta }({\fancyscript{T}})} \mathring{H}^1({\varOmega }_f) \end{aligned}$$

that preserves the piecewise polynomial spaces in the sense that if \(u \in W_r({\fancyscript{T}})\), then each component of the transform, \(u_f= B_fu\), is in \(\mathring{W}_r({\fancyscript{T}}_f) \subset \mathring{H}^1({\varOmega }_f)\). In fact, \({\fancyscript{B}}\) is also bounded in \(L^2\). The transform depends on the given triangulation \({\fancyscript{T}}\), but there is no finite element space present in the construction.

We should note that once the transformation \({\fancyscript{B}}\) is shown to exist, the construction of local and uniformly bounded projections onto the spaces \(W_r({\fancyscript{T}})\), with a bound independent of \(r\), is straightforward. We just project each component \(B_fu \in \mathring{H}^1({\varOmega }_f)\) by a local projection into the subspace \(\mathring{W}_r({\fancyscript{T}}_f)\). Since each local projection can be chosen to have norm equal to one, the global operator mapping \(u\) to the local projections of \(B_fu\) will be bounded independently of the degree \(r\). Furthermore, this process will lead to a projection operator since the transform preserves continuous piecewise polynomials. In fact, there are similarities between the construction of projections just outlined, and the quasi-interpolation studied in [19], since both operators are constructed from components with local support. However, for the construction presented below, the local components \(B_fu\), produced by the bubble transform at “the continuous level,” are a key ingredient. In contrast, for the construction given in [19], the local components are computed directly from local projections of the function \(u\) into the given finite element space, in the spirit of the Clément operator, and these local projections depend on the polynomial degree. Therefore, in this case, \(p\)-stability can only be obtained by tracking the dependence on the polynomial degree.

In fact, unisolvent degrees of freedom, generalizing (1.1), exist for all the finite element spaces of differential forms, referred to as \({\fancyscript{P}}_r{\varLambda }^k({\fancyscript{T}})\) and \({\fancyscript{P}}_r^-{\varLambda }^k({\fancyscript{T}})\) and studied in [1, 3]. As long as the triangulation \({\fancyscript{T}}\) is fixed, all these spaces admit degrees of freedom with a common geometric structure, independent of the polynomial degree \(r\). Therefore, for all these spaces, there exist degrees of freedom generalizing (1.1) and local decompositions similar to (1.2). So far, these decompositions have been utilized to derive basis functions in the general setting, cf. [2], and to construct canonical, but unbounded, local projections [1, Section 5.2]. By combining these canonical projections with appropriate smoothing operators, bounded, but nonlocal projections which commute with the exterior derivative were also constructed in [9, 22] and [1, Section 5.4]. Furthermore, in [16] and [15], local decompositions and a double complex structure were the main tools to obtain local and bounded cochain projections for the spaces \({\fancyscript{P}}_r{\varLambda }^k({\fancyscript{T}})\) and \({\fancyscript{P}}_r^-{\varLambda }^k({\fancyscript{T}})\). However, none of the projections just described will admit bounds which are independent of the polynomial degree \(r\), while the construction of projections with such bounds is almost immediate from the properties of the bubble transform, cf. Sect. 4.3 below. Therefore, it is our ambition to generalize the construction of the bubble transform given below to differential forms in any dimension, such that the transform is bounded in the appropriate Sobolev norms, it commutes with the exterior derivative, and it preserves the finite element spaces \({\fancyscript{P}}_r{\varLambda }^k({\fancyscript{T}})\) and \({\fancyscript{P}}_r^-{\varLambda }^k({\fancyscript{T}})\). 
However, in the rest of this paper, we restrict the discussion to \(0\)-forms, i.e., to ordinary scalar valued functions defined on \({\varOmega }\subset \mathbb {R}^n\) and use the simpler notation \(W_r({\fancyscript{T}})\) rather than \({\fancyscript{P}}_r{\varLambda }^0({\fancyscript{T}}) = {\fancyscript{P}}_r^-{\varLambda }^0({\fancyscript{T}})\) to denote the piecewise polynomial space of degree \(\le r\) on \({\fancyscript{T}}\).

The present paper is organized as follows. In Sect. 2, we present the main properties of the transform and introduce some useful notation. The key tools needed for the construction are introduced in Sect. 3. In particular, for any \(f \in {\varDelta }({\fancyscript{T}})\), we introduce a local average operator, \(A_f\), which is used to obtain local approximations near \(f\). For any \(u \in L^2({\varOmega })\), the functions \(A_fu\) are smooth away from \(f\), and a Hardy-type inequality, cf. [21], is used to characterize the error of the approximation (cf. Lemma 3.4). The main results of the paper are derived in Sect. 4, where the Hardy-type estimates are used as a fundamental tool to show that the components \(B_fu\) are elements of \(\mathring{H}^1({\varOmega }_f)\) (cf. Lemma 4.3). However, the verification of some of the more technical estimates is delayed until Sect. 5.

2 Preliminaries

We will use \(H^1({\varOmega })\) to denote the Sobolev space of all functions in \(L^2({\varOmega })\), which also have the components of the gradient in \(L^2\), and \(\Vert \cdot \Vert _1\) is the corresponding norm. If \({\varOmega }' \subset {\varOmega }\), then \(\Vert \cdot \Vert _{1,{\varOmega }'}\) denotes the \(H^1\) norm with respect to \({\varOmega }'\). The corresponding notation for the \(L^2\)-norms is \(\Vert \cdot \Vert _0\) and \(\Vert \cdot \Vert _{0,{\varOmega }'}\). Furthermore, if \({\varOmega }_f\) is a macroelement associated with \(f \in {\varDelta }({\fancyscript{T}})\), then

$$\begin{aligned} \mathring{H}^1({\varOmega }_f) = \{ v \in H^1({\varOmega }_f)\, | \, \mathring{E}_fv \in H^1({\varOmega }) \, \}, \end{aligned}$$

where \(\mathring{E}_f : L^2({\varOmega }_f) \rightarrow L^2({\varOmega })\) denotes the extension by zero outside \({\varOmega }_f\). In addition to the macroelements \({\varOmega }_f\), we also introduce the extended macroelements, \({\varOmega }_f^e\), given by \({\varOmega }_f^e = \cup \{{\varOmega }_g \, | \, g \in {\varDelta }_0(f) \, \}\) (Fig. 2).

Fig. 2 The extended macroelement \({\varOmega }_f^e\) for \(f=[{y}_0,{y}_1]\) and \(n=2\)

It is a simple observation that if \(g \in {\varDelta }(f)\), then \({\varOmega }_g \supset {\varOmega }_f\), while \({\varOmega }_g^e \subset {\varOmega }_f^e\).

2.1 An Overview of the Construction

The construction of the transformation \({\fancyscript{B}}\) will be done inductively with respect to the dimension of \(f \in {\varDelta }({\fancyscript{T}})\). We are seeking a decomposition of the space \(H^1({\varOmega })\) with properties similar to (1.2). More precisely, we will establish that any function \(u \in H^1({\varOmega })\) can be decomposed into a sum, \(u = \sum _f u_f\), where each component \(u_f \in \mathring{H}^1({\varOmega }_f)\). The map \(u \mapsto u_f\) will be denoted \(B_f\), and the collection of all these maps can be seen as a linear transformation \({\fancyscript{B}}= {\fancyscript{B}}_{{\fancyscript{T}}} : H^1({\varOmega }) \rightarrow \bigoplus _{f \in {\varDelta }({\fancyscript{T}})}\mathring{H}^1({\varOmega }_f)\) with the following properties:

(i) \(u = \sum _f B_f u\), where the component map \(B_f\) is a local operator mapping \(H^1({\varOmega }_f^e)\) to \(\mathring{H}^1({\varOmega }_f)\).

(ii) \({\fancyscript{B}}\) is bounded, i.e., there is a constant \(c\), depending on the triangulation \({\fancyscript{T}}\), such that

$$\begin{aligned} \sum _f \Vert B_f u \Vert _{1,{\varOmega }_f}^2 \le c \Vert u \Vert _1^2, \quad u \in H^1({\varOmega }). \end{aligned}$$

(iii) \({\fancyscript{B}}\) preserves the piecewise polynomial spaces in the sense that

$$\begin{aligned} u \in W_r({\fancyscript{T}}) \Longrightarrow B_fu \in \mathring{W}_r({\fancyscript{T}}_f). \end{aligned}$$

In the special case when \(n=1\) and \({\varOmega }\) is an interval, say \({\varOmega }= (0,1)\), a transform with the above properties is easy to construct. In this case, \({\fancyscript{T}}\) is simply a partition of the form

$$\begin{aligned} 0= x_0 < x_1< \cdots <x_N = 1. \end{aligned}$$

The set \({\varDelta }_0({\fancyscript{T}})\) is the set of vertices \(\{x_j\}\), while \({\varDelta }_1({\fancyscript{T}})\) is the set of intervals of the form \((x_{j-1},x_j)\). If \(f = x_j \in {\varDelta }_0({\fancyscript{T}})\), then \({\varOmega }_f = (x_{j-1},x_{j+1})\), with an obvious modification near the boundary, while \({\varOmega }_f = f\) for \(f \in {\varDelta }_1({\fancyscript{T}})\). Let \(\lambda _i \in W_1({\fancyscript{T}})\) be the standard piecewise linear “hat functions,” characterized by \(\lambda _i(x_j) = \delta _{i,j}\). For all \(f = x_j\in {\varDelta }_0({\fancyscript{T}})\), we let \(B_f u = u(x_j)\lambda _j\). By construction, \(B_f u \) has support in \({\varOmega }_f\). Furthermore, the function

$$\begin{aligned} u^1 = u - \sum _{f \in {\varDelta }_0({\fancyscript{T}})} B_fu \end{aligned}$$

vanishes at all the vertices \(x_j\). If \(f =(x_{j-1},x_j) \in {\varDelta }_1({\fancyscript{T}})\), then \({\varOmega }_f = f\). Therefore, if for all \(f \in {\varDelta }_1({\fancyscript{T}})\), we let \(B_f u\) be equal to \(u^1\) on \(f\) and zero otherwise, then \(B_f u \in \mathring{H}^1({\varOmega }_f)\), and \(u = \sum _{f \in {\varDelta }({\fancyscript{T}})} B_fu\). In fact, it is straightforward to check that all the properties (i)–(iii) hold for this construction.
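The one-dimensional construction can be sketched in a few lines of Python. This is only an illustration under the assumption that \(u\) is available as a function with pointwise values; the names `hat` and `bubble_transform_1d` are hypothetical, and the components are returned as callables on \([0,1]\).

```python
import math

def hat(nodes, i, x):
    """Piecewise linear hat function lambda_i with lambda_i(x_j) = delta_ij."""
    xi = nodes[i]
    if i > 0 and nodes[i - 1] <= x <= xi:
        return (x - nodes[i - 1]) / (xi - nodes[i - 1])
    if i < len(nodes) - 1 and xi <= x <= nodes[i + 1]:
        return (nodes[i + 1] - x) / (nodes[i + 1] - xi)
    return 0.0

def bubble_transform_1d(u, nodes):
    """Return the vertex and interval components B_f u as callables."""
    def u1(x):
        # u^1 = u minus the sum of all vertex components
        return u(x) - sum(u(xj) * hat(nodes, j, x) for j, xj in enumerate(nodes))

    # Vertex f = x_j: B_f u = u(x_j) * lambda_j, supported in (x_{j-1}, x_{j+1}).
    vertex = [lambda x, j=j: u(nodes[j]) * hat(nodes, j, x) for j in range(len(nodes))]
    # Interval f = (x_{j-1}, x_j): B_f u equals u^1 on f and zero elsewhere.
    interval = [
        lambda x, j=j: u1(x) if nodes[j - 1] <= x <= nodes[j] else 0.0
        for j in range(1, len(nodes))
    ]
    return vertex, interval

nodes = [0.0, 0.3, 0.7, 1.0]
vertex, interval = bubble_transform_1d(math.exp, nodes)
x = 0.5
print(sum(B(x) for B in vertex + interval), math.exp(x))
```

Summing all components reproduces \(u\) up to rounding, and each interval component vanishes at both endpoints of its interval, in line with \(B_f u \in \mathring{H}^1({\varOmega }_f)\).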

In general, for \(n>1\), the restriction of \(u\) to a simplex \(f \in {\varDelta }({\fancyscript{T}})\), denoted \({\text {tr}}_f u\), may not be well defined for \(u \in H^1({\varOmega })\). Therefore, the simple construction above cannot be directly generalized to higher dimensions. For example, when \(f\) is the vertex \({x}_0\), to define \(B_f u\), we introduce the \(\lambda _0\)-weighted average of \(u\) given by

$$\begin{aligned} U({x}) = \frac{1}{|{\varOmega }_f|} \int _{{\varOmega }_f} u(\lambda _0({x}) {x}_0 + [1 - \lambda _0({x})] {y}) \, \hbox {d}{y}, \end{aligned}$$

where \(\lambda _0({x})\) is now the \(n\)-dimensional piecewise linear function equal to one at \({x}_0\) and zero at all other vertices. Note that if \(u\) is well defined at \({x}_0\), then \(U({x}_0) = u({x}_0)\), while if \({x}\in {\varOmega }\setminus {\varOmega }_f\), then \(U({x})\) is just the average of \(u\) over \({\varOmega }_f\). In general, for \({x}\ne {x}_0\), \(U({x})\) has pointwise values. Note that \(U({x})\) depends only on \(\lambda _0({x})\), so is constant on level sets of \(\lambda _0({x})\).

In fact, if we replace \(\lambda _0({x})\) by a variable \(\lambda \) taking values in \([0,1]\) in the definition of \(U({x})\) above, then we may view \(U\) as a function of \(\lambda \), which we will call \((A_f u)(\lambda )\). Hence, \((A_f u)(\lambda _0({x})) = U({x})\). It is easy to check that if \(u\) is a piecewise polynomial in \({x}\), then \(A_f u\) is a polynomial in \(\lambda \). Finally, if we define

$$\begin{aligned} (B_f u)({x}) = (A_f u)(\lambda _0({x})) - [1 - \lambda _0({x})] (A_f u)(0), \end{aligned} \tag{2.1}$$

then \(B_f u\) will have support on \({\varOmega }_f\). The averaging operator \(A_f\) just introduced is closely related to a corresponding operator introduced in [23], where it is referred to as the “spider-averaging operator.” However, a difference is that the operator in [23] is defined from averages with respect to level curves, while the present operator uses averages with respect to the domain bounded by the level curves (Fig. 3).

Fig. 3 The level set \(\lambda _0({x})=1/4\) in the macroelement \({\varOmega }_{x_0}\)
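To see formula (2.1) at work, here is a small worked example under simplifying assumptions that are not part of the text: we take \(n=1\), the vertex \({x}_0 = 0\), the macroelement \({\varOmega }_f = (-1,1)\), so that \(\lambda _0(x) = 1 - |x|\) on \({\varOmega }_f\), and the function \(u(x) = x^2\). Then

$$\begin{aligned} (A_f u)(\lambda ) = \frac{1}{2}\int _{-1}^{1} \big ([1-\lambda ]\, y\big )^2 \, \hbox {d}y = \frac{(1-\lambda )^2}{3}, \qquad (A_f u)(0) = \frac{1}{3}, \end{aligned}$$

and therefore

$$\begin{aligned} (B_f u)(x) = \frac{(1-\lambda _0(x))^2}{3} - \frac{1-\lambda _0(x)}{3} = -\frac{\lambda _0(x)\,[1-\lambda _0(x)]}{3} = -\frac{(1-|x|)\,|x|}{3}, \quad x \in {\varOmega }_f. \end{aligned}$$

In this example, \(B_f u\) vanishes where \(\lambda _0 = 0\), i.e., on the boundary of \({\varOmega }_f\) and outside it, it satisfies \((B_f u)({x}_0) = 0 = u({x}_0)\), and it is a piecewise polynomial of degree two, the same degree as \(u\).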

For simplexes \(f\) of higher dimension, the operators \(B_f\) will be constructed recursively by a process of the form

$$\begin{aligned} B_fu = C_f\left( u - \sum _{\mathop {\dim g < \dim f}\limits ^{g \in {\varDelta }({\fancyscript{T}})}} B_gu\right) , \end{aligned}$$

where \(C_f\) is a local trace preserving cutoff operator, i.e., designed such that \(C_fv\) is close to \(v\) near \(f\), but at the same time \(C_fv\) vanishes outside \({\varOmega }_f\). To also have \(C_fv\) in \(H^1\) will in general require compatibility conditions of \(v\) on \(\partial f \subset \partial {\varOmega }_f\). We will return to the precise definition of the operators \(B_f\) and \(C_f\) in Sect. 4 below.

2.2 Barycentric Coordinates

If \({x}_j \in {\varDelta }_0({\fancyscript{T}})\) is a vertex, then \(\lambda _j({x}) \in {\fancyscript{P}}_1({\fancyscript{T}})\) is the corresponding barycentric coordinate, extended by zero outside the corresponding macroelement. If \(f \in {\varDelta }_m({\fancyscript{T}})\) has vertices \({x}_0, {x}_1, \ldots , {x}_m\), then we write \([{x}_0,{x}_1, \ldots ,{x}_m]\) to denote convex combinations, i.e.,

$$\begin{aligned} f= [{x}_0,{x}_1, \ldots ,{x}_m] = \left\{ \, {x}= \sum _{j=0}^m \alpha _j{x}_j \, | \, \sum _j \alpha _j =1, \, \alpha _j \ge 0 \, \right\} . \end{aligned}$$

The corresponding vector field \((\lambda _0, \lambda _1, \ldots , \lambda _m)\) with values in \(\mathbb {R}^{m+1}\) is denoted \({\lambda }_f\). Hence, the map \({x}\mapsto {\lambda }_f({x})\), restricted to \(f\), is a one-to-one map of \(f\) onto \({\fancyscript{S}}_m\), where

$$\begin{aligned} {\fancyscript{S}}_m = \left\{ \, {\lambda }= (\lambda _0, \ldots ,\lambda _m) \in \mathbb {R}^{m+1} \, | \, \sum _{j=0}^m \lambda _j = 1, \, \lambda _j \ge 0 \, \right\} . \end{aligned}$$

To the simplex \({\fancyscript{S}}_m\), we associate the simplex \({\fancyscript{S}}_m^c = [{\fancyscript{S}}_m,0]\), given by

$$\begin{aligned} {\fancyscript{S}}_m^c = \left\{ \, {\lambda }= (\lambda _0, \ldots ,\lambda _m) \in \mathbb {R}^{m+1} \, | \, \sum _{j=0}^m \lambda _j \le 1, \, \lambda _j \ge 0 \, \right\} . \end{aligned}$$

Hence, \({\fancyscript{S}}_m\) is an \(m\) dimensional subsimplex of \({\fancyscript{S}}_m^c\). For \({\lambda }\in {\fancyscript{S}}_m^c\), we define

$$\begin{aligned} b({\lambda }) = b_m({\lambda }) = 1 - \sum _{j=0}^m \lambda _j, \end{aligned}$$

i.e., corresponding to the barycentric coordinate of the origin.

If \(f = [{x}_0, {x}_1, \ldots ,{x}_m] \in {\varDelta }_m({\fancyscript{T}})\), then the macroelements \({\varOmega }_f\) and \({\varOmega }_f^e\) are given by

$$\begin{aligned} {\varOmega }_f = \bigcap _{j= 0}^{m} {\varOmega }_{x_j} \quad \text {and } {\varOmega }_f^e = \bigcup _{j= 0}^{m} {\varOmega }_{x_j}. \end{aligned}$$

The map \({x}\mapsto {\lambda }_f({x})\) maps \({\varOmega }\) to \({\fancyscript{S}}_m^c\), \(f\) to \({\fancyscript{S}}_m\), and the boundary \(\partial {\varOmega }_f\) to \(\partial {\fancyscript{S}}_m^c \setminus {\fancyscript{S}}_m\), cf. Fig. 4. In particular, \({\varOmega }\setminus {\varOmega }_f^e\) is mapped to the origin.

Fig. 4 The map \({x}\mapsto {\lambda }_f({x})\) for \(n=2\) and \(m=1\)

For each \(f = [{x}_0, {x}_1, \ldots ,{x}_m] \in {\varDelta }_m({\fancyscript{T}})\), we also introduce the piecewise linear function \(\rho _f\) on \({\varOmega }\) by

$$\begin{aligned} \rho _f({x})= 1 - \sum _{j=0}^m \lambda _j({x}) = b({\lambda }_f({x})). \end{aligned}$$

As a consequence, the simplex \(f\) can be characterized as the null set of \(\rho _f\), while \(\rho _f \equiv 1\) on \({\varOmega }\setminus {\varOmega }_f^e\).
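The map \({x}\mapsto {\lambda }_f({x})\) and the function \(\rho _f\) can be illustrated on a single triangle. The sketch below is restricted to one element \(T \subset {\varOmega }_f\) in two dimensions, with \(f\) spanned by the first \(m+1\) vertices of \(T\); the helper names `barycentric` and `rho` are illustrative and not taken from the text.

```python
def barycentric(tri, x):
    """Barycentric coordinates of the point x with respect to tri = (x0, x1, x2)."""
    (ax, ay), (bx, by), (cx, cy) = tri
    det = (bx - ax) * (cy - ay) - (cx - ax) * (by - ay)
    # Solve x = x0 + l1 * (x1 - x0) + l2 * (x2 - x0) by Cramer's rule.
    l1 = ((x[0] - ax) * (cy - ay) - (cx - ax) * (x[1] - ay)) / det
    l2 = ((bx - ax) * (x[1] - ay) - (x[0] - ax) * (by - ay)) / det
    return (1.0 - l1 - l2, l1, l2)

def rho(tri, m, x):
    """rho_f(x) = 1 - sum_{j <= m} lambda_j(x) for f = [x_0, ..., x_m], x in tri."""
    lam = barycentric(tri, x)
    return 1.0 - sum(lam[: m + 1])

tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
print(rho(tri, 1, (0.3, 0.0)))    # a point on the edge f = [x0, x1]
print(rho(tri, 1, (0.25, 0.25)))  # an interior point of the triangle
```

On the edge \(f = [{x}_0,{x}_1]\) the value of \(\rho _f\) is zero, while inside the triangle it coincides with the barycentric coordinate of the opposite vertex \({x}_2\).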

For each integer \(m \ge 0\), we let \({\fancyscript{I}}_m\) be the set of all subindexes of \((0,1,\ldots ,m)\), i.e., \({\fancyscript{I}}_m\) corresponds to all subsets of \(\{0,1,\ldots ,m \}\), where the ordering of the elements is disregarded. In particular, we count the empty set as an element of \({\fancyscript{I}}_m\), such that \({\fancyscript{I}}_m\) is a finite set with \(2^{m+1}\) elements. We will use \(|I|\) to denote the cardinality of \(I\). If \( 0 \le i \le m\) is an integer, then there are exactly \(2^{m}\) elements of \({\fancyscript{I}}_m\) which contain \(i\), and \(2^{m}\) elements which do not contain \(i\). For any \(I \in {\fancyscript{I}}_m\), we define \({P}_I : {\fancyscript{S}}_m^c \rightarrow {\fancyscript{S}}_m^c\) by

$$\begin{aligned} ({P}_I{\lambda })_i = \left\{ \begin{array}{ll} 0, \quad i \in I,\\ \lambda _i, \quad i \notin I. \end{array} \right. \end{aligned}$$

Hence, if \(I\) is nonempty, then \({P}_I\) maps the simplex \({\fancyscript{S}}_m^c\) to a portion of its boundary. In particular, if \(I =\{0,1,\ldots ,m \}\), then \({P}_I\) maps \({\fancyscript{S}}_m^c\) into the origin of \(\mathbb {R}^{m+1}\), while \({P}_I\) is the identity if \(I\) is the empty set. Finally, for any \(f \in {\varDelta }_m({\fancyscript{T}})\) and \(I \in {\fancyscript{I}}_m\) we let \(f(I) \in {\varDelta }(f)\) denote the corresponding subsimplex of \(f\) given by \(f(I) = \{ {x}\in f \, | \, {P}_I{\lambda }_f({x}) = {\lambda }_f({x}) \, \}.\) Hence, if \(I\) is the empty set, then \(f(I) = f\), while \(f(I)\) is the empty subsimplex of \(f\) if \(I = (0,1, \ldots ,m) \in {\fancyscript{I}}_m\).

3 Tools for the Construction

The key tools for the construction are two families of operators, referred to as trace preserving cutoff operators and local averaging operators.

3.1 The Trace Preserving Cutoff Operator on \({\fancyscript{S}}_m^c\)

Let \(w\) be a real-valued function defined on \({\fancyscript{S}}_m^c\). For the discussion in this section, we will assume that \(w\) is sufficiently regular to justify the operations below in a pointwise sense. We will introduce an operator \(K= K_m\), which maps such functions \(w\) into a new function on \({\fancyscript{S}}_m^c\), with the property that the trace on \({\fancyscript{S}}_m\) is preserved, but such that the trace of \(K_mw\) vanishes on the rest of the boundary of \({\fancyscript{S}}_m^c\). In fact, the operator \(K_m\) strongly resembles the extension operators discussed in [12], where the construction utilizes correction terms associated with the various subsimplexes of \({\fancyscript{S}}_m^c\). However, in the present setting, where we will be working with functions which may not have a trace on \({\fancyscript{S}}_m\), trace preserving operators seem to be a more useful concept. The operator \(K_m\) can be viewed as a sum of pullbacks, weighted by rational coefficients. However, the operator \(K_m\) preserves polynomials in an appropriate sense, cf. Lemma 3.1 below. The operator \(K_m\) is defined by

$$\begin{aligned} K_mw({\lambda }) = \sum _{I \in {\fancyscript{I}}_m}(-1)^{|I|}K_m^Iw = \sum _{I \in {\fancyscript{I}}_m}(-1)^{|I|}\frac{b({\lambda })}{b({P}_I{\lambda })}w({P}_I{\lambda }), \quad {\lambda }\in {\fancyscript{S}}_m^c. \end{aligned}$$

When \(m=0\), the set \({\fancyscript{I}}_0\) has only two elements, the empty set and \((0)\). Therefore, the operator \(K_0\) maps functions \(w= w(\lambda )\), defined on \({\fancyscript{S}}_0^c = [0,1]\), to

$$\begin{aligned} K_0w(\lambda ) = w(\lambda ) - (1- \lambda )w(0), \end{aligned}$$

such that (2.1) can be rewritten as \(B_fu = (K_0 \circ A_f)u(\lambda _0(\cdot ))\). We observe that \(K_0w(1) = w(1)\), \(K_0w(0) = 0\), and if \(w \in {\fancyscript{P}}_r\) then \(K_0w \in {\fancyscript{P}}_r\). Formally, we can also argue that \({\text {tr}}_{{\fancyscript{S}}_m}(w -K_mw) = 0\) for \(m\) greater than zero. This easily follows since all the terms in the sum defining \(K_m\), except for the one corresponding to \(I = \emptyset \), i.e., \(I\) is the empty set, have vanishing trace on \({\fancyscript{S}}_m\) due to the appearance of the term \(b({\lambda })\) in the numerator. A corresponding argument also shows that the trace of \(K_mw\) vanishes on the rest of the boundary of \({\fancyscript{S}}_m^c\). Recall that the boundary of \({\fancyscript{S}}_m^c\) consists of \({\fancyscript{S}}_m\) and the subsimplexes

$$\begin{aligned} {\fancyscript{S}}_{m,i}= \{ {\lambda }\in {\fancyscript{S}}_m^c \, |\, \lambda _i = 0 \, \} \quad i=0,1, \ldots ,m. \end{aligned}$$

Furthermore, for a fixed \(i\), let \(I \in {\fancyscript{I}}_m\) be any index such that \(i \notin I\), and let \(I' \in {\fancyscript{I}}_m\) be given as \(I'= I \cup \{i \}\). For \({\lambda }\in {\fancyscript{S}}_{m,i}\), we have \({P}_{I'}{\lambda }= {P}_I{\lambda }\), and therefore,

$$\begin{aligned} K_m^Iw({\lambda }) - K_m^{I'}w({\lambda }) = \frac{b({\lambda })}{b({P}_I{\lambda })}w({P}_I{\lambda }) - \frac{b({\lambda }) }{b({P}_{I'}{\lambda })}w({P}_{I'}{\lambda })= 0. \end{aligned}$$

However, for a fixed \(i\), the set \({\fancyscript{I}}_m\) is exactly equal to the union of indexes of the form \(I\) and \(I'\). As a consequence, we conclude that \(K_mw\) is identically zero on \({\fancyscript{S}}_{m,i}\) and hence on \(\partial {\fancyscript{S}}_m^c \setminus {\fancyscript{S}}_m\). In particular, \(K_mw\) is zero at the origin.
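The cancellation argument can be checked numerically. The following sketch evaluates \(K_mw\) pointwise for a function \(w\) given as a Python callable on tuples \({\lambda }= (\lambda _0, \ldots , \lambda _m)\). It assumes that \(b({P}_I{\lambda }) \ne 0\) for every nonempty \(I\) at the evaluation point, so points on the boundary of \({\fancyscript{S}}_m\) are excluded; the names `b`, `project` and `cutoff` are hypothetical.

```python
from itertools import combinations

def b(lam):
    """b(lambda) = 1 - sum_j lambda_j."""
    return 1.0 - sum(lam)

def project(I, lam):
    """P_I lambda: the components with index in I are set to zero."""
    return tuple(0.0 if i in I else x for i, x in enumerate(lam))

def cutoff(w, lam):
    """Evaluate (K_m w)(lambda) as a signed sum over all subsets I of {0, ..., m}."""
    m = len(lam) - 1
    total = w(lam)  # the term for the empty set has coefficient one
    for k in range(1, m + 2):
        for I in combinations(range(m + 1), k):
            p = project(I, lam)
            total += (-1) ** k * b(lam) / b(p) * w(p)
    return total

w = lambda lam: 1.0 + lam[0] + 2.0 * lam[1] ** 2
print(cutoff(w, (0.0, 0.3)))                     # on the face lambda_0 = 0
print(cutoff(w, (0.25, 0.75)), w((0.25, 0.75)))  # on S_1, where b(lambda) = 0
```

On the faces \(\lambda _i = 0\) the terms cancel in pairs \(I\), \(I'\), while on \({\fancyscript{S}}_1\) only the term for the empty set survives, so the value of \(w\) is reproduced.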

The operator \(K_m\) preserves polynomials in the following sense.

Lemma 3.1

Assume that \(w \in {\fancyscript{P}}_r({\fancyscript{S}}_m^c)\) with \({\text {tr}}_{{\fancyscript{S}}_m} w \in \mathring{\fancyscript{P}}_r({\fancyscript{S}}_m)\). Then \(K_m w \in {\fancyscript{P}}_r({\fancyscript{S}}_m^c)\), \({\text {tr}}_{{\fancyscript{S}}_m}(K_mw -w) = 0\), and \({\text {tr}}_{\partial {\fancyscript{S}}_m^c\setminus {\fancyscript{S}}_m} K_mw = 0\).

Proof

Assume that \(w \in {\fancyscript{P}}_r({\fancyscript{S}}_m^c)\), such that \({\text {tr}}_{{\fancyscript{S}}_m}w\) vanishes on the boundary of \({\fancyscript{S}}_m\). To show that \(K_m w \in {\fancyscript{P}}_r({\fancyscript{S}}_m^c)\), we consider each term in the sum defining \(K_mw\) of the form

$$\begin{aligned} K_m^Iw({\lambda }) := \frac{b({\lambda })}{b({P}_I{\lambda })}w({P}_I{\lambda }). \end{aligned}$$

If \(I = \emptyset \), then \(K_m^Iw = w\), while if \(I\) is the maximal set, \(I = (0,1,\ldots ,m)\), then \(K_m^Iw({\lambda }) = b({\lambda })w(0,\ldots ,0)\), which is linear. Therefore, it is enough to consider the other choices of \(I\), i.e., when \(K_m^Iw\) has an essential rational coefficient \(b({\lambda })/b({P}_I{\lambda })\).

Note that since \({\text {tr}}_{{\fancyscript{S}}_m}w\) vanishes on the boundary of \({\fancyscript{S}}_m\), we can conclude that \(w({P}_I{\lambda })\) vanishes on \(\{{\lambda }\in {\fancyscript{S}}_m^c\, |\, b({P}_I {\lambda }) = 0 \, \}\). This means that \(w({P}_I{\lambda })\) must be of the form \(w({P}_I{\lambda }) = b({P}_I{\lambda })w'({P}_I{\lambda })\), where \(w' \in {\fancyscript{P}}_{r-1}({\fancyscript{S}}_{m,I})\). Here

$$\begin{aligned} {\fancyscript{S}}_{m,I} = \{ {\lambda }\in {\fancyscript{S}}_m^c \, |\, {P}_I{\lambda }= {\lambda }\, \}. \end{aligned}$$

As a consequence, \(K_m^Iw = b({\lambda })w'({P}_I{\lambda }) \in {\fancyscript{P}}_r({\fancyscript{S}}_m^c)\). Furthermore, \({\text {tr}}_{{\fancyscript{S}}_m}K_mw = {\text {tr}}_{{\fancyscript{S}}_m} w\) since all the terms \(K_m^Iw\) have vanishing trace on \({\fancyscript{S}}_m\), except for the one corresponding to \(I = \emptyset \). Finally, the property that the trace of \(K_mw\) vanishes on the rest of the boundary of \({\fancyscript{S}}_m^c\) follows from the discussion given above. \(\square \)
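The statement of Lemma 3.1 can be checked by hand in a small case. The following is a minimal sketch for \(m = 1\) in exact rational arithmetic, assuming that \(K_mw\) is the signed sum \(\sum _I (-1)^{|I|}K_m^Iw\) over all subsets \(I\) of \(\{0,\ldots ,m\}\), that \({P}_I\) sets the components indexed by \(I\) to zero, and that \(b({\lambda }) = 1 - \sum _j \lambda _j\). The function names `b`, `project`, `K` and the test polynomial `w` are illustrative choices, not notation from the paper.

```python
from fractions import Fraction as F
from itertools import combinations

def b(lam):
    """b(lambda) = 1 - sum_j lambda_j."""
    return 1 - sum(lam)

def project(lam, I):
    """P_I lambda: set the components indexed by I to zero (assumed meaning of P_I)."""
    return tuple(F(0) if j in I else lam[j] for j in range(len(lam)))

def K(w, lam):
    """Evaluate K_m w at lam as the signed sum of the terms K_m^I w.

    Points where some b(P_I lam) vanishes (e.g. the vertices of S_m) are not handled.
    """
    total = F(0)
    for size in range(len(lam) + 1):
        for I in combinations(range(len(lam)), size):
            p = project(lam, I)
            # For I empty the coefficient b(lam)/b(P_I lam) is identically 1.
            coeff = F(1) if size == 0 else b(lam) / b(p)
            total += (-1) ** size * coeff * w(p)
    return total

def w(lam):
    """Degree-2 example whose trace on S_1 (where b = 0) vanishes at both endpoints."""
    return lam[0] * lam[1] + b(lam) * (1 + lam[0])

interior_value = K(w, (F(1, 4), F(1, 4)))     # a point inside S_1^c
on_face_0 = K(w, (F(0), F(1, 2)))             # a point with lambda_0 = 0
on_simplex = K(w, (F(1, 3), F(2, 3)))         # a point of S_1
```

For this particular `w`, expanding the four terms by hand gives \(K_1w = \lambda _0\lambda _1\), which has degree 2, agrees with `w` where \(b = 0\), and vanishes where \(\lambda _0 = 0\) or \(\lambda _1 = 0\), as the lemma predicts.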

3.2 The Local Averaging Operator

Throughout this section, we will assume that \(f = [{x}_0,{x}_1, \ldots ,{x}_m] \in {\varDelta }_m({\fancyscript{T}})\), where \(0 \le m < n\). For \(v \in L^2({\varOmega }_f)\) and \({\lambda }\in {\fancyscript{S}}_m^c\), we let \(A_fv({\lambda })\) be given by

$$\begin{aligned} A_fv({\lambda }) = -\!\!\!\!\!\!\int _{{\varOmega }_f} v\Big ({y}+ \sum _{j=0}^m \lambda _j({x}_j -{y})\Big ) \, \hbox {d}{y}, \end{aligned}$$

where the slash through an integral means an average, i.e., \(-\!\!\!\!\!\!\int _{{\varOmega }_f}\) should be interpreted as \(|{\varOmega }_f|^{-1}\int _{{\varOmega }_f}\). This operator is a generalization of the corresponding operator introduced in Section 2 above in the special case when \(f\) is a vertex. If \({\lambda }\in {\fancyscript{S}}_m\), then the integrand is independent of \({y}\), and therefore, \(A_fv({\lambda }) = v({x})\), where \({x}= \sum _j \lambda _j{x}_j \in f\). Hence, at least formally, the operator \(\lambda _f^* \circ A_f\), which is given by \(v \mapsto A_fv({\lambda }_f({\cdot }))\), is the identity operator on \(f\). We will find it convenient to introduce the function \({G}= {G}_m: {\fancyscript{S}}_m^c\times {\varOmega }_f \rightarrow {\varOmega }_f\) given by

$$\begin{aligned} {G}_m({\lambda },{y}) = {y}+ \sum _{j=0}^m \lambda _j({x}_j -{y}) = \sum _{j=0}^m \lambda _j {x}_j + b({\lambda }){y}, \quad {\lambda }\in {\fancyscript{S}}_m^c, \, {y}\in {\varOmega }_f, \end{aligned}$$

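so that the operator \(A_f\) can be expressed as

$$\begin{aligned} A_fv({\lambda }) = -\!\!\!\!\!\!\int _{{\varOmega }_f} v({G}_m({\lambda },{y})) \, \hbox {d}{y}. \end{aligned}$$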

In fact, we observe that for each \({y}\in {\varOmega }_f\), the map \({G}_m({\cdot }, {y})\) maps \({\fancyscript{S}}_m^c\) to \({\varOmega }_f\), and the operator \(A_f\) is simply the average with respect to \({y}\) of the pullbacks with respect to these maps. It is a property of the map \({G}_m\) that if \({y}\in T\), where \( T \in {\fancyscript{T}}_f\), then \({G}_m({\lambda },{y}) \in T\). In fact, \({G}_m({\lambda },{y})\) is a convex combination of \({y}\) and \(\left( \sum _i\lambda _i \right) ^{-1}\sum _i\lambda _i{x}_i \in f\).

A key property of the operator \(A_f\) is that it maps the piecewise polynomial spaces \(W_r({\fancyscript{T}}_f)\) into the polynomial spaces \({\fancyscript{P}}_r({\fancyscript{S}}_m^c)\).

Lemma 3.2

If \(v \in W_r({\fancyscript{T}})\), then \(A_f v \in {\fancyscript{P}}_r({\fancyscript{S}}_m^c)\). Furthermore, if \({\lambda }\in {\fancyscript{S}}_m\), then \(A_f v({\lambda }) = v({x})\), where \({x}= \sum _{j=0}^m \lambda _j{x}_j \in f\).

Proof

If \(v \in W_r({\fancyscript{T}})\), then the restriction of \(v\) to each simplex \(T \in {\fancyscript{T}}_f\) is a polynomial of degree at most \(r\). Furthermore, the map \({y}\mapsto {G}_m({\lambda }, {y})\) maps each \(T\) to itself and depends linearly on \({\lambda }\). Therefore, \(v({G}_m({\lambda },{y})) \in {\fancyscript{P}}_r({\fancyscript{S}}_m^c)\) for each fixed \({y}\). Taking the average over \({\varOmega }_f\) with respect to \({y}\) preserves this property, so \(A_f v \in {\fancyscript{P}}_r({\fancyscript{S}}_m^c)\). The second result follows from the fact that the integrand is independent of \({y}\) and equal to \(v(\sum _j \lambda _j{x}_j)\), for \({\lambda }\in {\fancyscript{S}}_m \). \(\square \)
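The mechanism of Lemma 3.2 is easy to see in one dimension. The sketch below assumes the simplest setting \(n = 1\), \(m = 0\), with the vertex \(f = \{0\}\) and the macroelement \({\varOmega }_f = (-1,1)\) consisting of the two cells \([-1,0]\) and \([0,1]\); these choices and the names `X0`, `PATCH`, `G`, `A` are hypothetical. The average is computed by a midpoint rule on an even number of equal subintervals, which is exact for functions that are linear on each of the two cells.

```python
from fractions import Fraction as F

X0 = F(0)                      # the vertex f
PATCH = (F(-1), F(1))          # the macroelement Omega_f, split at X0 into two cells

def G(lam, y):
    """G_0(lam, y) = y + lam * (x_0 - y)."""
    return y + lam * (X0 - y)

def A(v, lam, cells=8):
    """Average over Omega_f of v(G(lam, y)), by the midpoint rule.

    `cells` should be even so that no subinterval straddles the vertex X0.
    """
    lo, hi = PATCH
    h = (hi - lo) / cells
    mids = [lo + (k + F(1, 2)) * h for k in range(cells)]
    return sum(v(G(lam, y)) for y in mids) * h / (hi - lo)

# v(y) = |y| is continuous and piecewise linear with respect to the two cells.
values = [A(abs, lam) for lam in (F(0), F(1, 4), F(1, 2), F(1))]
```

For \(v({y}) = |{y}|\) one finds \(A_fv(\lambda ) = (1-\lambda )/2\), a polynomial of degree one in \(\lambda \), and \(A_fv(1) = v(0)\), in agreement with both statements of the lemma.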

We will also need mapping properties of the operator \(\lambda _f^* \circ A_f\). Since \({\lambda }_f\) maps all of \({\varOmega }\) into \({\fancyscript{S}}_m^c\), the operator \(\lambda _f^*\circ A_f\) maps a function \(v \in L^2({\varOmega }_f)\) to the function \(A_fv({\lambda }_f({\cdot }))\) defined on all of \({\varOmega }\). It is a key result that this operator is bounded in \(L^2\) and \(H^1\). In fact, we even have the following.

Lemma 3.3

Assume that \(f \in {\varDelta }_m({\fancyscript{T}})\) and \(I \in {\fancyscript{I}}_m\), with \(m<n\). The operator \(\lambda _f^* \circ P_I^* \circ A_f\) is bounded as an operator from \(L^2({\varOmega }_f)\) to \(L^2({\varOmega })\), as well as from \(H^1({\varOmega }_f)\) to \(H^1({\varOmega })\).

The arguments involved to establish these boundedness results are slightly more technical than the discussion above. Therefore, we will delay the proof of this lemma and the proofs of the next three results below to the final section of the paper.

As we have observed above, the operator \(\lambda _f^* \circ A_f\) formally preserves traces on \(f\). A weak formulation of this result is expressed by the following Hardy-type inequality.

Lemma 3.4

Assume that \(f \in {\varDelta }_m({\fancyscript{T}})\) with \(m <n\). Then

$$\begin{aligned} \int _{{\varOmega }} \rho _f^{-2}({x})|v({x}) - A_fv({\lambda }_f({x}))|^2 \, \hbox {d}{x}\le c \Vert v \Vert _1^2, \quad v \in H^1({\varOmega }), \end{aligned}$$

where the constant \(c = c({\varOmega },{\fancyscript{T}})\) is independent of \(v\).

Since the function \(\rho _f({x})\) is identically zero on \(f\), this result shows that for any \(v \in H^1({\varOmega }_f)\) “the error,” \(v - A_fv\), has a decay property near \(f\).

The next result shows that the operator \(\lambda _f^* \circ P_I^* \circ A_f\) preserves such decay properties.

Lemma 3.5

Assume that \(f \in {\varDelta }_m({\fancyscript{T}})\) and \(I \in {\fancyscript{I}}_m\), with \(m<n\), and let \(g = f(I) \in {\varDelta }(f)\). There is a constant \(c=c({\varOmega },{\fancyscript{T}})\), independent of \(v\), such that

$$\begin{aligned} \int _{{\varOmega }}\rho _g^{-2}({x}) |A_fv({P}_I{\lambda }_f({x}))|^2 \, \hbox {d}{x}\le c\, \Big [\int _{{\varOmega }}\rho _g^{-2}({x}) |v({x})|^2 \, \hbox {d}{x}+ \Vert {{\text {grad}}}v \Vert _{0}^2\Big ] \end{aligned}$$

for all \(v \in H^1({\varOmega })\), such that \(\rho _g^{-1} v \in L^2({\varOmega })\).

Finally, the following lemma will be a key ingredient in the proof of Lemma 4.2 to follow.

Lemma 3.6

Assume that \(f = [{x}_0,{x}_1, \ldots {x}_m] \in {\varDelta }_m({\fancyscript{T}})\) and \(I \in {\fancyscript{I}}_m\), with \(m<n\) and such that \(0 \notin I\). Furthermore, let \(I' = (0,I)\). Then

$$\begin{aligned} \int _{{\varOmega }} \lambda _0^{-2}({x})[A_fv({P}_I{\lambda }_f({x})) - A_fv({P}_{I'}{\lambda }_f({x}))]^2 \, \hbox {d}{x}\le c \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_f}^2, \quad v \in H^1({\varOmega }_f), \end{aligned}$$

where the constant \(c= c({\varOmega },{\fancyscript{T}})\) is independent of \(v\).

We remark that \(A_fv({P}_I{\lambda }_f({x})) - A_fv({P}_{I'}{\lambda }_f({x}))= 0\) outside \({\varOmega }_{x_0}\). Therefore, the integrand in the integral above should be considered to be zero outside \({\varOmega }_{x_0}\).

4 Precise Definitions and Main Results

The transform \({\fancyscript{B}}= {\fancyscript{B}}_{{\fancyscript{T}}}\) will be defined by an inductive process, which we now present.

4.1 Definition of the Transform

We will define the map \({\fancyscript{B}}\) by a recursion with respect to the dimension of subsimplexes \(f \in {\varDelta }({\fancyscript{T}})\). The map \({\fancyscript{B}}\) can be defined on the space \(L^2\), but the more interesting properties appear when it is restricted to \(H^1\). The main tools for constructing the operator \({\fancyscript{B}}\) are trace-preserving cutoff operators \(C_f\), which map functions defined on \({\varOmega }_f\) into functions defined on all of \({\varOmega }\). The operators \(C_f\) are defined by utilizing the corresponding operators \(K_m\) defined on \({\fancyscript{S}}_m^c\). If \(f \in {\varDelta }_m({\fancyscript{T}})\), with \(m <n\), then

$$\begin{aligned} C_f v = (\lambda _f^* \circ K_m \circ A_f) v = (K_m \circ A_f) v({\lambda }_f({\cdot })). \end{aligned}$$

A more detailed representation of the operator \(C_f\) is given by

$$\begin{aligned} C_f v({x}) = \sum _{I \in {\fancyscript{I}}_m}(-1)^{|I|}\frac{\rho _f({x})}{\rho _{f(I)}({x})}A_f v({P}_I{\lambda }_f({x})), \end{aligned}$$
(4.1)

where we recall that \(f(I)= \{ {x}\in f \, | \, {P}_I{\lambda }_f({x})= {\lambda }_f({x}) \, \}\). Observe that \({\lambda }_f \equiv (0,\ldots ,0)\) outside \({\varOmega }_f^e\) and that all functions of the form \(K_mw\) are zero at the origin in \(\mathbb {R}^{m+1}\). As a consequence, \({\text {supp}}(C_f v)\) is contained in the closure of \({\varOmega }_f^e\). For the final case when \(f \in {\varDelta }_n({\fancyscript{T}}) = {\fancyscript{T}}\), we simply define the operator \(C_f\) to be the restriction to \(f\), i.e., \(C_fv =v|_f\).

If \(f \in {\varDelta }_0({\fancyscript{T}})\), i.e., \(f\) is a vertex, then \(B_f = C_f\). More generally, for each \(f \in {\varDelta }_m({\fancyscript{T}})\) we define

$$\begin{aligned} B_fu = C_f u^m, \quad \text {where }u^m = \left( u - \sum _{\mathop {j<m}\limits ^{g \in {\varDelta }_j({\fancyscript{T}})}}B_gu\right) . \end{aligned}$$
(4.2)

Alternatively, the functions \(u^m\) satisfy \(u^0=u\) and the recursion

$$\begin{aligned} u^{m+1} = u^m - \sum _{f \in {\varDelta }_m({\fancyscript{T}})} C_f u^m = u^m - \sum _{f \in {\varDelta }_m({\fancyscript{T}})} B_f u. \end{aligned}$$

As a consequence of the definition of the operator \(C_f\) for \(\dim f = n\), it follows by construction that \(u = \sum _f B_fu\). Furthermore, from the corresponding property of the operator \(C_f\), it also follows that \({\text {supp}}(B_f u)\) is in the closure of \({\varOmega }_f^e\). Also, by Lemma 3.3 and from the fact that \(\rho _f/\rho _{f(I)} \le 1\), it follows directly that the operator \(B_f\) is bounded in \(L^2\). However, it is more challenging to establish that \(B_f\) is bounded in \(H^1\), and that \(B_fu \in \mathring{H}^1({\varOmega }_f)\) for \(u \in H^1({\varOmega })\).
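To make the recursion concrete, the following is a minimal sketch of the construction for \(n = 1\) on a hypothetical three-node mesh of \([0,1]\). It assumes that for a vertex \(f\) the formula (4.1) reduces to \(C_fv({x}) = A_fv(\lambda ({x})) - (1-\lambda ({x}))A_fv(0)\), where \(\lambda \) is the barycentric coordinate of the vertex extended by zero, and it replaces the exact average by a midpoint rule, which is exact only for piecewise linear \(v\). All names (`nodes`, `hat`, `average`, `cutoff`, `bubble_components`) are illustrative.

```python
from fractions import Fraction as F

nodes = [F(0), F(1, 2), F(1)]          # hypothetical 1D mesh of [0, 1]

def hat(i, x):
    """Barycentric coordinate of vertex i, extended by zero outside its patch."""
    xi = nodes[i]
    if i > 0 and nodes[i - 1] <= x <= xi:
        return (x - nodes[i - 1]) / (xi - nodes[i - 1])
    if i < len(nodes) - 1 and xi <= x <= nodes[i + 1]:
        return (nodes[i + 1] - x) / (nodes[i + 1] - xi)
    return F(0)

def average(i, v, lam, cells=4):
    """A_f v(lam) for the vertex f = nodes[i]: mean over the patch of v(G(lam, y))."""
    lo, hi = nodes[max(i - 1, 0)], nodes[min(i + 1, len(nodes) - 1)]
    total = F(0)
    for a, c in zip(nodes[:-1], nodes[1:]):
        if lo <= a and c <= hi:                    # mesh cell belongs to the patch
            h = (c - a) / cells
            for k in range(cells):
                y = a + (k + F(1, 2)) * h          # midpoint of a subinterval
                total += v(y + lam * (nodes[i] - y)) * h
    return total / (hi - lo)

def cutoff(i, v, x):
    """C_f v(x) = A_f v(lam(x)) - (1 - lam(x)) * A_f v(0) for a vertex f."""
    lam = hat(i, x)
    return average(i, v, lam) - (1 - lam) * average(i, v, F(0))

def bubble_components(u, x):
    """Values at x of the vertex components B_f u and of the remainder u^1."""
    vertex_parts = [cutoff(i, u, x) for i in range(len(nodes))]
    remainder = u(x) - sum(vertex_parts)           # u^1(x), carried by the cell containing x
    return vertex_parts, remainder

u = lambda x: x * x
parts_at_node, remainder_at_node = bubble_components(u, F(1, 2))
```

In this sketch the remainder \(u^1\) vanishes at every mesh node, so its restriction to each cell is an interior bubble, and each vertex component vanishes outside the patch of its vertex. For piecewise linear \(u\) the vertex component reduces to the nodal value times the hat function.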

4.2 Main Properties of the Transform

The main arguments needed for verifying the properties (i)–(iii) of the transform \({\fancyscript{B}}\), stated in Section 2 above, will be given here. We will first establish that the piecewise polynomial space, \(W_r({\fancyscript{T}})\), is preserved by the transform, i.e., we will show property (iii).

Theorem 4.1

If \(u \in W_r({\fancyscript{T}})\), then \(B_f u \in \mathring{W}_r({\fancyscript{T}}_f)\) for all \(f \in {\varDelta }({\fancyscript{T}})\).

Proof

Assume that \(u \in W_r({\fancyscript{T}})\). We will show that for all \(m\), \(0 \le m \le n\), the following properties hold:

$$\begin{aligned} u^m \in W_r({\fancyscript{T}}), \quad \text {with } {\text {tr}}_g u^m = 0, \quad g \in {\varDelta }_j({\fancyscript{T}}), \, j < m, \end{aligned}$$
(4.3)

and

$$\begin{aligned} B_g u \in \mathring{W}_r({\fancyscript{T}}_g), \, g \in {\varDelta }_j({\fancyscript{T}}), \, j < m. \end{aligned}$$
(4.4)

Here the function \(u^m\) is defined by (4.2). The proof of (4.3) and (4.4) goes by induction on \(m\). Note that for \(m=0\), these properties hold with \(u^0 = u\). Assume now that (4.3) and (4.4) hold for a given \(m\), \(m <n\). Let \( v \equiv u^m \in W_r({\fancyscript{T}})\). Then, for any \(f = [{x}_0,{x}_1, \ldots {x}_m]\in {\varDelta }_m({\fancyscript{T}})\), we have \({\text {tr}}_f v \in \mathring{\fancyscript{P}}_r(f)\). Therefore, it follows from Lemma 3.2 that

$$\begin{aligned} A_fv \in {\fancyscript{P}}_r({\fancyscript{S}}_m^c) \quad \text {and }{\text {tr}}_{{\fancyscript{S}}_m}A_fv \in \mathring{\fancyscript{P}}_r({\fancyscript{S}}_m). \end{aligned}$$

In fact, if \({\lambda }\in {\fancyscript{S}}_m\), then \(A_f v({\lambda }) = v({x})\), where \({x}= \sum _{j=0}^m \lambda _j{x}_j \in f\). But from Lemma 3.1, we can then conclude that

$$\begin{aligned} (K_m\circ A_f)v \in {\fancyscript{P}}_r({\fancyscript{S}}_m^c), \quad \!\! \text {with } {\text {tr}}_{{\fancyscript{S}}_m}(I - K_m)A_fv \!=\! 0, \quad \! {\text {tr}}_{\partial {\fancyscript{S}}_m^c\setminus {\fancyscript{S}}_m}(K_m\circ A_f)v \!=\! 0. \end{aligned}$$

However, this implies that

$$\begin{aligned} B_fu = C_f u^m = (K_m\circ A_f)v({\lambda }_f({\cdot })) \in \mathring{W}_r({\fancyscript{T}}_f), \end{aligned}$$

with \({\text {tr}}_f B_f u = {\text {tr}}_f u^m\). This property holds for all \(f \in {\varDelta }_m({\fancyscript{T}})\). Therefore, since

$$\begin{aligned} u^{m+1} = u^m - \sum _{f \in {\varDelta }_m({\fancyscript{T}})} B_fu, \end{aligned}$$

we can conclude that (4.3) and (4.4) hold with \(m\) replaced by \(m+1\). This completes the induction argument. In particular, we have shown that \(B_f u \in \mathring{W}_r({\fancyscript{T}}_f)\) for all \(f \in {\varDelta }_m({\fancyscript{T}})\), \(m < n\). Furthermore, \({\text {tr}}_f u^n = 0\) for all \(f \in {\varDelta }_{n-1}({\fancyscript{T}})\). This means that

$$\begin{aligned} u^n = \sum _{T \in {\fancyscript{T}}} u_T^n, \quad u_T^n \in \mathring{W}_r(T), \, T \in {\fancyscript{T}}. \end{aligned}$$

Since \(B_T u = u_T^n\) for any \(T \in {\varDelta }_n({\fancyscript{T}}) = {\fancyscript{T}}\), the proof is completed. \(\square \)

The next result will be a key step for showing properties (i) and (ii) of the transform.

Lemma 4.1

Assume that \(f \in {\varDelta }_m({\fancyscript{T}})\), with \(m <n\), and that \(v \in H^1({\varOmega }_f)\) with \(\rho _g^{-1}v \in L^2({\varOmega }_f)\), where \(g = f(I)\) for \(I \in {\fancyscript{I}}_m\). Define \(w = (\rho _f/\rho _g) A_fv({P}_I{\lambda }_f({\cdot }))\). Then \(w \in H^1({\varOmega })\) and \(\rho _f^{-1}w \in L^2({\varOmega })\).

Proof

Since \(g \in {\varDelta }(f)\), \(\rho _f/\rho _g \le 1\). Therefore, it follows directly from Lemma 3.3 that \(w \in L^2({\varOmega })\). We also have from Lemma 3.5 that

$$\begin{aligned} \int _{{\varOmega }}|\rho _f^{-1}w|^2 \, \hbox {d}{x}&= \int _{{\varOmega }}|\rho _g^{-1}A_fv({P}_I{\lambda }_f({x}))|^2 \, \hbox {d}{x}\\&\le c \Big [\int _{{\varOmega }_f}|\rho _g^{-1}v({x})|^2 \, \hbox {d}{x}+ \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_f}^2\Big ] < \infty , \end{aligned}$$

so the desired decay property of \(w\) follows. It remains to show that \(w \in H^1({\varOmega })\). From the identity

$$\begin{aligned} {{\text {grad}}}(\rho _f/\rho _g) = \rho _g^{-1}({{\text {grad}}}\rho _f - \frac{\rho _f}{\rho _g}{{\text {grad}}}\rho _g), \end{aligned}$$

we obtain that \(|{{\text {grad}}}(\rho _f/\rho _g)| \le c_0 \rho _g^{-1}\), where \(c_0 = c_0({\varOmega }, {\fancyscript{T}})\). Therefore, we can conclude that

$$\begin{aligned} \int _{{\varOmega }_f} |({{\text {grad}}}(\rho _f/\rho _g))A_fv({P}_I{\lambda }_f({x}))|^2 \, \hbox {d}{x}\le c_0^2\int _{{\varOmega }_f}|\rho _g^{-1}A_fv({P}_I{\lambda }_f({x}))|^2 \, \hbox {d}{x}. \end{aligned}$$

Together with the Leibniz rule and the result of Lemma 3.3, this implies that \(w \in H^1({\varOmega })\). This completes the proof. \(\square \)

Lemma 4.2

Let \(f \in {\varDelta }_m({\fancyscript{T}})\) with \({x}_0 \in {\varDelta }_0(f)\). Assume that \(v \in H^1({\varOmega }_f)\), with the property that \(\rho _g^{-1}v \in L^2({\varOmega }_f)\) for all \(g \in {\varDelta }_j(f)\), \(j< m\). Then \(\lambda _0^{-1} C_f v \in L^2({\varOmega })\).

Proof

Assume first that \(m <n\). Let \(I \in {\fancyscript{I}}_m\) be any index set such that \(0 \notin I\). Furthermore, let \(I' = (0,I) \in {\fancyscript{I}}_m\). In other words, \({x}_0 \in {\varDelta }(g)\) while \({x}_0 \notin {\varDelta }(g')\), where \(g = f(I)\) and \(g' = f(I')\). The desired result will follow if we can show that

$$\begin{aligned}&\lambda _0^{-1} \Big [\frac{\rho _f}{\rho _g}A_fv({P}_I{\lambda }_f({\cdot })) - \frac{\rho _f}{\rho _{g'}} A_fv({P}_{I'}{\lambda }_f({\cdot }))\Big ]\\&\quad = \lambda _0^{-1} \frac{\rho _f}{\rho _g}\Big [A_fv({P}_I{\lambda }_f({\cdot })) -A_fv({P}_{I'}{\lambda }_f({\cdot }))\Big ] + \frac{\rho _f}{\rho _g\rho _{g'}}A_fv({P}_{I'}{\lambda }_f({\cdot })) \in L^2({\varOmega }). \end{aligned}$$

However, Lemma 3.6 and the fact that \(\rho _f/\rho _g \le 1\) imply that the first term on the right-hand side is in \(L^2\). Furthermore, it follows by assumption that \(\rho _{g'}^{-1}v \in L^2\), and therefore, Lemma 3.5 implies that the second term is in \(L^2\).

If \(m=n\), then we recall that \(C_f v\) is just \(v\) restricted to \(f\). If \(f =[{x}_0,{x}_1, \ldots ,{x}_n]\) and \(g = [{x}_1, \ldots ,{x}_n]\), then \(\rho _g^{-1} v = \lambda _0^{-1} v \in L^2\) by assumption. This completes the proof. \(\square \)

The following result will be used to show that the components \(B_fu\) are elements of \(\mathring{H}^1({\varOmega }_f)\). The arguments given in the proof are closely related to characterizations of \(\mathring{H}^1\) spaces in terms of distance-weighted \(L^2\) norms, cf. for example [18, Chapter 1, Theorem 11.8].

Lemma 4.3

Let \(f = [{x}_0, {x}_1,\ldots ,{x}_m] \in {\varDelta }_m({\fancyscript{T}})\) and assume that \(v \in H^1({\varOmega }_f)\), with the property that \(\rho _g^{-1}v \in L^2({\varOmega }_f)\) for \(g \in {\varDelta }_j(f)\), \(j< m\). Define \(w = C_f v\). Then \(w|_{{\varOmega }_f} \in \mathring{H}^1({\varOmega }_f)\) and \(w \equiv 0\) on \({\varOmega }\setminus {\varOmega }_f\).

Proof

We first observe that \(w|_{{\varOmega }_f} \in H^1({\varOmega }_f)\). This is obvious if \(m = n\), while for \(m <n\) it follows from Lemma 4.1 that all the terms in the series of \((K_m \circ A_f)v({\lambda }_f({\cdot }))\) have this property. To show that \(w \in \mathring{H}^1({\varOmega }_f)\), it is enough to show that for any vertex \({x}_0\) of \(f\), \(w \in \mathring{H}^1({\varOmega }_{x_0})\). Since the numbering of the vertices of \(f\) is arbitrary, this will in fact imply that

$$\begin{aligned} w \in \cap _{j=0}^m \mathring{H}^1({\varOmega }_{x_j}) = \mathring{H}^1({\varOmega }_f). \end{aligned}$$

However, the property that \(w \in \mathring{H}^1({\varOmega }_{x_0})\) is a consequence of the decay results expressed in Lemma 4.2, i.e., that \(\lambda _0^{-1}w \in L^2\). For any \(\varepsilon > 0\), let \(\phi _{\varepsilon }\) be a smooth function on \(\mathbb {R}\) such that \(\phi _{\varepsilon } \equiv 0\) on \((-\varepsilon /2,\varepsilon /2)\), \(\phi _{\varepsilon } \equiv 1\) on the complement of \((-\varepsilon ,\varepsilon )\), and such that \(\phi '_{\varepsilon }(\lambda )\lambda \) is uniformly bounded, i.e.,

$$\begin{aligned} |\phi '_{\varepsilon }(\lambda )| \le c/|\lambda |, \quad \frac{\varepsilon }{2} \le |\lambda | \le \varepsilon , \end{aligned}$$
(4.5)

for some constant \(c\). By construction, the functions \(v_{\varepsilon }\equiv \phi _{\varepsilon }(\lambda _0({\cdot })) w\) are in \(\mathring{H}^1({\varOmega }_{x_0})\), and to show that \(w\) belongs to the same space, it is enough to show that the \(v_{\varepsilon } \) converge to \(w\), as \(\varepsilon \) tends to zero, in \(H^1({\varOmega }_{x_0})\). However,

$$\begin{aligned} \int _{{\varOmega }_{x_0}}|v_{\varepsilon } - w|^2 \, \hbox {d}{x}= \int _{{\varOmega }_{x_0}}|[\phi _{\varepsilon }(\lambda _0({\cdot })) - 1]w|^2 \, \hbox {d}{x}\le \int _{{\varOmega }_{x_0,\varepsilon }} |w|^2 \, \hbox {d}{x}\rightarrow 0, \end{aligned}$$

where \({\varOmega }_{x_0,\varepsilon } = \{{x}\in {\varOmega }_{x_0} \, | \, \lambda _0({x}) \le \varepsilon \, \}\). This shows \(L^2\) convergence. Furthermore,

$$\begin{aligned} \int _{{\varOmega }_{x_0}}|{{\text {grad}}}(v_{\varepsilon } - w)|^2 \, \hbox {d}{x}\le 2 \int _{{\varOmega }_{x_0,\varepsilon }}|{{\text {grad}}}w|^2 \, \hbox {d}{x}+ 2 \int _{{\varOmega }_{x_0,\varepsilon }}|({{\text {grad}}}[\phi _{\varepsilon }(\lambda _0({\cdot }))]) w|^2 \, \hbox {d}{x}. \end{aligned}$$

The first term goes to zero by the \(H^1\) boundedness of \(w\), and as a consequence of (4.5) and the \(L^2\) property of \(\lambda _0^{-1}w\) established in Lemma 4.2, the second term goes to zero with \(\varepsilon \). By completeness of \(\mathring{H}^1({\varOmega }_{x_0})\), it follows that \(w \in \mathring{H}^1({\varOmega }_{x_0})\), and therefore, it is in \(\mathring{H}^1({\varOmega }_f)\).
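For illustration, one admissible choice of the cutoff functions, not necessarily the one intended in the argument above, is obtained by scaling a single profile. Assume that \(\psi \) (an auxiliary symbol introduced only here) is a smooth function on \([0,\infty )\) with \(\psi \equiv 0\) on \([0,1/2]\) and \(\psi \equiv 1\) on \([1,\infty )\), and set \(\phi _{\varepsilon }(\lambda ) = \psi (|\lambda |/\varepsilon )\). Then (4.5) holds with \(c = \sup |\psi '|\), and the estimate of the second gradient term can be written out as

$$\begin{aligned} |\phi '_{\varepsilon }(\lambda )|&= \varepsilon ^{-1}|\psi '(|\lambda |/\varepsilon )| \le \frac{c}{|\lambda |}, \quad \frac{\varepsilon }{2} \le |\lambda | \le \varepsilon ,\\ \int _{{\varOmega }_{x_0,\varepsilon }}|({{\text {grad}}}[\phi _{\varepsilon }(\lambda _0({\cdot }))]) w|^2 \, \hbox {d}{x}&\le c^2 \sup _{{\varOmega }_{x_0}}|{{\text {grad}}}\lambda _0|^2 \int _{{\varOmega }_{x_0,\varepsilon }}|\lambda _0^{-1} w|^2 \, \hbox {d}{x}\rightarrow 0 \quad \text {as } \varepsilon \rightarrow 0. \end{aligned}$$

The supremum is finite because \(\lambda _0\) is piecewise linear, and the last integral tends to zero since \(\lambda _0^{-1}w \in L^2\) and the sets \({\varOmega }_{x_0,\varepsilon }\) shrink to a set of measure zero.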

We recall from the definition of the operator \(C_f\) that \(w\) is identically zero on \({\varOmega }\setminus {\varOmega }_f^e\). Hence, it remains to show that \(w\) is identically zero on \({\varOmega }_f^e \setminus {\varOmega }_f\) when \(m <n\). However, at each point in \({\varOmega }_f^e \setminus {\varOmega }_f\), at least one of the extended barycentric coordinates associated with \(f\) is zero. Therefore, \(w\) in this region corresponds to a pullback of \(w\) from \(\partial {\fancyscript{S}}_m^c\setminus {\fancyscript{S}}_m\), and this is zero since \({\text {tr}}_{\partial {\varOmega }_f}w = 0\). \(\square \)

Lemma 4.4

Let \(u \in H^1({\varOmega })\) and define the functions \(u^m\), \(0 \le m \le n\), by (4.2). Then \(u^m \in H^1({\varOmega })\) and \(\rho _f^{-1} u^m \in L^2({\varOmega })\) for all \(f \in {\varDelta }_j({\fancyscript{T}})\), \(j<m\).

Proof

The proof goes by induction on \(m\). For \(m=0\), the result holds with \(u^0 = u\). Furthermore, if the result holds for a given \(m <n\), then \(u^{m+1}\in H^1({\varOmega })\) by Lemma 4.3. It remains to show the decay property, i.e., that \(\rho _f^{-1}u^{m+1} \in L^2({\varOmega })\) for all \(f \in {\varDelta }_j({\fancyscript{T}})\) for \(j \le m\). For any \(f \in {\varDelta }_m({\fancyscript{T}})\) we have

$$\begin{aligned} \rho _f^{-1}(u^m - C_f u^m) = \rho _f^{-1}[u^m - A_fu^m({\lambda }_f({\cdot }))] - \rho _f^{-1}\sum _{\mathop {I \ne \emptyset }\limits ^{I \in {\fancyscript{I}}_m}} (-1)^{|I|}\frac{\rho _f}{\rho _{f(I)}} A_f u^m({P}_I{\lambda }_f({\cdot })). \end{aligned}$$

However, the first term on the right-hand side is in \(L^2\) as a consequence of Lemma 3.4, while Lemma 4.1 and the induction hypothesis imply that all the terms in the sum are in \(L^2\). We can therefore conclude that for \(f \in {\varDelta }_m({\fancyscript{T}})\), \(\rho _f^{-1}(u^m - C_f u^m)\) is in \(L^2({\varOmega })\). To show that \(\rho _f^{-1} u^{m+1}\) is in \(L^2\), we express this as

$$\begin{aligned} \rho _f^{-1} u^{m+1} = \rho _f^{-1}(u^m - C_f u^m) - \sum _{\mathop {g \ne f}\limits ^{g \in {\varDelta }_m({\fancyscript{T}})}}\rho _f^{-1} C_g u^m. \end{aligned}$$
(4.6)

Recall that by definition, \(C_g u^m\) is identically zero outside \({\varOmega }_g^e\). On the other hand, if \(g \in {\varDelta }_m({\fancyscript{T}})\) and \(g \ne f\), then on each \(T \in {\fancyscript{T}}\), such that \(f \cap T \ne \emptyset \) and \(g \cap T \ne \emptyset \), there exists a vertex \({x}_0 \in g\cap T\) which is not in \(f\). Then \(\lambda _0 \le \rho _f\) on \(T\), which implies that

$$\begin{aligned} |\rho _f^{-1} C_g u^m| \le |\lambda _0^{-1} C_g u^m| \quad \text {on } T. \end{aligned}$$

By repeating this for all \(T \subset {\varOmega }_f^e\), and by applying Lemma 4.2, we obtain that all the terms in the sum (4.6) are in \(L^2\). Since \(f \in {\varDelta }_m({\fancyscript{T}})\) is arbitrary, this shows the desired decay result for all \(f \in {\varDelta }_m({\fancyscript{T}})\). However, if \(g \in {\varDelta }(f)\), then \(\rho _g^{-1}({x}) \le \rho _f^{-1}({x})\), and therefore, \(\rho _f^{-1}u^{m+1} \in L^2\) for all \(f \in {\varDelta }_j({\fancyscript{T}})\), \(j\le m\). This completes the induction argument and therefore the proof of the lemma. \(\square \)

The following result shows that the transform satisfies properties (i) and (ii) above.

Theorem 4.2

Assume that \(u \in H^1({\varOmega })\). Then \( u = \sum _{f \in {\varDelta }({\fancyscript{T}})}B_f u\), where \(B_fu \in \mathring{H}^1({\varOmega }_f)\) for each \(f \in {\varDelta }({\fancyscript{T}})\). Furthermore, the transformation \({\fancyscript{B}}_{{\fancyscript{T}}} : H^1({\varOmega }) \rightarrow \bigoplus _{f \in {\varDelta }({\fancyscript{T}})}\mathring{H}^1({\varOmega }_f)\), with components \(B_f\), is bounded.

Proof

We have already seen that \( u = \sum _{f \in {\varDelta }({\fancyscript{T}})} B_f u\). Furthermore, it is a consequence of Lemmas 4.3 and 4.4 that each \(B_f u \in \mathring{H}^1({\varOmega }_f)\). Finally, the boundedness of the transformation can be seen by tracing the bounds derived in Lemmas 4.1–4.4 and by utilizing the finite overlap property of the covering \(\{{\varOmega }_f \}\) of \({\varOmega }\). \(\square \)

Corollary 1

The transform \({\fancyscript{B}}_{{\fancyscript{T}}}\) is \(L^2\) bounded, with \({\text {supp}}B_fu\) contained in the closure of \({\varOmega }_f\) for all \(u \in L^2({\varOmega })\).

Proof

We have already seen that \({\fancyscript{B}}_{{\fancyscript{T}}}\) is \(L^2\) bounded, with \({\text {supp}}B_fu\) contained in the closure of the extended macroelement \({\varOmega }_f^e\). However, due to the result of Theorem 4.2 and the density of \(H^1({\varOmega })\) in \(L^2({\varOmega })\), this implies that \({\text {supp}}B_fu\) is contained in the closure of \({\varOmega }_f\). \(\square \)

4.3 Construction of Projections

The result of Theorem 4.2 leads immediately to the construction of locally defined projections into the finite element spaces \(W_r({\fancyscript{T}})\), which are uniformly bounded with respect to the polynomial degree \(r\). We just project each component \(B_fu\) into the space \(\mathring{W}_r({\fancyscript{T}}_f)\) by a local projection \(Q_{f,r}\). More precisely, the locally defined global projections \(\pi = \pi _{{\fancyscript{T}},r}\) will be of the form

$$\begin{aligned} \pi u = \sum _{f \in {\varDelta }({\fancyscript{T}})}Q_{f,r} B_f u, \end{aligned}$$

where \(Q_{f,r}\) is a local projection onto \(\mathring{W}_r({\fancyscript{T}}_f)\). The operator \(\pi \) will be a projection as a result of Theorem 4.1. If \(Q_{f,r}\) is taken to be the local \(H^1\)-projection, with corresponding operator norm equal to one, then Theorem 4.2 implies that \(\pi \) will be uniformly bounded in \(H^1\) with respect to \(r\). On the other hand, if \(Q_{f,r}\) is taken to be the local \(L^2\)-projection, then Corollary 1 implies uniform \(L^2\) boundedness of \(\pi \) with respect to \(r\).

5 Proofs of Lemmas 3.3–3.6

To complete the paper, it remains to establish Lemmas 3.3–3.6, all related to properties of the averaging operators \(A_f\). Recall that it is a property of the triangulation \({\fancyscript{T}}\) of \({\varOmega }\) that the intersection of two elements of \({\fancyscript{T}}\) is either empty or a common subsimplex of each. It is a consequence of this that any simplex \(f \in {\varDelta }({\fancyscript{T}})\), which is not contained in the boundary \(\partial {\varOmega }\), has the property that all its interior points are also in the interior of \({\varOmega }\). In other words, any element of \({\varDelta }({\fancyscript{T}})\) is either contained in the boundary \(\partial {\varOmega }\) or all its interior points are interior points of \({\varOmega }\).

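Let \(f = [{x}_0,{x}_1, \ldots ,{x}_m] \in {\varDelta }_m({\fancyscript{T}})\) be as above. Throughout this section we assume that \(0 \le m < n\). If \(T \in {\fancyscript{T}}_f\), and \({\lambda }\in {\fancyscript{S}}_m^c\), we also let

$$\begin{aligned} A_{f,T}v({\lambda }) = -\!\!\!\!\!\!\int _{T} v({G}_m({\lambda },{y})) \, \hbox {d}{y}= \frac{1}{|T|}\int _{T} v({G}_m({\lambda },{y})) \, \hbox {d}{y}, \end{aligned}$$

such that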

$$\begin{aligned} A_fv = \sum _{T \in {\fancyscript{T}}_f}\frac{|T|}{|{\varOmega }_f|}A_{f,T}v. \end{aligned}$$

Before we derive more properties of the operator \(A_f\), we will make some observations, which will be useful below. A simple calculation shows that for any \(r \in \mathbb {R}\) we have

$$\begin{aligned} \int _{{\fancyscript{S}}_m^c} b({\lambda })^r \, \hbox {d}{\lambda }&= \int _{{\fancyscript{S}}_{m-1}^c}\int _0^{b({\lambda }')}(b({\lambda }') - \lambda _m)^r \, \hbox {d}\lambda _m \, \hbox {d}{\lambda }'\\&= \int _{{\fancyscript{S}}_{m-1}^c}\int _0^{b({\lambda }')}z^r \, \hbox {d}z\, \hbox {d}{\lambda }'\\&= \int _0^{1}z^r \int _{z \le b({\lambda }')} \, \hbox {d}{\lambda }' \, \hbox {d}z = |{\fancyscript{S}}_{m-1}^c| \int _0^{1}z^r(1 - z)^m \, \hbox {d}z. \end{aligned}$$

Hence, we can conclude that

$$\begin{aligned} \int _{{\fancyscript{S}}_m^c} b({\lambda })^r \, \hbox {d}{\lambda }< \infty , \quad \text {for } r > -1. \end{aligned}$$
(5.1)
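The threshold \(r > -1\) in (5.1) can be made explicit, since the last integral in the calculation above is a Beta integral. As a worked instance, using the standard facts that \(|{\fancyscript{S}}_{m-1}^c| = 1/m!\) and that \(\int _0^1 z^{a-1}(1-z)^{b-1} \, \hbox {d}z = \Gamma (a)\Gamma (b)/\Gamma (a+b)\) for \(a,b>0\), one obtains for \(m \ge 1\) and \(r > -1\)

$$\begin{aligned} \int _{{\fancyscript{S}}_m^c} b({\lambda })^r \, \hbox {d}{\lambda }= \frac{1}{m!}\int _0^{1}z^r(1 - z)^m \, \hbox {d}z = \frac{1}{m!} \frac{\Gamma (r+1)\, m!}{\Gamma (r+m+2)} = \frac{\Gamma (r+1)}{\Gamma (r+m+2)}. \end{aligned}$$

For \(m=1\) and \(r=0\) this gives \(1/2\), the area of the triangle \({\fancyscript{S}}_1^c\), while the factor \(\Gamma (r+1)\) blows up as \(r\) decreases to \(-1\), in agreement with the restriction in (5.1).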

If \(f = [{x}_0,{x}_1,\ldots ,{x}_m] \in {\varDelta }_m({\fancyscript{T}})\) and \(T\) is an element of \({\fancyscript{T}}_{f}\), we let \(f^*(T) \in {\varDelta }_{n-m-1}(T)\) be the face opposite \(f\). In other words, if \(T = [{x}_0,{x}_1, \ldots ,{x}_n]\), then

$$\begin{aligned} f^*(T) = [{x}_{m+1},\ldots , {x}_n] = \{ {x}\in T \, |\, \lambda _j({x}) = 0,\, j=0,1,\ldots ,m\, \}. \end{aligned}$$

Any point \({x}\in T\) can be written uniquely as a convex combination of \({x}_0, \ldots ,{x}_m\) and a point \({q}= {q}_f \in f^*(T)\), since

$$\begin{aligned} {x}= \sum _{j=0}^n \lambda _j({x}){x}_j = \sum _{j=0}^m \lambda _j({x}){x}_j + \rho _f({x}){q}_f({x}), \quad {q}_f({x}) = \sum _{j=m+1}^n \lambda _j({x})x_j/\rho _f({x}). \end{aligned}$$

Define \(f^* = \cup _{T \in {\fancyscript{T}}_{f}}f^*(T)\). Then \(f^* \subset \partial {\varOmega }_f\), and any \({x}\in {\varOmega }_f\) can be written as

$$\begin{aligned} {x}= \sum _{j=0}^m \lambda _j({x}){x}_j + \rho _f({x}){q}_f({x}), \quad {q}_f({x}) \in f^*. \end{aligned}$$
(5.2)

The set \(f^*\) can alternatively be characterized as \(f^* = \partial {\varOmega }_f^e \cap \partial {\varOmega }_f\). An illustration of the geometry of \(f\), \({\varOmega }_f\), and \(f^*\) is given in Fig. 5 below.

Fig. 5 The macroelement \({\varOmega }_f \subset \mathbb {R}^3\), where \(f\) is the line from \({x}_0\) to \({x}_1\) and \(f^*\) is the closed curve connecting \({x}_2,{x}_3,{x}_4\)

In fact, if \(m=n-1\) and \(f\) is an interior simplex, then \(f^*\) consists of two vertices in \({\varDelta }_0({\fancyscript{T}})\), while \(f^*\) is a single vertex if \(f \subset \partial {\varOmega }\). On the other hand, if \(m <n-1\) and \(f\) is not contained in the boundary, then \(f^*\) is a closed, connected and piecewise flat manifold of dimension \(n-m-1\). In the case when \(f \subset \partial {\varOmega }\), the manifold \(f^*\) is still connected.

Lemma 5.1

Assume that \(f \in {\varDelta }({\fancyscript{T}}) \cap \partial {\varOmega }\). Then \(f^*\) is connected.

Proof

Let \({q}_0\) and \({q}_1\) be two points on the manifold \(f^*\). We need to show that these points can be connected by a continuous curve in \(f^*\). For any \(s \in (0,1)\) the points \({y}_i\), \(i=0,1\), given by

$$\begin{aligned} {y}_i = \frac{1-s}{m+1}\sum _{j=0}^m {x}_j + s{q}_i \end{aligned}$$

are in \({\varOmega }_f\), and can be made arbitrarily close to the barycenter of \(f\), \({x}_f\), by choosing \(s\) sufficiently small. Since the polyhedral domain \({\varOmega }\) is, in particular, a Lipschitz domain, it follows that the two points \({y}_0\) and \({y}_1\) can be connected by a continuous curve

$$\begin{aligned} \{{y}(t)\, | \, t \in [0,1] \, \} \subset {\varOmega }, \end{aligned}$$

such that \({y}(0) = {y}_0\), \({y}(1) = {y}_1\). Furthermore, the curve can be made arbitrarily close to the barycenter \({x}_f\) by adjusting the parameter \(s\) and the chosen curve. However, since the barycenter is an interior point of \(f\) for \(m>0\), all points in \({\varOmega }\) which are sufficiently close to \({x}_f\) are also in \({\varOmega }_f\). Therefore, by applying the representation (5.2), we obtain that the curve \({y}(t)\) is of the form

$$\begin{aligned} {y}(t) = \sum _{j=0}^m \lambda _j({y}(t)){x}_j + b({\lambda }({y}(t) )){q}(t), \end{aligned}$$

where \({q}(t) = {q}_f({y}(t)) \in f^*\). Since \(\lambda _j({y}_0) = (1-s)/(m+1)\) for \(j= 0,1, \ldots , m\), and hence, \(b({\lambda }({y}_0)) = s\), it follows easily from the identities \({y}(0) = {y}_0\) and \({y}(1) = {y}_1\) that \({q}(0)= {q}_0\) and \({q}(1)= {q}_1\). This completes the proof. \(\square \)

The map \({x}\mapsto ({\lambda }_f({x}),{q}_f({x}))\) defines a map from \({\varOmega }_f\) to \({\fancyscript{S}}_m^c\times f^*\), with an inverse given by

$$\begin{aligned} ({\lambda },{q}) \mapsto {x}= {q}+ \sum _{j=0}^m \lambda _j ({x}_j - {q}) = {G}_m({\lambda },{q}). \end{aligned}$$
(5.3)

To express the derivative of the map, we write \({q}\in f^*(T)\) in the form \({q}= \hat{q} + \sum _{i=m+1}^{n-1} q_i {t}_i\), where \(\hat{q}\) is the barycenter of \(f^*(T)\) and \({t}_{m+1}, \ldots , {t}_{n-1} \in \mathbb {R}^n\) is an orthonormal basis for the tangent space of \(f^*(T)\). Then the derivative of the map (5.3), with respect to \({\lambda }\) and \({q}\), can be expressed as the \(n \times n\) matrix

$$\begin{aligned}{}[{x}_0-{q},{x}_1 -{q}, \ldots , {x}_m-{q}, b({\lambda }) {t}_{m+1}, \ldots , b({\lambda }) {t}_{n-1} ]. \end{aligned}$$

Hence, by the scaling rule for determinants and manipulating columns, the determinant of this matrix is equal to

$$\begin{aligned} b({\lambda })^{n-m-1}\det ([{x}_0-\hat{q}, \ldots , {x}_{m} - \hat{q}, {t}_{m+1}, \ldots , {t}_{n-1}]) := b({\lambda })^{n-m-1} J(f,{q}). \end{aligned}$$
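The determinant identity can be checked exactly on a small example. The following Python sketch does this for one hypothetical configuration with \(n=3\) and \(m=0\), so that \(f\) is the vertex \({x}_0\) and \(f^*(T)\) is the opposite face. The coordinates, the tangent vectors and the helper `det3` are illustrative assumptions, not data from the text, and \(b({\lambda }) = 1 - \sum _j \lambda _j\) is used.

```python
from fractions import Fraction as F

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

def sub(u, v):
    return tuple(ui - vi for ui, vi in zip(u, v))

def scale(s, v):
    return tuple(s * vi for vi in v)

# Hypothetical data: f = x0, and f^*(T) lies in the plane z = 1.
x0 = (F(1, 3), F(1, 4), F(0))
q_hat = (F(0), F(0), F(1))            # assumed barycenter of f^*(T)
t1 = (F(1), F(0), F(0))               # orthonormal tangent vectors of f^*(T)
t2 = (F(0), F(1), F(0))

lam0 = F(2, 5)                        # lambda = (lambda_0)
b_lam = 1 - lam0                      # b(lambda) = 1 - sum_j lambda_j
q1, q2 = F(1, 7), F(-1, 9)            # tangential coordinates of q
q = tuple(q_hat[i] + q1 * t1[i] + q2 * t2[i] for i in range(3))

# Left: determinant of [x0 - q, b t1, b t2]; right: b^(n-m-1) J(f, q).
lhs = det3(sub(x0, q), scale(b_lam, t1), scale(b_lam, t2))
J = det3(sub(x0, q_hat), t1, t2)      # sign depends on the orientation chosen
rhs = b_lam ** 2 * J                  # exponent n - m - 1 = 2
print(lhs == rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to a tolerance.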

For each \(T \in {\fancyscript{T}}_f\), \(J(f,{q})\) is a constant, i.e., \(J(f,\cdot )\) is a piecewise constant function on \(f^*\). Therefore, for the fixed mesh \({\fancyscript{T}}\), there exist constants \(c_i= c_i({\varOmega },{\fancyscript{T}}) >0\), such that

$$\begin{aligned} c_0 \le J(f,{q}) \le c_1, \quad f \in {\varDelta }({\fancyscript{T}}),\, {q}\in f^*. \end{aligned}$$
(5.4)

The coordinates \(({\lambda },{q}) \in {\fancyscript{S}}_m^c\times f^*\) can be seen as generalized polar coordinates for the domain \({\varOmega }_f\). The change of variables

$$\begin{aligned} {x}\mapsto ({\lambda }_f({x}),{q}_f({x})) \in {\fancyscript{S}}_m^c\times f^* \end{aligned}$$

leads to the identity

$$\begin{aligned} \int _{T} \phi ({\lambda }_f({x}),{q}_f({x})) \, \hbox {d}{x}= \int _{{\fancyscript{S}}_m^c} \int _{f^*(T)} \phi ({\lambda },{q}) J(f,{q})\, \hbox {d}{q}\, b({\lambda })^{n-m-1} \, \hbox {d}{\lambda }, \end{aligned}$$
(5.5)

for any \(T \in {\fancyscript{T}}_f\), and any real-valued function \(\phi \) on \({\fancyscript{S}}_m^c \times f^*(T)\). Here \(\hbox {d}{q}\) means integration with respect to the standard Lebesgue measure derived from the embedding of the tangent space of \(f^*(T)\) into \(\mathbb {R}^{n-m-1}\). Furthermore, by summing over all \(T \in {\fancyscript{T}}_f\), we obtain

$$\begin{aligned} \int _{{\varOmega }_f} \phi ({\lambda }_f({x}),{q}_f({x})) \, \hbox {d}{x}= \int _{{\fancyscript{S}}_m^c} \int _{f^*} \phi ({\lambda },{q}) J(f,{q})\, \hbox {d}{q}\, b({\lambda })^{n-m-1} \, \hbox {d}{\lambda }, \end{aligned}$$
(5.6)

where, in the case \(m=n-1\), the integral over \(f^*\) should be interpreted as a sum over the points of \(f^*\) (two points if \(f\) is an interior simplex, a single point if \(f \subset \partial {\varOmega }\)).
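As an illustration of the change of variables (5.5), the following Python sketch evaluates both sides for a single triangle in the plane with \(n=2\), \(m=0\) and \(f={x}_0\). It assumes that \({\fancyscript{S}}_0^c\) is the interval \([0,1]\) and that \(b(\lambda ) = 1-\lambda \). The triangle, the helper `midpoint` and the two test integrands are hypothetical choices made for this example.

```python
import math

# Hypothetical triangle T = [x0, x1, x2]; f = x0 and f^*(T) = [x1, x2].
x0, x1, x2 = (0.0, 0.0), (4.0, 0.0), (0.0, 3.0)
area = 0.5 * abs((x1[0] - x0[0]) * (x2[1] - x0[1])
                 - (x1[1] - x0[1]) * (x2[0] - x0[0]))

length = math.dist(x1, x2)                        # |f^*(T)|
q_hat = ((x1[0] + x2[0]) / 2, (x1[1] + x2[1]) / 2)
t = ((x2[0] - x1[0]) / length, (x2[1] - x1[1]) / length)  # unit tangent

# J(f, q) = |det[x0 - q_hat, t]|, constant on f^*(T).
J = abs((x0[0] - q_hat[0]) * t[1] - (x0[1] - q_hat[1]) * t[0])

def midpoint(g, n=2000):
    """Composite midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))

# Right-hand side of (5.5) for phi = 1 and for phi(lam, q) = lam,
# with weight b(lam)^(n-m-1) = 1 - lam.
rhs_one = J * length * midpoint(lambda lam: 1.0 - lam)
rhs_lam = J * length * midpoint(lambda lam: lam * (1.0 - lam))

# Left-hand side: |T| for phi = 1, and |T|/3 for the barycentric
# coordinate lambda_0 integrated over T.
print(rhs_one, area, rhs_lam, area / 3)
```

For \(\phi \equiv 1\) the identity reduces to the elementary area formula for a triangle, with \(J\) equal to the height and \(|f^*(T)|\) the base.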

The function \({G}_m\) has the property that \({G}_m({\lambda }_f({x}),{q}_f({x})) = {x}\), and it satisfies the composition rule

$$\begin{aligned} {G}_m({\lambda }, {G}_m({\mu },{y})) = {G}_m({\lambda }',{y}) \quad \text {where } {\lambda }' = {\lambda }+ b({\lambda }){\mu }. \end{aligned}$$
(5.7)

In particular, the matrix associated with the transformation \({\lambda }\mapsto {\lambda }'\) is the \((m+1) \times (m+1)\) matrix \(I - {\mu }{e}^T\), where \({e}\) denotes the vector with all elements equal to \(1\). Using the formula \(\det (I + {x}{y}^T) = 1 + {y}\cdot {x}\) (which we will use on several occasions in the remainder of the paper), this matrix has determinant \(b({\mu })\). Furthermore, \(b({\lambda }') = b({\lambda })b({\mu })\). Letting \({y}= {G}_m({\mu },{q})\) and applying the identity (5.5) in the variable \({y}\), we can rewrite \(A_{f,T}v({\lambda })\) as

$$\begin{aligned} A_{f,T}v({\lambda }) = |T|^{-1} \int _{{\fancyscript{S}}_m^c} \int _{f^*(T)} v ({G}_m({\lambda },{G}_m({\mu },{q}))) J(f,{q}) \, \hbox {d}{q}\, b({\mu })^{n-m-1}\, \hbox {d}{\mu }. \end{aligned}$$
(5.8)
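The composition rule (5.7), the product formula for \(b\), and the determinant of \(I - {\mu }{e}^T\) can all be confirmed in exact arithmetic. The following Python sketch does so for hypothetical vertices and coefficients with \(m=2\) in \(\mathbb {R}^3\). The function names `b`, `G` and `det` are introduced only for this illustration, and \(b({\lambda }) = 1 - \sum _j \lambda _j\) is used.

```python
from fractions import Fraction as F

def b(lam):
    """b(lambda) = 1 - sum_j lambda_j."""
    return 1 - sum(lam)

def G(lam, q, verts):
    """G_m(lambda, q) = q + sum_j lambda_j (x_j - q), cf. (5.3)."""
    return tuple(q[i] + sum(l * (x[i] - q[i]) for l, x in zip(lam, verts))
                 for i in range(len(q)))

def det(M):
    """Determinant by Gaussian elimination with row swaps (exact)."""
    M = [row[:] for row in M]
    n, d = len(M), F(1)
    for k in range(n):
        p = next((r for r in range(k, n) if M[r][k] != 0), None)
        if p is None:
            return F(0)
        if p != k:
            M[k], M[p] = M[p], M[k]
            d = -d
        d *= M[k][k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n):
                M[r][c] -= factor * M[k][c]
    return d

# Hypothetical data: f = [x0, x1, x2] and an arbitrary point y.
verts = [(F(0), F(0), F(0)), (F(1), F(0), F(0)), (F(0), F(1), F(0))]
y = (F(2), F(3), F(5))
lam = (F(1, 5), F(1, 3), F(1, 7))
mu = (F(1, 4), F(1, 6), F(1, 3))

lam_p = tuple(l + b(lam) * m for l, m in zip(lam, mu))   # lambda'
composition_ok = G(lam, G(mu, y, verts), verts) == G(lam_p, y, verts)

# Matrix I - mu e^T and its determinant, which should equal b(mu).
A = [[(1 if i == j else 0) - mu[i] for j in range(3)] for i in range(3)]
det_A = det(A)
print(composition_ok, det_A == b(mu), b(lam_p) == b(lam) * b(mu))
```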

A key property, which is a special case of Lemma 3.3, is that the operator \(\lambda _f^* \circ A_{f,T}\) is bounded in \(L^2\). To see this, observe that we obtain from (5.4), (5.6), (5.7), and Minkowski’s inequality in the form \(\Vert \int g(\mu ) \, \hbox {d}\mu \Vert \le \int \Vert g(\mu )\Vert \, \hbox {d}\mu \), that

$$\begin{aligned} \Vert&A_{f,T}v({\lambda }_f({\cdot }))\Vert _{0,{\varOmega }_f} \\&\le c \int _{{\fancyscript{S}}_m^c} \Big (\int _{{\varOmega }_f} \int _{f^*(T)} |v({G}_m({\lambda }_f({x}),{G}_m({\mu },{q})))|^2 \, \hbox {d}{q}\, \hbox {d}{x}\Big )^{1/2} b({\mu })^{n-m-1}\, \hbox {d}{\mu }\\&\le c \int _{{\fancyscript{S}}_m^c} \Big (\int _{{\fancyscript{S}}_m^c} b({\lambda })^{n-m-1}\int _{f^*(T)} |v({G}_m({\lambda },{G}_m({\mu },{q})))|^2 \, \hbox {d}{q}\, \hbox {d}{\lambda }\Big )^{1/2} {b({\mu })}^{n-m-1}\, \hbox {d}{\mu }\\&\le c \int _{{\fancyscript{S}}_m^c} \Big (\int _{{\fancyscript{S}}_m^c} b({\lambda }')^{n-m-1} \int _{f^*(T)}|v({G}_m({\lambda }',{q}))|^2 \, \hbox {d}{q}\, \hbox {d}{\lambda }' \Big )^{1/2} {b({\mu })}^{-1 +(n-m)/2}\, \hbox {d}{\mu }, \end{aligned}$$

where we have substituted \({\lambda }' = {\lambda }+ b({\lambda }){\mu }\). However, by letting \(({\lambda }',{q}) \mapsto {x}={G}_m({\lambda }',{q})\), we obtain from (5.5) that

$$\begin{aligned} \Vert A_{f,T}v({\lambda }_f({\cdot }))\Vert _{0,{\varOmega }_f}&\le c \int _{{\fancyscript{S}}_m^c}\left( \int _{T} |v({x})|^2 \, \hbox {d}{x}\right) ^{1/2} \, b({\mu })^{-1 +(n-m)/2}\, \hbox {d}{\mu }\\&= c \Vert v \Vert _{0,T} \int _{{\fancyscript{S}}_m^c} b({\mu })^{-1 +(n-m)/2}\, \hbox {d}{\mu }\le c_1 \Vert v \Vert _{0,T}, \end{aligned}$$

where we have used (5.1) and the fact that the exponent satisfies \(-1 + (n-m)/2 \ge -1/2\). This shows that the operator \(\lambda _f^*\circ A_{f,T}\) is bounded as an operator from \(L^2(T)\) to \(L^2({\varOmega }_f)\). Furthermore, if \(T' \in {\varDelta }({\fancyscript{T}})\) is such that \(T' \subset {\varOmega }_f^e\), but \(T' \notin {\fancyscript{T}}_f\), we let \(g=f \cap T'\). Then \(g \in {\varDelta }(f)\) and \(A_{f,T}v|_{T'} = A_{g,T}v|_{T'}\) (Fig. 6).

Fig. 6

The case when \(T' \subset {\varOmega }_f^e\), but \(T' \notin {\fancyscript{T}}_f\) (enclosed in the thick lines). Here \(g=f \cap T'\)

By utilizing the argument just given with respect to \(g\) instead of \(f\), we can conclude that \(\lambda _f^* \circ A_{f,T}\) is bounded from \(L^2(T)\) to \(L^2({\varOmega }_f^e)\). In particular, on the boundary of \({\varOmega }_f^e\), \((\lambda _f^* \circ A_{f,T})v\) is constant with value

$$\begin{aligned} A_{f,T}v({0}) = |T|^{-1} \int _{T} v({y}) \, \hbox {d}{y}. \end{aligned}$$

In fact, this is also the value of \((\lambda _f^* \circ A_{f,T})v\) in \({\varOmega }\setminus {\varOmega }_f^e\), and we can therefore conclude that \(\lambda _f^* \circ A_{f,T}\) is bounded from \(L^2(T)\) to \(L^2({\varOmega })\). Since the operator \(A_f\) is a weighted sum of the operators \(A_{f,T}\), we can also conclude that \(\lambda _f^* \circ A_{f}\) is bounded from \(L^2({\varOmega }_f)\) to \(L^2({\varOmega })\).

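A completely analogous argument, essentially using that differentiation commutes with averaging, also shows that \(\lambda _f^* \circ A_{f}\) is bounded from \(H^1({\varOmega }_f)\) to \(H^1({\varOmega })\). We just observe that

$$\begin{aligned} {{\text {grad}}}\, (A_{f,T}v({\lambda }_f({x}))) = |T|^{-1} \int _{T} (D{G}_m)^T {{\text {grad}}}v({G}_m({\lambda }_f({x}),{y})) \, \hbox {d}{y}. \end{aligned}$$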

Here \(D{G}_m = D{G}_m({y})\) is the derivative of \({G}_m({\lambda }_f({x}),{y})\) with respect to \({x}\), given as the \(n\times n\) matrix

$$\begin{aligned} D{G}_m = \sum _{j=0}^m ({x}_j - {y})({{\text {grad}}}\lambda _j)^T, \end{aligned}$$

and this matrix is uniformly bounded with respect to \({y}\). We have therefore established Lemma 3.3 in the special case when \(I\) is the empty set.

Proof of Lemma 3.3

We need to show that the operators \(\lambda _f^* \circ P_I^* \circ A_f\) are bounded from \(L^2({\varOmega }_f)\) to \(L^2({\varOmega })\) and from \(H^1({\varOmega }_f)\) to \(H^1({\varOmega })\) for all \(I \in {\fancyscript{I}}_m\). As in the discussion above, it is sufficient to consider each of the operators \(\lambda _f^* \circ P_I^* \circ A_{f,T}\) for all \(T \in {\fancyscript{T}}_f\). However, the operator \(\lambda _f^* \circ P_I^* \circ A_{f,T}\) is equal to \(\lambda _g^* \circ A_{g,T}\), where \(g= f(I)= \{ {x}\in f \, |\, {P}_I{\lambda }_f({x}) = {\lambda }_f({x}) \, \}\), and as a consequence, the desired result follows from the discussion above. \(\square \)

Proof of Lemma 3.4

Since the function \(\rho _f\) is identically equal to one outside \({\varOmega }_f^e\) and the operator \(\lambda _f^* \circ A_f\) is bounded in \(L^2\), it is enough to show that

$$\begin{aligned} \int _{{\varOmega }_f^e} \rho _f^{-2}({x})|v({x}) - A_fv({\lambda }_f({x}))|^2 \, \hbox {d}{x}\le c \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_f^e}^2, \quad v \in H^1({\varOmega }). \end{aligned}$$

Furthermore, it is enough to show the corresponding result for each of the operators \(A_{f,T}\), i.e., to show that

$$\begin{aligned} \int _{{\varOmega }_f^e} \rho _f^{-2}({x})|v({x}) - A_{f,T}v({\lambda }_f({x}))|^2 \, \hbox {d}{x}\le c \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_f^e}^2, \quad v \in H^1({\varOmega }), \end{aligned}$$
(5.9)

for all \(T \in {\fancyscript{T}}_f\). In fact, it is enough to show that

$$\begin{aligned} \int _{{\varOmega }_f} \rho _f^{-2}({x})|v({x}) - A_{f,T}v({\lambda }_f({x}))|^2 \, \hbox {d}{x}\le c \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_f}^2, \quad v \in H^1({\varOmega }). \end{aligned}$$
(5.10)

To see this, assume that (5.10) has been established. If \(T' \in {\fancyscript{T}}\) is such that \(T' \subset {\varOmega }_f^e\), but \(T' \notin {\fancyscript{T}}_f\), we let \(g = f \cap T'\). On \(T'\) we then have \(\rho _f = \rho _g\), \(({\lambda }_f)_i = ({\lambda }_g)_i\) if \({x}_i \in g\), and \(({\lambda }_f)_i = 0\) otherwise. In particular, \(A_{f,T}v = A_{g,T}v\) on \(T'\). From (5.10), applied to \(g\) instead of \(f\), we then obtain

$$\begin{aligned} \int _{T'}\rho _f({x})^{-2}|v({x}) - A_{f,T}v({\lambda }_f({x}))|^2 \, \hbox {d}{x}&\le \int _{{\varOmega }_g}\rho _g({x})^{-2}|v({x}) - A_{g,T}v({\lambda }_g({x}))|^2 \, \hbox {d}{x}\\&\le c \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_g}^2. \end{aligned}$$

By combining this with (5.10), we obtain (5.9).

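The rest of the proof is devoted to establishing the bound (5.10). We start by introducing a new averaging operator \(\tilde{A}_{f,T}\) by

$$\begin{aligned} \tilde{A}_{f,T}v({\lambda }) = |T|^{-1} \int _{T} v({G}_m({\lambda },{q}_f({y}))) \, \hbox {d}{y}= |f^*(T)|^{-1} \int _{f^*(T)} v({G}_m({\lambda },{q})) \, \hbox {d}{q}, \end{aligned}$$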

where the second equality follows from (5.5) and the fact that \(J(f,q)\) is constant for \(q \in f^*(T)\). We will estimate the two terms

$$\begin{aligned}&\int _{{\varOmega }_f} \rho _f^{-2}({x})|v({x}) - \tilde{A}_{f,T}v({\lambda }_f({x}))|^2 \, \hbox {d}{x}, \ \int _{{\varOmega }_f} \rho _f^{-2}({x})|\tilde{A}_{f,T}v({\lambda }_f({x}))\\&\quad - A_{f,T}v({\lambda }_f({x}))|^2 \, \hbox {d}{x}. \end{aligned}$$

If \(m= n-1\) and \(T \in {\fancyscript{T}}_f\), then \(f^*(T)\) is just a single vertex and \(\tilde{A}_{f,T}v({\lambda }_f({x})) = v({x})\) for \({x}\in T\). Furthermore, if \(f\) is on the boundary of \({\varOmega }\), then \({\varOmega }_f = T\), so in this case, the estimate for \(v - \tilde{A}_{f,T}v({\lambda }_f(\cdot ))\) is trivial. If \(m= n-1\) and \(f\) is an interior simplex, then \({\fancyscript{T}}_f\) consists of two simplexes, say \(T\) and \(T_-\). For \(x \in T_-\), we have

$$\begin{aligned} \tilde{A}_{f,T}v({\lambda }_f({x})) - v({x}) = v({G}_m({\lambda }_f({x}), {q})) - v({x}) = v({x}+ \rho _f({x}) ({q}- {q}_-)) - v({x}), \end{aligned}$$

where \({q}\) and \({q}_-\) are the single vertices in \(f^*(T)\) and \(f^*(T_-)\), respectively. Let

$$\begin{aligned} \hat{x}= {G}_m({\lambda }_f({x}), {x}_f) = {x}+ \rho _f({x}) ({x}_f - {q}_-) \in f, \end{aligned}$$

where \({x}_f\) is the barycenter of \(f\). We will utilize a piecewise linear path from \({x}\in T_-\) to \({G}_m({\lambda }_f({x}), {q}) = \hat{x}+ \rho _f({x})({q}- {x}_f) \in T\) via the point \(\hat{x}\in f\). We then obtain

$$\begin{aligned} \rho _f({x})^{-1}(\tilde{A}_{f,T}v({\lambda }_f({x})) - v({x}))&= \int _0^1 [{{\text {grad}}}v({x}'(t)) \cdot ({q}- {x}_f) \\&+\, {{\text {grad}}}v({x}'_-(t)) \cdot ({x}_f - {q}_-)] \, \hbox {d}t, \end{aligned}$$

where the curve \({x}'(t) \equiv \hat{x}+ t \rho _f({x}) ({q}- {x}_f)\) is in \(T\), while the curve \({x}'_-(t) \equiv {x}+ t \rho _f({x}) ({x}_f - {q}_-)\) is in \(T_-\). From Minkowski’s inequality, we obtain

$$\begin{aligned}&\Big ( \int _{{\varOmega }_f}\rho _f({x})^{-2}|v({x}) - \tilde{A}_{f,T}v({\lambda }_f({x}))|^2 \, \hbox {d}{x}\Big )^{1/2}\\&\quad \le c \, \int _0^1 \Big ( \int _{T_-} |{{\text {grad}}}v({x}'(t))|^2 \, \hbox {d}{x}+ \int _{T_-} |{{\text {grad}}}v({x}'_-(t))|^2 \, \hbox {d}{x}\Big )^{1/2}\, \hbox {d}t. \end{aligned}$$

In the first integral with respect to \({x}\) above, we make the substitution \({x}\mapsto {x}'\). The matrix associated with this transformation is \(I + ({x}_f - {q}_-+ t ({q}- {x}_f) )({{\text {grad}}}\rho _f ({x}))^T\) with determinant

$$\begin{aligned}&1 + {{\text {grad}}}\rho _f ({x}) \cdot [({x}_f - {q}_-) + t ({q}- {x}_f)]\\&\quad = 1 + [\rho _f({x}_f) - \rho _f({q}_-)] + t \, {{\text {grad}}}\rho _f ({x}) \cdot ({q}- {x}_f) = t \delta , \end{aligned}$$

where \(\delta = {{\text {grad}}}\rho _f ({x}) \cdot ({q}- {x}_f)\) for \({x}\in T_-\). Since \({x}\) and \({q}\) are on the opposite sides of \(f\), \(\delta < 0\), and we obtain

$$\begin{aligned} \int _{T_-} |{{\text {grad}}}v({x}'(t))|^2 \, \hbox {d}{x}\le (|\delta | \, t)^{-1} \Vert {{\text {grad}}}v \Vert _{0,T}^2. \end{aligned}$$
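The determinant \(t\delta \) of the substitution \({x}\mapsto {x}'\) can be verified on a concrete pair of triangles. The following Python sketch uses a hypothetical planar configuration with \(n=2\) and \(m=1\). The vertices are chosen so that \(\rho _f({x}) = -x_2\) on \(T_-\). The names `det_substitution` and `dot` are introduced only for this example.

```python
from fractions import Fraction as F

# Hypothetical configuration: f = [x0, x1] is an interior edge on the x-axis.
x0, x1 = (F(0), F(0)), (F(2), F(0))
x_f = (F(1), F(0))               # barycenter of f
q_minus = (F(1), F(-1))          # vertex of T_- opposite to f
q = (F(1, 2), F(2))              # vertex of T opposite to f

# On T_-, rho_f is the barycentric coordinate of q_minus, i.e. rho_f(x) = -x[1],
# so that rho_f(x_f) = 0 and rho_f(q_minus) = 1.
grad_rho = (F(0), F(-1))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def det_substitution(t):
    """Determinant of I + w grad_rho^T with w = (x_f - q_-) + t (q - x_f)."""
    w = tuple((x_f[i] - q_minus[i]) + t * (q[i] - x_f[i]) for i in range(2))
    a11 = 1 + w[0] * grad_rho[0]
    a12 = w[0] * grad_rho[1]
    a21 = w[1] * grad_rho[0]
    a22 = 1 + w[1] * grad_rho[1]
    return a11 * a22 - a12 * a21

delta = dot(grad_rho, tuple(q[i] - x_f[i] for i in range(2)))
checks = [det_substitution(t) == t * delta
          for t in (F(0), F(1, 3), F(1, 2), F(1))]
print(delta, checks)
```

In this configuration \(\delta \) is negative, in agreement with the observation that \({x}\) and \({q}\) lie on opposite sides of \(f\).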

We use a similar approach for the second \({x}\) integral above, where we use the substitution \({x}\mapsto {x}'_-\). The associated matrix is \(I + t ({x}_f - {q}_-)({{\text {grad}}}\rho _f ({x}))^T\) with determinant

$$\begin{aligned} 1 + t \, {{\text {grad}}}\rho _f ({x}) \cdot ({x}_f - {q}_-) = 1 + t [\rho _f({x}_f) - \rho _f({q}_-)] = 1-t. \end{aligned}$$

Arguing as above we obtain

$$\begin{aligned} \int _{T_-} |{{\text {grad}}}v({x}'_-(t))|^2 \, \hbox {d}{x}\le (1 - t)^{-1} \Vert {{\text {grad}}}v \Vert _{0,T_-}^2. \end{aligned}$$

Using these facts, we then obtain

$$\begin{aligned}&\Big ( \int _{{\varOmega }_f} \rho _f({x})^{-2}|v({x}) - \tilde{A}_{f,T}v ({\lambda }_f({x}))|^2 \, \hbox {d}{x}\Big )^{1/2}\nonumber \\&\quad \le c \int _0^1 [ (|\delta | \, t)^{-1} + (1 - t)^{-1} ] ^{1/2}\, \hbox {d}t \, \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_f} \nonumber \\&\quad \le c_1 \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_f}. \end{aligned}$$
(5.11)
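The last step in (5.11) uses that the \(t\)-integral is finite although the integrand is singular at both endpoints. Since \(\sqrt{a+b} \le \sqrt{a} + \sqrt{b}\), the integral is bounded by \(2|\delta |^{-1/2} + 2\). The following Python sketch compares a midpoint-rule approximation with this bound for a hypothetical value of \(\delta \). The helper `midpoint` and the chosen \(\delta \) are assumptions made for the illustration.

```python
import math

def midpoint(g, n=20000):
    """Composite midpoint rule on (0, 1); the nodes avoid both endpoints."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))

delta = -0.5   # hypothetical value; only |delta| enters the integrand

def integrand(t):
    return math.sqrt(1.0 / (abs(delta) * t) + 1.0 / (1.0 - t))

approx = midpoint(integrand)

# Closed-form bound: int_0^1 (|delta| t)^(-1/2) dt + int_0^1 (1 - t)^(-1/2) dt.
bound = 2.0 / math.sqrt(abs(delta)) + 2.0
print(approx, bound)
```

The midpoint rule converges slowly for such endpoint singularities, so the printed approximation is only indicative. The closed-form bound is what the argument relies on.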

This is the desired bound for \(v - \tilde{A}_{f,T}v({\lambda }_f(\cdot ))\) when \(\dim f = m=n-1\).

If \(m < n-1\), we will utilize the fact that then \(f^*\) is connected. As observed above, this is easily seen if \(f\) is an internal simplex, while the case of boundary simplexes is treated in Lemma 5.1. From (5.4) and (5.6), we obtain

$$\begin{aligned}&\int _{{\varOmega }_f}\rho _f({x})^{-2}|v({x}) - \tilde{A}_{f,T}v({\lambda }_f({x}))|^2 \, \hbox {d}{x}\nonumber \\&\quad \le c \int _{{\fancyscript{S}}_m^c}b({\lambda })^{n-m-3}\int _{f^*} |v({G}_m({\lambda },{q})) - \tilde{A}_{f,T}v({\lambda })|^2 \, \hbox {d}{q}\, \hbox {d}{\lambda }. \end{aligned}$$
(5.12)

However, the interior integral above admits the estimate

$$\begin{aligned} \int _{f^*} |v({G}_m({\lambda },{q})) - \tilde{A}_{f,T}v({\lambda })|^2 \, \hbox {d}{q}\le c b({\lambda })^2 \Vert {{\text {grad}}}v({G}_m({\lambda },{\cdot })) \Vert _{0,f^*}^2. \end{aligned}$$
(5.13)

To see this, observe that

$$\begin{aligned} \tilde{A}_{f,T}v({0}) = |f^*(T)|^{-1} \int _{f^*(T)} v({q}) \, \hbox {d}{q}\end{aligned}$$

only depends on the restriction of \(v\) to \(f^*\), is bounded in \(L^2(f^*(T))\), and reproduces constants on \(f^*\). By the connectivity of \(f^*\) and Poincaré’s inequality, we can therefore conclude that

$$\begin{aligned} \int _{f^*} |v({q}) - \tilde{A}_{f,T}v({0})|^2 \, \hbox {d}{q}\le c \Vert {{\text {grad}}}v \Vert _{0,f^*}^2, \end{aligned}$$
(5.14)

for all functions \(v \in H^1(f^*)\). The estimate (5.13) now follows by a scaling argument. For a fixed \({\lambda }\in {\fancyscript{S}}_m^c\), introduce the function \(\hat{v}\) defined on \(f^*\) by

$$\begin{aligned} \hat{v}({q}) = v({G}_m({\lambda },{q})) \quad \text {with } {{\text {grad}}}\hat{v}({q}) = b({\lambda }){{\text {grad}}}v({G}_m({\lambda },{q})). \end{aligned}$$

Then \(\tilde{A}_{f,T}\hat{v}({0}) = \tilde{A}_{f,T}v({\lambda })\), and therefore, the estimate (5.13) follows directly from (5.14) applied to \(\hat{v}\). Furthermore, by (5.12), (5.13), and (5.1), we obtain

$$\begin{aligned}&\int _{{\varOmega }_f}\rho _f({x})^{-2}|v({x}) - \tilde{A}_{f,T}v({\lambda }_f({x}))|^2 \, \hbox {d}{x}\nonumber \\&\quad \le c \int _{{\fancyscript{S}}_m^c}b({\lambda })^{n-m-1}\int _{f^*}|{{\text {grad}}}v({G}_m({\lambda },{q}))|^2 \, \hbox {d}{q}\, \hbox {d}{\lambda }\nonumber \\&\quad \le c_1 \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_f}^2, \end{aligned}$$
(5.15)

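for all \(v \in H^1({\varOmega }_f)\). Together with the estimate (5.11), we have therefore established the desired estimate for \(v - \tilde{A}_{f,T}v({\lambda }_f(\cdot ))\) for all \(f \in {\varDelta }({\fancyscript{T}})\) with \(\dim f \le n-1\). To complete the proof, we need a corresponding estimate for \(\tilde{A}_{f,T}v({\lambda }_f({\cdot })) - A_{f,T}v({\lambda }_f({\cdot }))\). For any \({\lambda }\in {\fancyscript{S}}_m^c\), we have

$$\begin{aligned} A_{f,T}v({\lambda }) - \tilde{A}_{f,T}v({\lambda })&= |T|^{-1} \int _{T} [v({G}_m({\lambda },{y})) - v({G}_m({\lambda },{q}_f({y})))] \, \hbox {d}{y}\\&= \frac{b({\lambda })}{|T|} \int _0^1 \int _{T} {{\text {grad}}}v({G}_m({\lambda },(1-t){q}_f({y})+ t {y})) \cdot ({y}- {q}_f({y})) \, \hbox {d}{y}\, \hbox {d}t. \end{aligned}$$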

However, writing

$$\begin{aligned} {y}= \sum _{j=0}^m \lambda _j({y}) {x}_j + \rho _f({y}) {q}_f({y}), \end{aligned}$$

it is easy to check that

$$\begin{aligned} {G}_m({\lambda },(1-t){q}_f({y})+ t {y}) = {G}_m({\lambda }', {q}_f({y})), \end{aligned}$$

where \({\lambda }' = {\lambda }'({\lambda }, t, {\lambda }_f({y}))\) and

$$\begin{aligned} {\lambda }'({\lambda },t,{\mu }) = {\lambda }+ t b({\lambda }){\mu }, \quad {\lambda }, {\mu }\in {\fancyscript{S}}_m^c, \, t \in \mathbb {R}. \end{aligned}$$

Therefore, we can use (5.5) to rewrite the representation of \(A_{f,T}v({\lambda }) - \tilde{A}_{f,T}v({\lambda })\) in the form

$$\begin{aligned}&A_{f,T}v({\lambda }) - \tilde{A}_{f,T}v({\lambda }) = \frac{b({\lambda })}{|T|}\\&\quad \cdot \int _0^1 \int _{{\fancyscript{S}}_m^c} b({\mu })^{n-m-1}\int _{f^*(T)} {{\text {grad}}}v({G}_m({\lambda }'({\lambda }, t, {\mu }), {q})) \cdot ({y}- {q}) J(f,{q})\, \hbox {d}{q}\, \hbox {d}{\mu }\, \hbox {d}t, \end{aligned}$$

where \({y}= {G}_m({\mu },{q})\). Hence, it follows by Minkowski’s inequality and (5.6) that

$$\begin{aligned}&\Big (\int _{{\varOmega }_f} \rho _f^{-2}({x})[\tilde{A}_{f,T}v({\lambda }_f({x}))- A_{f,T}v({\lambda }_f({x}))]^2 \, \hbox {d}{x}\Big )^{1/2}\\&\quad \le c \int _0^1\int _{{\fancyscript{S}}_m^c} {b({\mu })}^{n-m-1} \Big ( \int _{{\varOmega }_f} \int _{f^*} |{{\text {grad}}}v({G}_m({\lambda }' ({\lambda }_f({x}),t, {\mu }),{q}))|^2 \hbox {d}{q}\, \hbox {d}{x}\Big )^{1/2} \hbox {d}{\mu }\, \hbox {d}t \\&\quad \le c \int _0^1 \int _{{\fancyscript{S}}_m^c} {b({\mu })}^{n-m-1} \Big ( \int _{{\fancyscript{S}}_m^c} {b({\lambda })}^{n-m-1} \int _{f^*} |{{\text {grad}}}v({G}_m({\lambda }',{q}))|^2 \hbox {d}{q}\, \hbox {d}{\lambda }\Big )^{1/2} \hbox {d}{\mu }\, \hbox {d}t, \end{aligned}$$

where \({\lambda }' = {\lambda }'({\lambda },t, {\mu })\). To proceed, we make the substitution \({\lambda }\mapsto {\lambda }'\). The matrix associated with this transformation is \(I - t {\mu }{e}^T\), with determinant \(b(t {\mu })\). Here, as above, \({e}\) is the vector with all components equal to one. Furthermore, \(b({\lambda }') = b({\lambda })b(t{\mu })\). Since \(b(t{\mu }) \ge b({\mu })\), it follows, again using (5.1) and (5.6), that

$$\begin{aligned}&\Big (\int _{{\varOmega }_f} \rho _f^{-2}({x})[\tilde{A}_{f,T}v({\lambda }_f({x}))- A_{f,T}v({\lambda }_f({x}))]^2 \, \hbox {d}{x}\Big )^{1/2}\\&\quad \le c \int _0^1\int _{{\fancyscript{S}}_m^c} \frac{{b({\mu })}^{n-m-1}}{{b(t{\mu })}^{(n-m)/2}} \Big ( \int _{{\fancyscript{S}}_m^c} {b({\lambda }')}^{n-m-1}\int _{f^*} |{{\text {grad}}}v({G}_m({\lambda }',{q}))|^2 \hbox {d}{q}\, \hbox {d}{\lambda }'\Big )^{1/2} \hbox {d}{\mu }\, \hbox {d}t\\&\quad \le c \int _{{\fancyscript{S}}_m^c}{b({\mu })}^{-1 + (n-m)/2}\Big ( \int _{{\fancyscript{S}}_m^c} {b({\lambda }')}^{n-m-1}\int _{f^*} |{{\text {grad}}}v({G}_m({\lambda }',{q}))|^2 \hbox {d}{q}\, \hbox {d}{\lambda }'\Big )^{1/2} \hbox {d}{\mu }\\&\quad \le c \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_f}\int _{{\fancyscript{S}}_m^c}{b({\mu })}^{-1 +(n-m)/2} \, \hbox {d}{\mu }\le c \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_f}. \end{aligned}$$

Together with (5.11) and (5.15), this completes the proof of (5.10), and hence, the lemma is established. \(\square \)

Proof of Lemma 3.5

For \(f \in {\varDelta }_m({\fancyscript{T}})\) and \(I \in {\fancyscript{I}}_m\), with \(m<n\), we have to show

$$\begin{aligned} \int _{{\varOmega }}\rho _g^{-2}({x}) |A_fv({P}_I{\lambda }_f({x}))|^2 \, \hbox {d}{x}\le c\, [\int _{{\varOmega }}\rho _g^{-2}({x}) |v({x})|^2 \, \hbox {d}{x}+ \Vert {{\text {grad}}}v \Vert _{0}^2], \end{aligned}$$

where \(g = f(I) \in {\varDelta }(f)\). We observe that

$$\begin{aligned} A_fv({P}_I{\lambda }_f) = \sum _{T \in {\fancyscript{T}}_f} \frac{|T|}{|{\varOmega }_f|}A_{g,T}v({\lambda }_g). \end{aligned}$$

However, by (5.9), we have

$$\begin{aligned} \int _{{\varOmega }} \rho _g^{-2}({x})|v({x}) - A_{g,T} v({\lambda }_g({x}))|^2\, \hbox {d}{x}\le c\, \Vert v \Vert _1^2, \end{aligned}$$

and by the triangle inequality this implies that

$$\begin{aligned} \int _{{\varOmega }}\rho _g^{-2}({x}) |A_{g,T}v({\lambda }_g({x}))|^2 \, \hbox {d}{x}\le c\, \left[ \int _{{\varOmega }}\rho _g^{-2}({x}) |v({x})|^2 \, \hbox {d}{x}+ \Vert {{\text {grad}}}v \Vert _{0}^2\right] . \end{aligned}$$

The desired result follows by summing over \(T \in {\fancyscript{T}}_f\). \(\square \)

Proof of Lemma 3.6

Let \(m<n\), \(f= [{x}_0,{x}_1, \ldots , {x}_m] \in {\varDelta }_m({\fancyscript{T}})\), \(I \in {\fancyscript{I}}_m\) with \(0 \notin I\) and \(I' = (0,I)\). We must show that

$$\begin{aligned} \int _{{\varOmega }_{x_0}} {\lambda }_0^{-2}({x})[A_fv({P}_I{\lambda }_f({x})) - A_fv(P_{I'}{\lambda }_f({x}))]^2 \, \hbox {d}{x}\le c \Vert {{\text {grad}}}v \Vert _{0,{\varOmega }_f}^2, \quad v \in H^1({\varOmega }_f). \end{aligned}$$

We recall that for any \(T \in {\fancyscript{T}}_f\), we have \(A_{f,T}v({P}_I{\lambda }_f({\cdot })) = A_{g,T}v({\lambda }_g({\cdot }))\), where \(g = f(I) \in {\varDelta }(f)\). Similarly, \(A_{f,T}v({P}_{I'}{\lambda }_f({\cdot })) = A_{g,T}v({P}{\lambda }_g({\cdot }))\), where \(({P}{\lambda }_g)_0 = 0\), and \(({P}{\lambda }_g)_i = ({\lambda }_g)_i\) for \(i \ne 0\). The desired estimate will follow if we can show

$$\begin{aligned} \int _{{\varOmega }_{x_0}} \lambda _0^{-2}({x})[A_{g,T}v({\lambda }_g({x})) - A_{g,T}v({P}{\lambda }_{g}({x}))]^2 \, \hbox {d}{x}\le c \Vert {{\text {grad}}}v \Vert _{0,T}^2, \end{aligned}$$
(5.16)

for all \(v \in H^1(T)\), \(T \in {\fancyscript{T}}_f\). In fact, it is enough to show that

$$\begin{aligned} \int _{{\varOmega }_{g}} \lambda _0^{-2}({x})[A_{g,T}v({\lambda }_g({x})) - A_{g,T}v({P}{\lambda }_{g}({x}))]^2 \, \hbox {d}{x}\le c \Vert {{\text {grad}}}v \Vert _{0,T}^2. \end{aligned}$$
(5.17)

To see this, assume that \(\hat{T} \in {\fancyscript{T}}_{x_0}\) such that \(\hat{T} \notin {\fancyscript{T}}_g\). Let \(\hat{g} = g \cap \hat{T}\). Then \(\hat{T} \in {\fancyscript{T}}_{\hat{g}}\), and \(({\lambda }_{\hat{g}})_i = ({\lambda }_g)_i\) for all the components of \({\lambda }_g\) which are not identically zero on \(\hat{T}\). Therefore, (5.17), applied to \(\hat{g}\) instead of \(g\), will imply that

$$\begin{aligned} \int _{\hat{T}} \lambda _0^{-2}({x})[A_{g,T}v({\lambda }_g({x})) - A_{g,T}v({P}{\lambda }_{g}({x}))]^2 \, \hbox {d}{x}\le c \Vert {{\text {grad}}}v \Vert _{0,T}^2. \end{aligned}$$

By carrying out this process for all possible \(\hat{T} \in {\fancyscript{T}}_{x_0} \setminus {\fancyscript{T}}_g\) and combining it with (5.17), we obtain (5.16).

The rest of the proof is devoted to establishing (5.17). Without loss of generality, we can assume that \(g = [{x}_0, {x}_1, \ldots ,{x}_j]\) such that

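We have

$$\begin{aligned} A_{g,T}v({P}{\lambda }_{g}) - A_{g,T}v({\lambda }_g) = \frac{\lambda _0}{|T|} \int _{T} \int _0^1 {{\text {grad}}}v({G}_j({\lambda },{y}) + t\lambda _0({y}-{x}_0)) \cdot ({y}- {x}_0) \, \hbox {d}t \, \hbox {d}{y}, \end{aligned}$$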

where \({\lambda }= {\lambda }_g \in {\fancyscript{S}}_j^c\). If we express \({y}\) as \({y}= {G}_j({\mu },{q})\), where \({\mu }= {\lambda }_g({y})\) and \({q}= {q}_g({y})\), we further obtain that

$$\begin{aligned} {G}_j({\lambda },{y}) + t\lambda _0({y}-{x}_0)&= \sum _{i=0}^j\lambda _i{x}_i + [t\lambda _0 + b({\lambda })]{y}- t\lambda _0{x}_0\\&= \sum _{i=0}^j\lambda _i{x}_i + [t\lambda _0 +b({\lambda })] \Big [\sum _{i=0}^j{\mu }_i{x}_i + b({\mu }){q}\Big ] - t\lambda _0{x}_0\\&= \sum _{i=0}^j\lambda '_i{x}_i + b({\lambda }'){q}= {G}_j({\lambda }',{q}), \end{aligned}$$

where \({\lambda }' = {\lambda }'({\lambda },t,{\mu })\) is given by

$$\begin{aligned} \lambda '_0 = (1-t)\lambda _0 + [t\lambda _0 +b({\lambda })]\mu _0 \end{aligned}$$

and where

$$\begin{aligned} \lambda '_i = \lambda _i + [t\lambda _0 + b({\lambda })]\mu _i, \quad i>0. \end{aligned}$$

Using the identity (5.5), we therefore have

$$\begin{aligned}&A_{g,T}v({P}{\lambda }_{g}) - A_{g,T}v({\lambda }_g)\\&\quad = \frac{\lambda _0}{|T|} \int _{{\fancyscript{S}}_j^c}b({\mu })^{n-j-1} \int _0^1 \int _{g^*(T)}{{\text {grad}}}v({G}_j({\lambda }',{q})) \cdot ({G}_j({\mu },{q})- {x}_0) \, \hbox {d}{q}\, \hbox {d}t \, \hbox {d}{\mu }, \end{aligned}$$

where \({\lambda }' = {\lambda }'({\lambda },t,{\mu })\) and \({\lambda }= {\lambda }_g\). We note that

$$\begin{aligned} b({\lambda }') = b({\lambda })b({\mu }) + t\lambda _0 b({\mu }) \ge b({\lambda }) b({\mu }). \end{aligned}$$
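The formulas for \({\lambda }'\) and \(b({\lambda }')\) can be checked in exact arithmetic. The following Python sketch verifies, for hypothetical data with \(j=1\) in \(\mathbb {R}^3\), that \({G}_j({\lambda },{y}) + t\lambda _0({y}-{x}_0) = {G}_j({\lambda }',{q})\) when \({y}= {G}_j({\mu },{q})\), and that \(b({\lambda }') = b({\lambda })b({\mu }) + t\lambda _0 b({\mu })\). The function names `b` and `G` and all numerical values are assumptions made for this example.

```python
from fractions import Fraction as F

def b(lam):
    """b(lambda) = 1 - sum_i lambda_i."""
    return 1 - sum(lam)

def G(lam, q, verts):
    """G_j(lambda, q) = q + sum_i lambda_i (x_i - q)."""
    return tuple(q[k] + sum(l * (x[k] - q[k]) for l, x in zip(lam, verts))
                 for k in range(len(q)))

# Hypothetical data: g = [x0, x1] and a point q playing the role of q_g(y).
verts = [(F(0), F(0), F(0)), (F(1), F(0), F(0))]
x0 = verts[0]
q = (F(0), F(1), F(2))
lam = (F(1, 5), F(1, 3))
mu = (F(1, 4), F(1, 6))
t = F(1, 2)

y = G(mu, q, verts)

# Left-hand side: G_j(lambda, y) + t lambda_0 (y - x0).
lhs = tuple(gk + t * lam[0] * (yk - x0k)
            for gk, yk, x0k in zip(G(lam, y, verts), y, x0))

# lambda' as given in the text, with s = t lambda_0 + b(lambda).
s = t * lam[0] + b(lam)
lam_p = ((1 - t) * lam[0] + s * mu[0],) + tuple(
    l + s * m for l, m in zip(lam[1:], mu[1:]))
rhs = G(lam_p, q, verts)

print(lhs == rhs, b(lam_p) == b(lam) * b(mu) + t * lam[0] * b(mu))
```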

Using this, we have from Minkowski’s inequality and (5.5) that

$$\begin{aligned}&\Big (\int _{{\varOmega }_g} \lambda _0^{-2}({x})|A_{g,T}v({P}{\lambda }_{g}({x})) - A_{g,T}v ({\lambda }_g({x}))|^2 \, \hbox {d}{x}\Big )^{1/2} \\&\quad \le c \int _{{\fancyscript{S}}_j^c}b({\mu })^{n-j-1} \int _0^1 \Big (\int _{{\varOmega }_g}\int _{g^*(T)}|{{\text {grad}}}v({G}_j({\lambda }'({x}),{q}))|^2 \, \hbox {d}{q}\, \hbox {d}{x}\Big )^{1/2}\hbox {d}t \, \hbox {d}{\mu }\\&\quad \le c \int _{{\fancyscript{S}}_j^c}{b({\mu })}^{(n-j-1)/2} \int _0^1 \Big (\int _{{\fancyscript{S}}_j^c}{b({\lambda }')}^{n-j-1} \int _{g^*(T)} |{{\text {grad}}}v({G}_j({\lambda }',{q}))|^2 \, \hbox {d}{q}\, \hbox {d}{\lambda }\Big )^{1/2}\hbox {d}t \, \hbox {d}{\mu }, \end{aligned}$$

where \({\lambda }' = {\lambda }'({\lambda },t,{\mu })\) is given above, and \({\lambda }'({x})= {\lambda }'({\lambda }_g({x}),t,{\mu })\). To complete the argument, we make the substitution \({\lambda }\mapsto {\lambda }'\). The matrix associated with this transformation is given by

$$\begin{aligned} I - {\mu }{e}^T + t({\mu }- {e}_0){e}_0^T = (I - {\mu }{e}^T)(I - t{e}_0{e}_0^T), \end{aligned}$$

with determinant \((1-t)b({\mu })\), where \({e}_0\) denotes the vector with first component \(1\) and all other components equal to \(0\). Therefore, we obtain

$$\begin{aligned}&\Big (\int _{{\varOmega }_g} \lambda _0^{-2}({x})|A_{g,T}v({P}{\lambda }_{g}({x})) - A_{g,T}v({\lambda }_g({x}))|^2 \, \hbox {d}{x}\Big )^{1/2} \\&\quad \le c \int _{{\fancyscript{S}}_j^c} \int _0^1 \frac{{b({\mu })}^{-1+ (n-j)/2}}{(1-t)^{1/2}} \Big (\int _{{\fancyscript{S}}_j^c} {b({\lambda }')}^{n-j-1} \int _{g^*(T)} |{{\text {grad}}}v({G}_j({\lambda }',{q}))|^2 \, \hbox {d}{q}\, \hbox {d}{\lambda }'\Big )^{1/2}\hbox {d}t \, \hbox {d}{\mu }\\&\quad \le c \Big (\int _T | {{\text {grad}}}v({x})|^2 \, \hbox {d}{x}\Big )^{1/2}, \end{aligned}$$

where (5.1) and (5.5) have been used for the final inequality. This completes the proof of (5.17) and hence of the lemma. \(\square \)
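The factorization of the matrix associated with the substitution \({\lambda }\mapsto {\lambda }'\) used in this proof, and its determinant \((1-t)b({\mu })\), can be confirmed directly. The following Python sketch does this for \(j=1\), that is, for \(2\times 2\) matrices, with hypothetical values of \({\mu }\) and \(t\). The helpers `matmul` and `det2` are introduced only for this illustration.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

mu = (F(1, 4), F(1, 6))      # hypothetical mu
t = F(1, 3)                  # hypothetical t
e0 = (1, 0)                  # first unit vector; e = (1, 1)
n = 2

def kron(i, j):
    return 1 if i == j else 0

# Left-hand side: I - mu e^T + t (mu - e0) e0^T.
lhs = [[kron(i, j) - mu[i] + t * (mu[i] - e0[i]) * e0[j] for j in range(n)]
       for i in range(n)]

# Right-hand side: (I - mu e^T)(I - t e0 e0^T).
first = [[kron(i, j) - mu[i] for j in range(n)] for i in range(n)]
second = [[kron(i, j) - t * e0[i] * e0[j] for j in range(n)] for i in range(n)]
rhs = matmul(first, second)

b_mu = 1 - sum(mu)
print(lhs == rhs, det2(lhs) == (1 - t) * b_mu)
```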