1 Introduction

Given a map \(u:M^n\rightarrow N^k\) between smooth Riemannian manifolds of dimensions n and k, there is a natural concept of energy associated to u. The minimizers, or more generally the critical points, of such an energy functional are called harmonic maps. If \(n=2\), the regularity of energy minimizing harmonic maps was established by Morrey [42]. For energy minimizing harmonic maps defined on a higher dimensional Riemannian manifold, a well-known regularity theory has been developed by Schoen and Uhlenbeck [51]. In particular, in the case where the target space \(N^k\) has non-positive sectional curvature, it has been proved that any energy minimizing harmonic map is smooth (see also [20]). However, without any restriction on the target space \(N^k\), an energy minimizing map might not even be continuous.

1.1 Harmonic maps between singular spaces and Hölder continuity

Gromov and Schoen [17] initiated the study of harmonic maps into singular spaces, motivated by the p-adic superrigidity for lattices in groups of rank one. Consider a map \(u:M\rightarrow Y\). If Y is not a smooth manifold, the energy of u cannot be defined via its differential. A natural idea is to define the energy as a limit of suitable difference quotients. The following concept of approximating energy for maps between metric spaces was introduced by Korevaar and Schoen [33].

Let \((M,d_M)\), \((Y,d_Y)\) be two metric spaces and let \(\Omega \) be a domain of M, equipped with a Radon measure \(\mathrm{vol}\) on M. Given \(p\geqslant 1\), \(\epsilon >0\) and a Borel measurable map \(u:\Omega \rightarrow Y\), an approximating energy functional \(E^u_{p,\epsilon }\) is defined on \(C_0(\Omega )\), the set of continuous functions compactly supported in \(\Omega \), as follows:

$$\begin{aligned} E^u_{p,\epsilon }(\phi ):= c(n,p)\int _\Omega \phi (x)\int _{B_x(\epsilon )\cap \Omega }\frac{d^p_Y(u(x),u(y))}{\epsilon ^{n+p}}d{\mathrm{vol}}(y)d{\mathrm{vol}}(x) \end{aligned}$$

where \(\phi \in C_0(\Omega )\) and \(c(n,p)\) is a normalizing constant.
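As a rough numerical illustration of this definition (not part of the original argument), the following sketch evaluates \(E^u_{2,\epsilon }(\phi )\) by a midpoint rule for a real-valued function on the interval \(\Omega =(0,1)\subset {\mathbb {R}}\), so that \(n=1\), \(p=2\) and \(Y={\mathbb {R}}\). The names `approx_energy` and `tent`, the grid size, and the value \(c(1,2)=3/2\) are assumptions made only for this illustration; the constant is chosen so that a linear function of slope a has energy density \(a^2\).

```python
def approx_energy(u, phi, eps, m=1000, c=1.5):
    """Midpoint-rule sketch of E^u_{2,eps}(phi) on Omega = (0, 1), n = 1, p = 2.

    c = 1.5 is an assumed normalization (not taken from the paper): it makes
    the energy density of u(x) = a*x equal to a**2.
    """
    h = 1.0 / m
    xs = [(i + 0.5) * h for i in range(m)]      # midpoints of the grid cells
    us = [u(x) for x in xs]
    total = 0.0
    for i, x in enumerate(xs):
        w = phi(x)
        if w == 0.0:
            continue                             # phi vanishes at this point
        inner = 0.0
        for j, y in enumerate(xs):
            if abs(x - y) < eps:                 # y lies in B_x(eps) inside Omega
                inner += (us[i] - us[j]) ** 2
        total += w * inner * h / eps ** 3        # eps**(n + p) with n = 1, p = 2
    return c * total * h


def tent(x):
    """Continuous test function supported in [0.25, 0.75], with integral 0.25."""
    return max(0.0, 1.0 - abs(x - 0.5) / 0.25)


# u(x) = 2x has energy density 4, so the value should be close to 4 * 0.25 = 1.
energy = approx_energy(lambda x: 2.0 * x, tent, eps=0.0505)
```

The radius 0.0505 is deliberately not a multiple of the grid spacing, so that no grid point sits exactly on the boundary of a ball.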

In the case where \(\Omega \) is a domain of a smooth Riemannian manifold and Y is an arbitrary metric space, Korevaar and Schoen [33] proved that \(E^u_{p,\epsilon }(\phi )\) converges weakly, as a linear functional on \(C_0(\Omega )\), to some (energy) functional \(E^u_p(\phi )\). The same convergence has been established for the case where \(\Omega \) is replaced with one of the following:

  • a domain of a Lipschitz manifold (by Gregori [16]);

  • a domain of a Riemannian polyhedron (for \(p=2\), by Eells and Fuglede [11]);

  • a domain of a singular space satisfying certain conditions, including Alexandrov spaces with curvature bounded from below, abbreviated as CBB (by Kuwae and Shioya [37]).

When \(p=2\), minimizing maps, in the sense of calculus of variations, of such an energy functional \(E^u_2(\phi )\) are called harmonic maps.

Sturm [55] studied a generalization of the theory of harmonic maps between singular spaces via an approach of probabilistic theory.

The purpose of this paper is to study the regularity theory of harmonic maps from a domain of an Alexandrov space with CBB into a complete length space of non-positive curvature in the sense of Alexandrov, abbreviated as NPC. This problem was initiated by Lin [39] and Jost [26, 27, 28], independently. They established the following Hölder regularity.

Theorem 1.1

(Lin [39], Jost [27]) Let \(\Omega \) be a bounded domain in an Alexandrov space with CBB, and let \((Y,d_Y)\) be an NPC space. Then any harmonic map \(u:\Omega \rightarrow Y\) is locally Hölder continuous in \(\Omega \).

The Hölder regularity of harmonic maps between singular spaces or into singular spaces has also been studied by many other authors, for example, Chen [7], Eells and Fuglede [11, 13, 14], Ishizuka and Wang [22], and Daskalopoulos and Mese [8, 10].

1.2 Lipschitz continuity and main result

Lin [39] posed the open problem of whether the Hölder continuity in Theorem 1.1 above can be improved to Lipschitz continuity. Precisely,

Conjecture 1.2

(Lin [39]) Let \(\Omega , Y\) and u be as in Theorem 1.1. Is u locally Lipschitz continuous in \(\Omega \)?

Jost also posed a similar problem about the Lipschitz regularity of harmonic maps between singular spaces (see page 38 in [28]). The Lipschitz continuity of harmonic maps is the key to establishing rigidity theorems of geometric group theory in [8, 9, 17].

Up to now, answers are known only in a few special cases.

The first is the case where the target space \(Y={\mathbb {R}}\), i.e., the theory of harmonic functions. The Lipschitz regularity of harmonic functions on singular spaces has been obtained under one of the following two assumptions: (i) \(\Omega \) is a domain of a metric space, which supports a doubling measure, a Poincaré inequality and a certain heat kernel condition [23, 34]; (ii) \(\Omega \) is a domain of an Alexandrov space with CBB [49, 50, 58]. Nevertheless, these proofs depend heavily on the linearity of the Laplacian on such spaces.

It is known from [6] that Hölder continuity always holds for any harmonic function on a metric measure space \((M,d,\mu )\) under a standard assumption: the measure \(\mu \) is doubling and M supports a Poincaré inequality. However, in [34], a counterexample was given to show that such a standard assumption is not sufficient to guarantee the Lipschitz continuity of harmonic functions.

The second is the case where \(\Omega \) is a domain of some smooth Riemannian manifold and Y is an NPC space. Korevaar and Schoen [33] established the following Lipschitz regularity for any harmonic map from \(\Omega \) to Y.

Theorem 1.3

(Korevaar–Schoen [33]) Let \(\Omega \) be a bounded domain of a smooth Riemannian manifold M, and let \((Y,d_Y)\) be an NPC metric space. Then any harmonic map \(u:\Omega \rightarrow Y\) is locally Lipschitz continuous in \(\Omega \).

However, their Lipschitz constant in the above theorem depends on the \(C^1\)-norm of the metric \((g_{ij})\) of the smooth manifold M. In Section 6 of [26], Jost described a new argument for the above Lipschitz regularity of Korevaar–Schoen using intersection properties of balls. The Lipschitz constant given by Jost depends on the upper and lower bounds of the Ricci curvature on M. Neither argument seems to suggest a Lipschitz regularity of harmonic maps from a singular space.

The major obstacle to proving Lipschitz continuity of harmonic maps from a singular space can be understood as follows. For convenience of discussion, we consider a harmonic map \(u: (\Omega ,g)\rightarrow N\) from a domain \(\Omega \subset {\mathbb {R}}^n\) with a singular Riemannian metric \( g=(g_{ij}) \) into a smooth non-positively curved manifold N, which by the Nash embedding theorem is isometrically embedded in some Euclidean space \({\mathbb {R}}^K\). Then u is a solution of the nonlinear elliptic system of divergence form

$$\begin{aligned} \frac{1}{\sqrt{g}}\partial _i\left( \sqrt{g}g^{ij}\partial _j u_\alpha \right) +g^{ij}A^\alpha \big (\partial _iu,\partial _ju\big )=0,\quad \alpha =1,\ldots , K \end{aligned}$$
(1.1)

in the sense of distributions, where \(g=\det (g_{ij})\), \((g^{ij})\) is the inverse matrix of \((g_{ij})\), and \( A^\alpha \) is the second fundamental form of N. It is well-known that, for a second order elliptic system, the regularity of solutions is determined by the regularity of its coefficients. If the coefficients \(\sqrt{g}g^{ij}\) are merely bounded measurable, Shi [54] proved that the solution u is Hölder continuous. However, a harmonic map might fail to be Lipschitz continuous, even under the assumption that the coefficients are continuous; see [25] for a counterexample.

Lin’s conjecture above concerns the Lipschitz continuity of harmonic maps between Alexandrov spaces. Consider M to be an Alexandrov space with CBB and let \(p\in M\) be a regular point. According to [43, 45], there is a coordinate neighborhood \(U\ni p\) and a corresponding \(BV_{\mathrm{loc}}\)-Riemannian metric \((g_{ij})\) on U. Hence, the coefficients \(\sqrt{g}g^{ij}\) of the elliptic system (1.1) are measurable on U. However, it is well-known [43] that they may not be continuous on a dense subset of U for general Alexandrov spaces with CBB. Thus, it would appear that Lin’s conjecture might not be true.

Our main result in this paper is the following affirmative resolution of Lin’s problem, Conjecture 1.2.

Theorem 1.4

Let \(\Omega \) be a bounded domain in an n-dimensional Alexandrov space \((M,|\cdot ,\cdot |)\) with curvature \(\geqslant k\) for some constant \(k\leqslant 0\), and let \((Y,d_Y)\) be an NPC space (not necessarily locally compact). Assume that \(u:\Omega \rightarrow Y\) is a harmonic map. Then, for any ball \(B_q(R)\) with \(B_q(2R)\subset \Omega \) and \(R\leqslant 1\), there exists a constant \(C(n,k,R)\), depending only on n, k and R, such that

$$\begin{aligned} \frac{d_Y\big (u(x),u(y)\big )}{|xy|}\leqslant C(n,k,R)\cdot \bigg ( \Big (\frac{ E^u_2\big (B_q(R)\big )}{{\mathrm{vol}}\big (B_q(R)\big )}\Big )^{1/2}+\mathrm{osc}_{\overline{B_q(R)}}u\bigg ) \end{aligned}$$

for all \(x,y\in B_q(R/16),\) where \(E^u_2(B_q(R))\) is the energy of u on \(B_q(R)\).

Remark 1.5

A curvature condition on the domain space is necessary. Indeed, Chen [7] constructed a harmonic function u on a two-dimensional metric cone M such that u is not Lipschitz continuous if M has no lower curvature bound.

1.3 Organization of the paper

The paper is composed of six sections. In Sect. 2, we will provide some necessary materials on Alexandrov spaces. In Sect. 3, we will recall basic analytic results on Alexandrov spaces, including Sobolev spaces, super-solutions of Poisson equations in the sense of distributions and super-harmonicity in the sense of Perron. In Sect. 4, we will review the concepts of energy and approximating energy, and then we will prove a point-wise convergence result for their densities. In Sect. 5, we will recall some basic results on existence and Hölder regularity of harmonic maps into NPC spaces. We will then give an estimate for point-wise Lipschitz constants of such a harmonic map. Sect. 6 is devoted to the proof of the main Theorem 1.4.

2 Preliminaries

2.1 Basic concepts on Alexandrov spaces with curvature \(\geqslant k\)

Let \(k\in {\mathbb {R}}\) and \(l\in {\mathbb {N}}\). Denote by \({\mathbb {M}}^l_k\) the simply connected, l-dimensional space form of constant sectional curvature k. The space \({\mathbb {M}}^2_k\) is called the k-plane.

Let \((M,|\cdot \cdot \ |)\) be a complete metric space. A rectifiable curve \(\gamma \) connecting two points p, q is called a geodesic if its length is equal to |pq| and it has unit speed. A metric space M is called a geodesic space if, for every pair of points \(p,q\in M\), there exists some geodesic connecting them.

Fix any \(k\in {\mathbb {R}}\). Given three points p, q, r in a geodesic space M, we can take a triangle \(\triangle {\overline{p}}{\overline{q}}{\overline{r}}\) in the k-plane \({\mathbb {M}}^2_k\) such that \(|{\overline{p}}{\overline{q}}|=|pq|\), \(|{\overline{q}}{\overline{r}}|=|qr|\) and \(|{\overline{r}}{\overline{p}}|=|rp|\). If \(k>0\), we add the assumption \(|pq|+|qr|+|rp|<2\pi /\sqrt{k}\). The triangle \(\triangle {\overline{p}}{\overline{q}}{\overline{r}}\subset {\mathbb {M}}^2_k\) is unique up to a rigid motion. We let \(\widetilde{\angle }_k pqr\) denote the angle at the vertex \({\overline{q}}\) of the triangle \(\triangle {\overline{p}}{\overline{q}}{\overline{r}}\), and we call it a k-comparison angle.
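For concreteness, the k-comparison angle can be computed from the three side lengths by the law of cosines in the k-plane. The following is a minimal sketch under that standard formula; the function name `comparison_angle` and its argument order are hypothetical, and the inputs are assumed to be positive lengths satisfying the triangle inequality.

```python
import math


def comparison_angle(pq, qr, rp, k=0.0):
    """Angle at q-bar of the comparison triangle in the k-plane.

    pq, qr, rp are the side lengths |pq|, |qr|, |rp|; the side opposite
    to the vertex q-bar is rp.
    """
    if k > 0 and pq + qr + rp >= 2 * math.pi / math.sqrt(k):
        raise ValueError("perimeter must be < 2*pi/sqrt(k) when k > 0")
    if k == 0:
        c = (pq ** 2 + qr ** 2 - rp ** 2) / (2 * pq * qr)
    elif k > 0:
        s = math.sqrt(k)
        c = (math.cos(s * rp) - math.cos(s * pq) * math.cos(s * qr)) / (
            math.sin(s * pq) * math.sin(s * qr))
    else:
        s = math.sqrt(-k)
        c = (math.cosh(s * pq) * math.cosh(s * qr) - math.cosh(s * rp)) / (
            math.sinh(s * pq) * math.sinh(s * qr))
    return math.acos(max(-1.0, min(1.0, c)))   # clamp against rounding errors
```

For fixed side lengths the comparison angle increases with k; for instance, an equilateral triangle of side 1 has comparison angle below \(\pi /3\) for \(k=-1\) and above \(\pi /3\) for \(k=1\).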

Definition 2.1

Let \(k\in {\mathbb {R}}\). A geodesic space M is called an Alexandrov space with curvature \(\geqslant k\) if it satisfies the following properties:

  (i) it is locally compact;

  (ii) for any point \(x\in M\), there exists a neighborhood U of x such that the following condition is satisfied: for any two geodesics \(\gamma (t)\subset U\) and \(\sigma (s)\subset U\) with \(\gamma (0)=\sigma (0):=p\), the k-comparison angle

    $$\begin{aligned} \widetilde{\angle }_k \gamma (t)p\sigma (s) \end{aligned}$$

    is non-increasing with respect to each of the variables t and s.

It is well known that the Hausdorff dimension of an Alexandrov space with curvature \(\geqslant k\), for some constant \(k\in {\mathbb {R}}\), is always an integer or \(+\infty \) (see, for example, [4] or [5]). In the following, the terminology of “an (n-dimensional) Alexandrov space M” means that M is an Alexandrov space with curvature \(\geqslant k\) for some \(k\in {\mathbb {R}}\) (and that its Hausdorff dimension \(=n\)). We denote by \({\mathrm{vol}}\) the n-dimensional Hausdorff measure on M.

On an n-dimensional Alexandrov space M, the angle between any two geodesics \(\gamma (t)\) and \(\sigma (s)\) with \(\gamma (0)=\sigma (0):=p\) is well defined, as the limit

$$\begin{aligned} \angle \gamma '(0)\sigma '(0):=\lim _{s,t\rightarrow 0}\widetilde{\angle }_k \gamma (t)p\sigma (s). \end{aligned}$$

We denote by \(\Sigma '_p\) the set of equivalence classes of geodesics \(\gamma (t)\) with \(\gamma (0)=p\), where \(\gamma (t)\) is equivalent to \(\sigma (s)\) if \(\angle \gamma '(0)\sigma '(0)=0\). \((\Sigma _p',\angle )\) is a metric space, and its completion is called the space of directions at p, denoted by \(\Sigma _p\). It is known (see, for example, [4] or [5]) that \((\Sigma _p,\angle )\) is an Alexandrov space with curvature \(\geqslant 1\) of dimension \(n-1\). It is also known (see, for example, [4] or [5]) that the tangent cone at p, \(T_p\), is the Euclidean cone over \(\Sigma _p\). Furthermore, \(T_p^k\) is the k-cone over \(\Sigma _p\) (see page 355 in [4]). For two tangent vectors \(u,v\in T_p\), their “scalar product” is defined by (see Section 1 in [48])

$$\begin{aligned} \left<{u},{v}\right>:=\frac{1}{2} \left( |u|^2 + |v|^2- |uv|^2\right) . \end{aligned}$$
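Since \(T_p\) is the Euclidean cone over \(\Sigma _p\), this definition reduces to \(|u|\,|v|\cos \angle (u,v)\). The sketch below checks this numerically, assuming the usual Euclidean cone distance \(|uv|^2=|u|^2+|v|^2-2|u||v|\cos \angle \) for directions at angle at most \(\pi \); the function names are hypothetical.

```python
import math


def cone_distance(a, b, angle):
    """Distance in the Euclidean cone between vectors of norms a and b whose
    directions are at distance `angle` (at most pi) in the space of directions."""
    sq = a * a + b * b - 2 * a * b * math.cos(angle)
    return math.sqrt(max(0.0, sq))              # guard against tiny negative rounding


def scalar_product(a, b, angle):
    """<u, v> = (|u|^2 + |v|^2 - |uv|^2) / 2, evaluated via the cone distance."""
    d = cone_distance(a, b, angle)
    return 0.5 * (a * a + b * b - d * d)
```

For example, vectors of norms 2 and 3 at angle \(\pi /3\) have scalar product \(2\cdot 3\cdot \cos (\pi /3)=3\).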

Let \(p\in M\). Given a direction \(\xi \in \Sigma _p\), we remark that there may not exist a geodesic \(\gamma (t)\) starting at p with \(\gamma '(0)=\xi \).

We refer to the seminal paper [5] or the textbook [4] for the details.

Definition 2.2

(Boundary, [5]) The boundary of an Alexandrov space M is defined inductively with respect to dimension. If the dimension of M is one, then M is a complete Riemannian manifold and the boundary of M is defined as usual. Suppose that the dimension of M is \(n\geqslant 2\). A point p is a boundary point of M if \(\Sigma _p\) has non-empty boundary.

$$\begin{aligned} \textit{From now on, we always consider Alexandrov spaces without boundary.} \end{aligned}$$

2.2 The exponential map and second variation of arc-length

Let M be an n-dimensional Alexandrov space and \(p\in M\). For each point \(x\not =p\), the symbol \(\uparrow _p^x\) denotes the direction at p corresponding to some geodesic px. Following [43], denote

$$\begin{aligned} W_p:=\big \{x\in M\backslash \{p\} \big |\ \mathrm{geodesic}\ px\ \mathrm{can\ be\ extended\ beyond}\ x\big \}. \end{aligned}$$

According to [43], the set \(W_p\) has full measure in M. For each \(x\in W_p\), the direction \(\uparrow _p^x\) is uniquely determined, since any geodesic in M does not branch [5]. Recall that the map \(\log _p: W_p\rightarrow T_p\) is defined by \(\log _p(x):=|px|\cdot \uparrow _p^x\) (see [48]). It is one-to-one from \(W_p\) to its image

$$\begin{aligned} {\mathscr {W}}_p:=\log _p(W_p)\subset T_p. \end{aligned}$$

The inverse map of \(\log _p\),

$$\begin{aligned} \exp _p=(\log _p)^{-1}: {\mathscr {W}}_p\rightarrow W_p, \end{aligned}$$

is called the exponential map at p.

One of the technical difficulties in Alexandrov geometry comes from the fact that \({\mathscr {W}}_p\) may not contain any neighbourhood of the vertex of the cone \(T_p\).

If M has curvature \(\geqslant k\) on \(B_p(R)\), then the exponential map

$$\begin{aligned} \exp _p: B_o(R)\cap {\mathscr {W}}_p\subset T^k_p\rightarrow M \end{aligned}$$

is a non-expanding map [5], where \(T^k_p\) is the k-cone over \(\Sigma _p\) and o is the vertex of \(T_p\).

In [46], A. Petrunin established the notion of parallel transportation and second variation of arc-length on Alexandrov spaces.

Proposition 2.3

(Petrunin, Theorem 1.1. B in [46]) Let \(k\in {\mathbb {R}}\) and let M be an n-dimensional Alexandrov space with curvature \(\geqslant k\). Suppose that p and q are points such that the geodesic pq can be extended beyond both p and q.

Then, for any fixed sequence \(\{\epsilon _j\}_{j\in {\mathbb {N}}}\) going to 0, there exists an isometry \(T:T_p\rightarrow T_q\) and a subsequence \(\{\varepsilon _j\}_{j\in {\mathbb {N}}}\subset \{\epsilon _j\}_{j\in {\mathbb {N}}}\) such that

$$\begin{aligned} \big |\exp _{p}(\varepsilon _j\cdot \eta )\ \exp _{q}(\varepsilon _j\cdot T\eta )\big |\leqslant |pq|-\frac{k\cdot |pq|}{2}|\eta |^2\cdot \varepsilon ^2_j+o\left( \varepsilon _j^2\right) \quad \end{aligned}$$
(2.1)

for any \(\eta \in T_p\) such that the left-hand side is well-defined.

Here and in the following, we write \(g(s)=o(s^\ell )\) if the function g(s) satisfies \(\lim _{s\rightarrow 0^+}\frac{g(s)}{s^\ell }=0.\)
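As a sanity check of (2.1) in a smooth model case, consider the unit sphere (curvature \(\geqslant 1\), so \(k=1\)) with p and q on the equator at distance d, and let \(\eta \) be the unit vector at p pointing north, orthogonal to the geodesic pq. The sketch below assumes that T is parallel transport along the equator, so that \(T\eta \) also points north; this choice of T, the restriction to an orthogonal \(\eta \), and the function names are assumptions made for this illustration only.

```python
import math


def sphere_distance(lat1, lon1, lat2, lon2):
    """Great-circle distance on the unit sphere (spherical law of cosines)."""
    c = (math.sin(lat1) * math.sin(lat2)
         + math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    return math.acos(max(-1.0, min(1.0, c)))


def second_variation_gap(d, eps):
    """Right-hand side minus left-hand side of (2.1), dropping the o(eps^2) term,
    for k = 1, |pq| = d on the equator and |eta| = 1 pointing north."""
    lhs = sphere_distance(eps, 0.0, eps, d)     # |exp_p(eps*eta) exp_q(eps*T(eta))|
    rhs = d - 0.5 * 1.0 * d * eps ** 2          # |pq| - (k*|pq|/2) * |eta|^2 * eps^2
    return rhs - lhs


gap = second_variation_gap(1.0, 0.01)
```

In this example the gap is positive and of size about \(\varepsilon ^2\big (\tan (d/2)-d/2\big )\), which is consistent with the inequality (2.1).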

2.3 Singularity, regular points, smooth points and \(C^\infty \)-Riemannian approximations

Let \(k\in {\mathbb {R}}\) and let M be an n-dimensional Alexandrov space with curvature \(\geqslant k\). For any \(\delta >0\), we denote

$$\begin{aligned} M^\delta :=\big \{x\in M:\ {\mathrm{vol}}(\Sigma _x)>(1-\delta )\cdot {\mathrm{vol}}({\mathbb {S}}^{n-1})\big \}, \end{aligned}$$

where \({\mathbb {S}}^{n-1}\) is the standard \((n-1)\)-sphere. This is an open set (see [5]). The set \(S_\delta := M\backslash M^\delta \) is called the \(\delta \)-singular set. Each point \(p\in S_\delta \) is called a \(\delta \)-singular point. The set

$$\begin{aligned} S_M:=\cup _{\delta >0}S_\delta \end{aligned}$$

is called the singular set. A point \(p\in M\) is called a singular point if \(p\in S_M\). Otherwise it is called a regular point. Equivalently, a point p is regular if and only if \(T_p\) is isometric to \({\mathbb {R}}^n\) [5]. At a regular point p, we have that \(T^k_p\) is isometric to \({\mathbb {M}}_k^n.\) Since we always assume that the boundary of M is empty, it is proved in [5] that the Hausdorff dimension of \(S_M\) is \(\leqslant n-2.\) We remark that the singular set \(S_M\) might be dense in M [43].

Some basic structural properties of Alexandrov spaces are collected in the following.

Fact 2.4

Let \(k\in {\mathbb {R}}\) and let M be an n-dimensional Alexandrov space with curvature \(\geqslant k\).

  1. There exists a constant \(\delta _{n,k}>0\) depending only on the dimension n and k such that for each \(\delta \in (0,\delta _{n,k})\), the set \( M^{\delta }\) forms a Lipschitz manifold [5] and has a \(C^\infty \)-differentiable structure [36].

  2. There exists a \(BV_{\mathrm{loc}}\)-Riemannian metric g on \( M^\delta \) such that

    • the metric g is continuous in \(M\backslash S_M\) [43, 45];

    • the distance function on \(M\backslash S_M\) induced from g coincides with the original one of M [43];

    • the Riemannian measure on \(M\backslash S_M\) induced from g coincides with the Hausdorff measure of M [43].

A point p is called a smooth point if it is regular and there exists a coordinate system \((U,\phi )\) around p such that

$$\begin{aligned} |g_{ij}(\phi (x))-\delta _{ij}|=o(|px|), \end{aligned}$$
(2.2)

where \((g_{ij})\) is the corresponding Riemannian metric in the above Fact 2.4 (2) near p and \((\delta _{ij})\) is the identity \(n\times n\) matrix.

It is shown in [45] that the set of smooth points has full measure. The following asymptotic behavior of \( W_p\) around a smooth point p is proved in [58].

Lemma 2.5

(Lemma 2.1 in [58]) Let \(p\in M\) be a smooth point. We have

$$\begin{aligned} \Big |\frac{d{\mathrm{vol}}(x)}{dH^n(v)}-1\Big |=o(r),\qquad \forall \ x\in W_p\cap B_p(r),\quad v=\log _p(x) \end{aligned}$$

and

$$\begin{aligned} \frac{{H^n} \big (B_o(r)\cap {\mathscr {W}}_p\big )}{{H^n}\big (B_o(r)\big )}\geqslant 1-o(r). \end{aligned}$$
(2.3)

where \(B_o(r)\subset T_p\) and \(H^n\) is n-dimensional Hausdorff measure on \(T_p\ (\overset{\mathrm{isom}}{\approx }{\mathbb {R}}^n)\).

The following property on smooth approximation is contained in the proof of Theorem 6.1 in [36]. For convenience, we state it as a lemma.

Lemma 2.6

(Kuwae–Machigashira–Shioya [36], \(C^\infty \)-approximation). Let \(k\in {\mathbb {R}}\) and let M be an n-dimensional Alexandrov space with curvature \(\geqslant k\). The constant \(\delta _{n,k}\) is given in the above Fact 2.4 (1).

Let \(0<\delta <\delta _{n,k}\). For any compact set \(C\subset M^\delta \), there exists a neighborhood U of C with \(U\subset M^\delta \) and a \(C^\infty \)-Riemannian metric \(g_\delta \) on U such that the distance \(d_\delta \) on U induced from \(g_\delta \) satisfies

$$\begin{aligned} \bigg |\frac{d_\delta (x,y)}{|xy|}-1 \bigg |<\kappa (\delta ) \qquad \mathrm{for\ any}\ x,y\in U, x\not =y, \end{aligned}$$
(2.4)

where \(\kappa (\delta )\) is a positive function (depending only on \(\delta \)) with \(\lim _{\delta \rightarrow 0}\kappa (\delta )=0.\)

Proof

In the first paragraph of the proof of Theorem 6.1 in [36] (see page 294), the authors constructed a \(\kappa (\delta )\)-almost isometric homeomorphism F from a neighborhood U of C to some \(C^\infty \)-Riemannian manifold N with distance function \(d_N\). That is, the map \(F:U\rightarrow N\) is a bi-Lipschitz homeomorphism satisfying

$$\begin{aligned} \bigg |\frac{d_N(F(x),F(y))}{|xy|}-1\bigg |<\kappa (\delta ) \qquad \mathrm{for\ any}\ x,y\in U, x\not =y. \end{aligned}$$

Now let us consider the distance function \(d_\delta \) on U defined by

$$\begin{aligned} d_\delta (x,y):=d_N\big (F(x),F(y)\big ). \end{aligned}$$

The map \(F: (U,d_\delta )\rightarrow (N,d_N)\) is an isometry, and hence the desired \(C^\infty \)-Riemannian metric \(g_\delta \) can be defined by the pull-back of the Riemannian metric \(g_N\). \(\square \)

2.4 Semi-concave functions and Perelman’s concave functions

Let M be an Alexandrov space without boundary and \(\Omega \subset M \) be an open set. A locally Lipschitz function \(f: \Omega \rightarrow {\mathbb {R}}\) is called \(\lambda \)-concave [48] if for all geodesics \(\gamma (t)\) in \( \Omega \), the function

$$\begin{aligned} f\circ \gamma (t)-\lambda \cdot t^2/2 \end{aligned}$$

is concave. A function \(f: \Omega \rightarrow {\mathbb {R}}\) is called semi-concave if for any \(x\in \Omega \), there exists a neighborhood \(U_x\ni x\) and a number \(\lambda _x\in {\mathbb {R}}\) such that \(f|_{U_x}\) is \(\lambda _x\)-concave (see Section 1 in [48] for the basic properties of semi-concave functions).
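To make the definition concrete in the simplest case \(M={\mathbb {R}}\), where unit-speed geodesics are \(\gamma (t)=t\), the following sketch tests \(\lambda \)-concavity numerically through second differences on a uniform grid. The function name `is_lambda_concave`, the grid size and the tolerance are hypothetical choices; a finite grid check of this kind is only a heuristic and not a proof of concavity.

```python
def is_lambda_concave(f, lam, a, b, m=200, tol=1e-9):
    """Numerically test whether t -> f(t) - lam * t**2 / 2 is concave on [a, b]
    by checking that all second differences on a uniform grid are <= tol."""
    h = (b - a) / m
    g = [f(a + i * h) - lam * (a + i * h) ** 2 / 2 for i in range(m + 1)]
    return all(g[i - 1] - 2 * g[i] + g[i + 1] <= tol for i in range(1, m))


def neg_square(t):
    """f(t) = -t^2, which is lambda-concave exactly when lambda >= -2."""
    return -t * t
```

For instance, \(f(t)=-t^2\) passes the test with \(\lambda =-2\) and with \(\lambda =0\), but fails with \(\lambda =-3\), since \(-t^2+\tfrac{3}{2}t^2\) is strictly convex.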

Proposition 2.7

(Perelman’s concave function, [29, 44]) Let \(p\in M\). There exists a constant \(r_1>0\) and a function \(h: B_p(r_1)\rightarrow {\mathbb {R}}\) satisfying:

  (i) h is \((-1)\)-concave;

  (ii) h is 2-Lipschitz, that is, h is Lipschitz continuous with Lipschitz constant 2.

We refer the reader to [58] for further properties of Perelman’s concave functions.

3 Analysis on Alexandrov spaces

In this section, we will summarize some basic analytic results on Alexandrov spaces, including Sobolev spaces, Laplacian and harmonicity via Perron’s method.

3.1 Sobolev spaces on Alexandrov spaces

Several different notions of Sobolev spaces on metric spaces have been established, see [6, 19, 33, 36, 37, 53]. They coincide with each other on Alexandrov spaces.

Let M be an n-dimensional Alexandrov space with curvature \(\geqslant k\) for some \(k\in {\mathbb {R}}\). It is well-known (see [36] or the survey [57]) that the metric measure space \((M,| \cdot \cdot \ |,{\mathrm{vol}})\) is locally doubling and supports a local (weak) \(L^2\)-Poincaré inequality. Moreover, given a bounded domain \(\Omega \subset M\), both the doubling constant \(C_d\) and the Poincaré constant \(C_P\) on \(\Omega \) depend only on n, k and \(\mathrm{diam}(\Omega )\).

Let \(\Omega \) be an open domain in M. Given \(f\in C(\Omega )\) and a point \(x\in \Omega \), the pointwise Lipschitz constant [6] of f at x is defined by:

$$\begin{aligned} \mathrm{Lip}f(x):=\limsup _{y\rightarrow x}\frac{|f(x)-f(y)|}{|xy|}. \end{aligned}$$
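On the real line the pointwise Lipschitz constant can be approximated by difference quotients at a small fixed distance. The sketch below is only a crude proxy for the limsup in the definition (it samples the two points at distance r from x), and the name `lip_estimate` and the default radius are hypothetical.

```python
def lip_estimate(f, x, r=1e-6):
    """Crude estimate of Lip f(x) for f: R -> R: the larger of the two
    difference quotients at distance r from x (a proxy for the limsup)."""
    return max(abs(f(x) - f(x + s)) / r for s in (r, -r))
```

For example, `lip_estimate(abs, 0.0)` is close to 1 even though \(|x|\) is not differentiable at 0, and for \(f(t)=t^2\) the estimate at \(x=1\) is close to \(|f'(1)|=2\).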

We denote by \(Lip_{\mathrm{loc}}(\Omega )\) the set of locally Lipschitz continuous functions on \(\Omega \), and by \(Lip_0(\Omega )\) the set of Lipschitz continuous functions on \(\Omega \) with compact support in \(\Omega .\) For any \(1\leqslant p\leqslant +\infty \) and \(f\in Lip_{\mathrm{loc}}(\Omega )\), its \(W^{1,p}(\Omega )\)-norm is defined by

$$\begin{aligned} \Vert f\Vert _{W^{1,p}(\Omega )}:=\Vert f\Vert _{L^{p}(\Omega )}+\Vert \mathrm{Lip}f\Vert _{L^{p}(\Omega )}. \end{aligned}$$

The Sobolev space \(W^{1,p}(\Omega )\) is defined by the closure of the set

$$\begin{aligned} \{f\in Lip_{\mathrm{loc}}(\Omega )|\ \Vert f\Vert _{W^{1,p}(\Omega )}<+\infty \}, \end{aligned}$$

under the \(W^{1,p}(\Omega )\)-norm. The space \(W_0^{1,p}(\Omega )\) is defined by the closure of \(Lip_0(\Omega )\) under the \(W^{1,p}(\Omega )\)-norm (this coincides with the definition in [6], see Theorem 4.24 in [6]). We say that a function f belongs to \(W^{1,p}_{\mathrm{loc}}(\Omega )\) if \(f\in W^{1,p}(\Omega ')\) for every open subset \(\Omega '\subset \subset \Omega .\) Here and in the following, “\(\Omega '\subset \subset \Omega \)” means \(\Omega '\) is compactly contained in \(\Omega \). In Theorem 4.48 of [6], Cheeger proved that \(W^{1,p}(\Omega )\) is reflexive for any \(1<p<\infty .\)

3.2 Laplacian and super-solutions

Let us recall a concept of Laplacian [47, 58] on Alexandrov spaces, as a functional acting on the space of Lipschitz functions with compact support.

Let M be an n-dimensional Alexandrov space and \(\Omega \) be a bounded domain in M. Given a function \(f\in W_{\mathrm{loc}}^{1,2}(\Omega )\), we define a functional \({\mathscr {L}}_f\) on \(Lip_0(\Omega )\), called the Laplacian functional of f, by

$$\begin{aligned} {\mathscr {L}}_f(\phi ):=-\int _\Omega \left<{\nabla f},{\nabla \phi }\right>d\mathrm{vol},\qquad \forall \phi \in Lip_0(\Omega ). \end{aligned}$$

When a function f is \(\lambda \)-concave, Petrunin in [47] proved that \({\mathscr {L}}_f\) is a signed Radon measure. Furthermore, if we write its Lebesgue decomposition as

$$\begin{aligned} {\mathscr {L}}_f=\Delta f\cdot \mathrm{vol}+\Delta ^s f, \end{aligned}$$
(3.1)

then

$$\begin{aligned} \Delta ^sf\leqslant 0\quad \hbox { and }\quad \Delta f\cdot {\mathrm{vol}}\leqslant n\cdot \lambda \cdot {\mathrm{vol}}. \end{aligned}$$

Let \(h\in L^1_{\mathrm{loc}}(\Omega ) \) and \(f\in W_{\mathrm{loc}}^{1,2}(\Omega )\). The function f is said to be a super-solution (sub-solution, resp.) of the Poisson equation

$$\begin{aligned} {\mathscr {L}}_f=h\cdot \mathrm{vol}, \end{aligned}$$

if the functional \({\mathscr {L}}_f\) satisfies

$$\begin{aligned} {\mathscr {L}}_f(\phi )\leqslant \int _\Omega h\phi d\mathrm{vol}\qquad \Big (\mathrm{or}\quad {\mathscr {L}}_f(\phi )\geqslant \int _\Omega h\phi d\mathrm{vol}\Big ) \end{aligned}$$

for all nonnegative \(\phi \in Lip_0(\Omega )\). In this case, according to Theorem 2.1.7 of [21], the functional \({\mathscr {L}}_f\) is a signed Radon measure.

Equivalently, \(f\in W_{\mathrm{loc}}^{1,2}(\Omega )\) is a sub-solution of \({\mathscr {L}}_f=h\cdot \mathrm{vol}\) if and only if it is a local minimizer of the energy

$$\begin{aligned} {\mathcal {E}}(v)=\int _{\Omega '}\big (|\nabla v|^2+2hv\big )d\mathrm{vol} \end{aligned}$$

in the set of functions v such that \(f\geqslant v\) and \(f-v\) is in \(W^{1,2}_0(\Omega ')\) for every fixed \(\Omega '\subset \subset \Omega .\) It is known (see for example [35]) that every continuous super-solution of \({\mathscr {L}}_f=0\) on \(\Omega \) satisfies the maximum principle, which states that

$$\begin{aligned} \min _{x\in \Omega '}f\geqslant \min _{x\in \partial \Omega '}f \end{aligned}$$

for any open set \(\Omega '\subset \subset \Omega .\)

A function f is a (weak) solution (in the sense of distribution) of Poisson equation \({\mathscr {L}}_f=h\cdot \mathrm{vol}\) on \(\Omega \) if it is both a sub-solution and a super-solution of the equation. In particular, a (weak) solution of \({\mathscr {L}}_f=0\) is called a harmonic function.

We remark that f is a (weak) solution of the Poisson equation \({\mathscr {L}}_f=h\cdot \mathrm{vol}\) if and only if \({\mathscr {L}}_f\) is a signed Radon measure and its Lebesgue decomposition \({\mathscr {L}}_f=\Delta f\cdot {\mathrm{vol}}+\Delta ^sf\) satisfies

$$\begin{aligned} \Delta f=h\qquad \mathrm{and}\qquad \Delta ^sf=0. \end{aligned}$$

Given a function \(h\in L^2(\Omega )\) and \(g\in W^{1,2}(\Omega )\), we can solve the Dirichlet problem of the equation

$$\begin{aligned} {\left\{ \begin{array}{ll} {\mathscr {L}}_f&{}=h\cdot \mathrm{vol}\\ f&{}=g|_{\partial \Omega }. \end{array}\right. } \end{aligned}$$

Indeed, by the Sobolev embedding theorem (see [18, 36]) and a standard argument (see, for example, [15]), it is known that the solution of the Dirichlet problem exists uniquely in \(W^{1,2}(\Omega )\) (see, for example, Theorem 7.12 and Theorem 7.14 in [6]). Furthermore, if we add the assumption \(h\in L^s\) with \(s>n/2\), then the solution f is locally Hölder continuous in \(\Omega \) (see [31, 36]).

Lemma 3.1

Let \(\Omega \) be a bounded domain of an Alexandrov space, and assume that \(g\in L^\infty (\Omega )\). If \(f\in W^{1,2}(\Omega )\) is a weak solution of the Poisson equation

$$\begin{aligned} {\mathscr {L}}_f=g\cdot {\mathrm{vol}}, \end{aligned}$$

then f is locally Lipschitz continuous in \(\Omega \).

Proof

In [24, Theorem 3.1], it has been shown that Yau’s gradient estimate for harmonic functions implies the local Lipschitz continuity of solutions of \({\mathscr {L}}_f=g\cdot {\mathrm{vol}}.\) On the other hand, Yau’s gradient estimate for harmonic functions has been established in [58] (see also [23]). \(\square \)

The following mean value inequality is a slight extension of Corollary 4.5 in [58].

Proposition 3.2

Let M be an n-dimensional Alexandrov space and \(\Omega \) be a bounded domain in M. Assume that the function \(h\in L^1_{\mathrm{loc}}(\Omega )\) satisfies \(h(x)\leqslant C\) for some constant C. Suppose that \(f\in W^{1,2}_{\mathrm{loc}}(\Omega )\cap C(\Omega )\) is nonnegative and satisfies

$$\begin{aligned} {\mathscr {L}}_f\leqslant h\cdot {\mathrm{vol}}. \end{aligned}$$

If \(p\in \Omega \) is a Lebesgue point of h, then

$$\begin{aligned} \frac{1}{H^{n-1}\left( \partial B_o(R)\subset T^k_p\right) }\int _{\partial B_p(R)}f(x)d{\mathrm{vol}}\leqslant f(p)+\frac{h(p)}{2n}\cdot R^{2}+o(R^{2}). \end{aligned}$$

Proof

The same assertion has been proved under the added assumption that \(h\in L^\infty \) in Corollary 4.5 in [58]. Here, we will use an approximation argument.

For each \(j\in {\mathbb {N}}\), by setting \(h_j:=\max \{-j,h\}\), we conclude that \(h_j\in L^\infty (\Omega )\), \(h_j\) converges monotonically to h, and

$$\begin{aligned} {\mathscr {L}}_f\leqslant h\cdot {\mathrm{vol}}\leqslant h_j\cdot {\mathrm{vol}},\qquad \forall \ j\in {\mathbb {N}}. \end{aligned}$$

For any \(p\in \Omega \), by using Proposition 4.4 in [58], we have, for all \(R>0\) with \(B_p(R)\subset \subset \Omega \) and for each \(j\in {\mathbb {N}}\),

$$\begin{aligned} \frac{1}{H^{n-1}\left( \partial B_o(R)\subset T^k_p\right) }\int _{\partial B_p(R)}fd{\mathrm{vol}}-f(p)\leqslant (n-2)\cdot \frac{\omega _{n-1}}{{\mathrm{vol}}(\Sigma _p)}\cdot \varrho _j(R), \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} \varrho _j(R)&=\int _{B^*_p(R)}Gh_jd\mathrm{vol}-\phi _k(R)\int _{B_p(R)}h_jd\mathrm{vol}, \end{aligned} \end{aligned}$$

where \(B^*_p(R)=B_p(R)\backslash \{p\}\), \(G(x):=\phi _k(|px|)\), and \(\phi _k(r)\) is the real-valued function such that \(\phi _k\circ \mathrm{dist}_o\) is the Green function on \({\mathbb {M}}^n_k\) with singular point o. That is, if \(n\geqslant 3\),

$$\begin{aligned} \phi _k(r)=\frac{1}{(n-2)\cdot \omega _{n-1}}\int _r^{\infty }s^{1-n}_k(t)dt, \end{aligned}$$

and

$$\begin{aligned} s_k(t)= {\left\{ \begin{array}{ll} \sin \left( \sqrt{k}t\right) /\sqrt{k}&{} \quad k>0\\ t&{}\quad k=0\\ \sinh \left( \sqrt{- k}t\right) /\sqrt{- k}&{} \quad k<0. \end{array}\right. } \end{aligned}$$

Here, \(\omega _{n-1}\) is the volume of the \((n-1)\)-sphere \({\mathbb {S}}^{n-1}\) with the standard metric. If \(n=2\), the function \(\phi _k\) can be given similarly.

Letting \(j\rightarrow \infty \) and applying the monotone convergence theorem, we get

$$\begin{aligned} \frac{1}{H^{n-1}\left( \partial B_o(R)\subset T^k_p\right) }\int _{\partial B_p(R)}fd{\mathrm{vol}}-f(p)\leqslant (n-2)\cdot \frac{\omega _{n-1}}{{\mathrm{vol}}(\Sigma _p)}\cdot \varrho (R),\nonumber \\ \end{aligned}$$
(3.2)

where

$$\begin{aligned} \begin{aligned} \varrho (R)&=\int _{B^*_p(R)}Ghd\mathrm{vol}-\phi _k(R)\int _{B_p(R)}hd\mathrm{vol}. \end{aligned} \end{aligned}$$

When p is a Lebesgue point of h, it is calculated in [58] (see lines 6 to 14 on page 470 of [58]) that

$$\begin{aligned} \varrho (R)=\frac{{\mathrm{vol}}(\Sigma _p)}{2n(n-2)\omega _{n-1}}h(p)\cdot R^2+o(R^2). \end{aligned}$$

Therefore, the desired result follows from this and Eq. (3.2). \(\square \)
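As a quick consistency check of the constants (a sketch for the reader's convenience, not taken from [58]), substituting the expansion of \(\varrho (R)\) into Eq. (3.2) gives

$$\begin{aligned} (n-2)\cdot \frac{\omega _{n-1}}{{\mathrm{vol}}(\Sigma _p)}\cdot \varrho (R)=(n-2)\cdot \frac{\omega _{n-1}}{{\mathrm{vol}}(\Sigma _p)}\cdot \frac{{\mathrm{vol}}(\Sigma _p)}{2n(n-2)\omega _{n-1}}h(p)\cdot R^2+o(R^2)=\frac{h(p)}{2n}\cdot R^2+o(R^2). \end{aligned}$$

The expansion itself can be verified by hand in a model case, which we assume here purely for illustration: \(M={\mathbb {R}}^n\) with \(n\geqslant 3\), \(k=0\), \({\mathrm{vol}}(\Sigma _p)=\omega _{n-1}\) and \(h\equiv c\) constant. With the normalization above, \(\phi _0(r)=\frac{r^{2-n}}{(n-2)^2\omega _{n-1}}\), and polar coordinates give

$$\begin{aligned} \varrho (R)=c\,\omega _{n-1}\left( \int _0^R\phi _0(r)r^{n-1}dr-\phi _0(R)\cdot \frac{R^n}{n}\right) =\frac{c}{(n-2)^2}\left( \frac{R^2}{2}-\frac{R^2}{n}\right) =\frac{c}{2n(n-2)}\cdot R^2, \end{aligned}$$

which agrees with the stated formula, in this case without any remainder term.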

3.3 Harmonicity via Perron’s method

Perron’s method has been studied in [1, 30] in the setting of metric measure spaces. We follow Kinnunen–Martio (Section 7 of [30]; see Footnote 3) to define super-harmonicity.

Definition 3.3

Let \(\Omega \) be an open subset of an Alexandrov space. A function \(f:\Omega \rightarrow (-\infty ,\infty ]\) is called super-harmonic on \(\Omega \) if it satisfies the following properties:

  1. (i)

    f is lower semi-continuous in \(\Omega \);

  2. (ii)

    f is not identically \(\infty \) in any component of \(\Omega \);

  3. (iii)

    for every domain \(\Omega '\subset \subset \Omega \) the following comparison principle holds: if \(v\in C(\overline{\Omega '})\cap W^{1,2}(\Omega ')\) and \(v\leqslant f\) on \(\partial \Omega '\), then \(h(v)\leqslant f\) in \(\Omega '\). Here h(v) is the (unique) solution of the equation \({\mathscr {L}}_{h(v)}=0\) in \(\Omega '\) with \(v-h(v)\in W^{1,2}_0(\Omega ')\).

A function f is sub-harmonic on \(\Omega \), if \(-f\) is super-harmonic on \(\Omega \).

For our purpose in this paper, we will focus on the case where \(\Omega \) is a bounded domain and the function \(f\in C(\Omega )\cap W^{1,2}_{\mathrm{loc}}(\Omega )\). In this case, the definition of super-harmonicity can be simplified as follows.

Definition 3.3 \('\): Let \(\Omega \) be a bounded domain of an Alexandrov space. A function \(f\in C(\Omega )\cap W^{1,2}_{\mathrm{loc}}(\Omega )\) is called super-harmonic on \(\Omega \) if the following comparison principle holds:

(iii\('\)) for every domain \(\Omega '\subset \subset \Omega \), we have \(h(f)\leqslant f\) in \(\Omega '\).

Indeed, if \(f\in C(\Omega )\cap W^{1,2}_{\mathrm{loc}}(\Omega )\), then \(f\in C(\overline{\Omega '})\cap W^{1,2}(\Omega ')\) for any domain \(\Omega '\subset \subset \Omega \). Hence, taking \(v=f\), the condition (iii) implies (iii\('\)). The converse follows from the maximum principle. Indeed, given any domain \(\Omega '\subset \subset \Omega \) and any \(v\in C(\overline{\Omega '})\cap W^{1,2}(\Omega ')\) with \(v\leqslant f\) on \(\partial \Omega '\), the maximum principle implies that \(h(v)\leqslant h(f)\) in \(\Omega '\). Combined with \(h(f)\leqslant f\) from (iii\('\)), this gives \(h(v)\leqslant f\) in \(\Omega '\). Consequently, the condition (iii\('\)) implies (iii).

Lemma 3.4

(Kinnunen–Martio [30]) Let \(\Omega \) be a bounded domain of an Alexandrov space. Assume that \(f\in W_{\mathrm{loc}}^{1,2}(\Omega )\cap C(\Omega )\). Then the following properties are equivalent to each other:

  1. (i)

    \(\ f\) is a super-solution of \({\mathscr {L}}_f=0\) on \(\Omega \);

  2. (ii)

    \(\ f\) is a super-harmonic function in the sense of Definition \(3.3'\).

Proof

Let \(f\in W_{\mathrm{loc}}^{1,2}(\Omega )\). The function f is a super-solution of \({\mathscr {L}}_f=0\) on \(\Omega \) if and only if it is a superminimizer in \(\Omega \), defined by Kinnunen–Martio on page 865 of [30].

Now the equivalence between (i) and (ii) follows from Corollaries 7.6 and 7.9 in [30]. \(\square \)

Lemma 3.4 extends easily to Poisson equations.

Corollary 3.5

Let \(\Omega \) be a bounded domain of an Alexandrov space. Assume that \(f\in W_{\mathrm{loc}}^{1,2}(\Omega )\cap C(\Omega )\) and \(g\in L^\infty (\Omega )\). Then the following properties are equivalent to each other:

  1. (i)

    \(\ f\) is a super-solution of \({\mathscr {L}}_f=g\cdot {\mathrm{vol}}\) on \(\Omega \);

  2. (ii)

    \(\ f\) satisfies the following comparison principle: for each domain \(\Omega '\subset \subset \Omega \), we have \(v\leqslant f\) in \(\Omega '\), where \(v\in W^{1,2}(\Omega ')\) is the (unique) solution of

$$\begin{aligned} {\mathscr {L}}_v=g\cdot {\mathrm{vol}}\quad \mathrm{with} \quad v-f\in W^{1,2}_0(\Omega '). \end{aligned}$$

Proof

Let w be a weak solution of \({\mathscr {L}}_w=g\cdot {\mathrm{vol}}\) on \(\Omega \) (in the sense of distributions). Then, by Lemma 3.1, we have \(w\in C(\Omega )\cap W^{1,2}_{\mathrm{loc}}(\Omega )\). We denote

$$\begin{aligned} {\tilde{f}}:=f-w\in C(\Omega )\cap W^{1,2}_{\mathrm{loc}}(\Omega ). \end{aligned}$$

Obviously, property (i) is equivalent to \({\tilde{f}}\) being a super-solution of \({\mathscr {L}}_{{\tilde{f}}}=0\) on \(\Omega \). On the other hand, taking any domain \(\Omega '\subset \subset \Omega \) and letting \(v\in W^{1,2}(\Omega ')\) be the (unique) solution of \({\mathscr {L}}_v=g\cdot {\mathrm{vol}}\) with \( v-f\in W^{1,2}_0(\Omega '),\) we have

$$\begin{aligned} {\mathscr {L}}_{v-w}=0\quad \mathrm{with}\quad (v-w)-{\tilde{f}}\in W^{1,2}_0(\Omega '). \end{aligned}$$

That is, \(h({\tilde{f}})=v-w.\) Hence, property (ii) is equivalent to \({\tilde{f}}\) being a super-harmonic function in the sense of Definition \(3.3'\). Now the corollary is a consequence of Lemma 3.4. \(\square \)
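The reduction used in this proof can be displayed in one line. The following is only a restatement of the argument, under the assumptions that "super-solution of \({\mathscr {L}}_f=g\cdot {\mathrm{vol}}\)" is read as \({\mathscr {L}}_f\leqslant g\cdot {\mathrm{vol}}\) in the sense of distributions (as in Proposition 3.2) and that \(f\mapsto {\mathscr {L}}_f\) is linear:

$$\begin{aligned} {\mathscr {L}}_f\leqslant g\cdot {\mathrm{vol}}\ \Longleftrightarrow \ {\mathscr {L}}_{{\tilde{f}}}={\mathscr {L}}_f-{\mathscr {L}}_w\leqslant 0,\qquad v\leqslant f\ \ \mathrm{in}\ \Omega '\ \Longleftrightarrow \ h({\tilde{f}})=v-w\leqslant f-w={\tilde{f}}\ \ \mathrm{in}\ \Omega '. \end{aligned}$$

The left equivalence turns (i) into condition (i) of Lemma 3.4 for \({\tilde{f}}\), and the right one turns (ii) into condition (iii\('\)) for \({\tilde{f}}\).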

4 Energy functional

From now on, in this section, we always denote by \(\Omega \) a bounded open domain of an n-dimensional Alexandrov space \((M,|\cdot ,\cdot |)\) with curvature \(\geqslant k\) for some \(k\leqslant 0\), and denote by \((Y,d_Y)\) a complete metric space.

Fix any \(p\in [1,\infty )\). A Borel measurable map \(u:\ \Omega \rightarrow Y\) is said to be in the space \(L^p(\Omega ,Y)\) if it has separable range and, for some (hence, for all) \(P\in Y\),

$$\begin{aligned} \int _\Omega d^p_Y\big (u(x),P\big )d{\mathrm{vol}}(x)<\infty . \end{aligned}$$

We equip \(L^p(\Omega ,Y)\) with a distance given by

$$\begin{aligned} d^p_{L^p}(u,v):=\int _\Omega d^p_Y\big (u(x),v(x)\big )d{\mathrm{vol}}(x), \qquad \forall \ u,v\in L^p(\Omega ,Y). \end{aligned}$$

Denote by \(C_0(\Omega )\) the set of continuous functions compactly supported on \(\Omega \). Given \( p\in [1,\infty )\) and a map \(u\in L^p(\Omega ,Y),\) for each \(\epsilon >0\), the approximating energy \(E^u_{p,\epsilon }\) is defined as a functional on \(C_0(\Omega )\):

$$\begin{aligned} E^u_{p,\epsilon }(\phi ):=\int _\Omega \phi (x) e^u_{p,\epsilon }(x)d{\mathrm{vol}}(x) \end{aligned}$$

where \(\phi \in C_0(\Omega )\) and \(e^u_{p,\epsilon }\) is the approximating energy density defined by

$$\begin{aligned} e^u_{p,\epsilon }(x):=\frac{n+p}{c_{n,p}\cdot \epsilon ^n}\int _{B_x(\epsilon )\cap \Omega }\frac{d^p_Y\big (u(x),u(y)\big )}{\epsilon ^p}d{\mathrm{vol}}(y), \end{aligned}$$

where the constant \(c_{n,p}=\int _{{\mathbb {S}}^{n-1}}|x^1|^p\sigma (dx),\) and \(\sigma \) is the canonical Riemannian volume on \({\mathbb {S}}^{n-1}\). In particular, \(c_{n,2}=\omega _{n-1}/n\), where \(\omega _{n-1}\) is the volume of the \((n-1)\)-sphere \({\mathbb {S}}^{n-1}\) with the standard metric.
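The value of \(c_{n,2}\) follows from a short symmetry argument, sketched here for convenience. The coordinates \(x^1,\ldots ,x^n\) play symmetric roles on \({\mathbb {S}}^{n-1}\), and \(\sum _{i=1}^n|x^i|^2=1\) there, so

$$\begin{aligned} n\cdot c_{n,2}=\sum _{i=1}^n\int _{{\mathbb {S}}^{n-1}}|x^i|^2\sigma (dx)=\int _{{\mathbb {S}}^{n-1}}1\,\sigma (dx)=\omega _{n-1}. \end{aligned}$$

For instance, for \(n=2\) this gives \(c_{2,2}=\int _0^{2\pi }\cos ^2\theta \,d\theta =\pi =\omega _1/2\).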

Let \( p\in [1,\infty )\) and \(u\in L^p(\Omega ,Y).\) Given any \(\phi \in C_0(\Omega )\), it is easy to check that, for any sufficiently small \(\epsilon >0 \) (for example, \(10\epsilon <d(\partial \Omega ,\mathrm{supp}\phi )\)), the approximating energy \(E^u_{p,\epsilon }(\phi )\) coincides, up to a constant, with the one defined by Kuwae and Shioya [37] (see Footnote 4), that is,

$$\begin{aligned} \widetilde{E}^u_{p,\epsilon }(\phi ):=\frac{n}{2\omega _{n-1}\epsilon ^n}\int _\Omega \phi (x)\int _{B_x(\epsilon )\cap \Omega } \frac{d^p_Y(u(x),u(y))}{\epsilon ^p}\cdot I_{Q(\Omega )}(x,y)d{\mathrm{vol}}(y)d{\mathrm{vol}}(x), \end{aligned}$$

where

$$\begin{aligned} Q(\Omega ):=\big \{(x,y)\in \Omega \times \Omega :\ |xy|<|\gamma _{xy},\partial \Omega |,\ \ \forall \mathrm{geodesic} \ \gamma _{xy} \ \mathrm{\ from}\ x\ \mathrm{to}\ y\big \}, \end{aligned}$$

and \(I_{Q(\Omega )}(x,y)\) is the indicator function of the set \(Q(\Omega )\). It is proved in [37] that, for each \(\phi \in C_0(\Omega )\), the limit

$$\begin{aligned} E^u_{p}(\phi ):=\lim _{\epsilon \rightarrow 0^+}E^u_{p,\epsilon }(\phi ) \end{aligned}$$

exists. The limit functional \(E^u_{p}\) is called the energy functional.
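The normalizing factor \((n+p)/c_{n,p}\) is chosen so that the limit recovers \(|\nabla u|^p\) in the smooth case. As a heuristic sketch, assume (for illustration only) that \(\Omega \) is a domain in \({\mathbb {R}}^n\), that \(Y={\mathbb {R}}\), and that u is \(C^1\) near an interior point x. Replacing \(u(y)-u(x)\) by its linearization \(\langle \nabla u(x),y-x\rangle \), which changes the integral below only by \(o(\epsilon ^{n+p})\), and using polar coordinates \(y=x+r\theta \) together with the rotation invariance of \(\sigma \), we get

$$\begin{aligned} \int _{B_x(\epsilon )}\big |\langle \nabla u(x),y-x\rangle \big |^pdy=|\nabla u(x)|^p\int _0^\epsilon r^{p+n-1}dr\int _{{\mathbb {S}}^{n-1}}|\theta ^1|^p\sigma (d\theta )=\frac{c_{n,p}}{n+p}\cdot |\nabla u(x)|^p\cdot \epsilon ^{n+p}. \end{aligned}$$

Multiplying by \(\frac{n+p}{c_{n,p}\cdot \epsilon ^{n+p}}\) shows that \(e^u_{p,\epsilon }(x)\rightarrow |\nabla u(x)|^p\) as \(\epsilon \rightarrow 0\) in this model situation.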

Now the \(p^{th}\) order Sobolev space from \(\Omega \) into Y is defined by

$$\begin{aligned} W^{1,p}(\Omega ,Y):={\mathscr {D}}(E^u_p):=\left\{ u\in L^p(\Omega ,Y)|\ \sup _{0\leqslant \phi \leqslant 1,\ \phi \in C_0(\Omega )}E^u_p(\phi )<\infty \right\} , \end{aligned}$$

and the \(p^{th}\) order energy of u is

$$\begin{aligned} E^u_p:=\sup _{0\leqslant \phi \leqslant 1,\ \phi \in C_0(\Omega )}E^u_p(\phi ). \end{aligned}$$

In the following proposition, we will collect some results in [37].

Proposition 4.1

(Kuwae–Shioya [37]) Let \(1<p<\infty \) and \(u\in W^{1,p}(\Omega , Y)\). Then the following assertions (1)–(5) hold.

  1. 1.

    (Contraction property, Lemma 3.3 in [37]) For any other complete metric space \((Z,d_Z)\) and any Lipschitz map \(\psi : Y\rightarrow Z\), we have \(\psi \circ u\in W^{1,p}(\Omega ,Z)\) and

    $$\begin{aligned} E_p^{\psi \circ u}(\phi )\leqslant {\mathbf{Lip}}^p(\psi ) E_p^u(\phi ) \end{aligned}$$

    for any \(0\leqslant \phi \in C_0(\Omega )\), where

    $$\begin{aligned} {\mathbf{Lip}}(\psi ):=\sup _{y,y'\in Y,\ y\not =y'}\frac{d_Z(\psi (y),\psi (y'))}{d_Y(y,y')}. \end{aligned}$$

    In particular, for any point \(Q\in Y\), we have \(d_Y\big (Q,u(\cdot )\big )\in W^{1,p}(\Omega , {\mathbb {R}})\) and

    $$\begin{aligned} E_p^{d_Y(Q,u(\cdot ))}(\phi )\leqslant E_p^u(\phi ) \end{aligned}$$

    for any \(0\leqslant \phi \in C_0(\Omega )\).

  2. 2.

    (Lower semi-continuity, Theorem 3.2 in [37]) For any sequence \(u_j\rightarrow u\) in \(L^p(\Omega , Y)\) as \(j\rightarrow \infty \), we have

    $$\begin{aligned} E^u_p(\phi )\leqslant \liminf _{j\rightarrow \infty } E^{u_j}_p(\phi ) \end{aligned}$$

    for any \(0\leqslant \phi \in C_0(\Omega ).\)

  3. 3.

    (Energy measure, Theorem 4.1 and Proposition 4.1 in [37]) There exists a finite Borel measure on \(\Omega \), denoted by \(E^u_p\) again and called the energy measure of u, such that for any \(0\leqslant \phi \in C_0(\Omega )\)

    $$\begin{aligned} E^u_p(\phi )= \int _\Omega \phi (x)dE^u_p(x). \end{aligned}$$

    Furthermore, the measure is strongly local. That is, for any nonempty open subset \(O\subset \Omega \), we have \(u|_O\in W^{1,p}(O,Y)\), and moreover, if u is a constant map almost everywhere on O, then \(E^u_p(O)=0.\)

  4. 4.

    (Weak Poincaré inequality, Theorem 4.2(ii) in [37]) For any open set \(O=B_q(R)\) with \(B_q(6R)\subset \subset \Omega \), there exists a positive constant \(C=C(n,k,R)\) such that the following holds: for any \(z\in O\) and any \(0<r<R/2\), we have

    $$\begin{aligned} \int _{B_z(r)}\int _{B_z(r)}d_Y^p\big (u(x),u(y)\big )d{\mathrm{vol}}(x)d{\mathrm{vol}}(y)\leqslant Cr^{n+2}\cdot \int _{B_z(6r)}dE^u_p(x), \end{aligned}$$

    where the constant C given on page 61 of [37] depends only on the constants R, \(\vartheta ,\) and \(\Theta \) in the Definition 2.1 for WMCPBG condition in [37]. In particular, for the case of Alexandrov spaces as shown in the proof of Theorem 2.1 in [37], one can choose \(R>0\) arbitrarily, \(\vartheta =1\) and \(\Theta =\sup _{0<r<R}\frac{{\mathrm{vol}}(B_o(r)\subset {\mathbb {M}}^n_k)}{{\mathrm{vol}}(B_o(r)\subset {\mathbb {R}}^n)}=C(n,k,R)\).

  5. 5.

    (Equivalence for \(Y={\mathbb {R}}\), Theorem 6.2 in [37]) If \(Y={\mathbb {R}}\), the above Sobolev space \(W^{1,p}(\Omega ,{\mathbb {R}})\) is equivalent to the Sobolev space \(W^{1,p}(\Omega )\) given in Sect. 3. To be precise: for any \(u\in W^{1,p}(\Omega ,{\mathbb {R}})\), the energy measure of u is absolutely continuous with respect to \({\mathrm{vol}}\) and

    $$\begin{aligned} \frac{dE^u_p}{d{\mathrm{vol}}}(x)= |\nabla u(x)|^p. \end{aligned}$$

Remark 4.2

It is not clear whether the energy measure of \(u\in W^{1,p}(\Omega ,Y)\) is absolutely continuous with respect to the Hausdorff measure \({\mathrm{vol}}\) on \(\Omega \). If \(\Omega \) is a domain in a Lipschitz Riemannian manifold, the absolute continuity has been proved by G. Gregori in [16] (see also Korevaar–Schoen [33] for the case where \(\Omega \) is a domain in a \(C^2\) Riemannian manifold).

Let \(p>1\) and let \(u\in W^{1,p}(\Omega ,Y)\) be a map with energy measure \(E^u_p\). Fix any sufficiently small positive number \(\delta \) with \(0<\delta <\delta _{n,k}\), with \(\delta _{n,k}\) as in Fact 2.4 in Sect. 2.3. Then the set

$$\begin{aligned} \Omega ^\delta :=\Omega \cap M^\delta :=\big \{x\in \Omega :\ {\mathrm{vol}}(\Sigma _x)>(1-\delta ){\mathrm{vol}}({\mathbb {S}}^{n-1})\big \} \end{aligned}$$

is an open subset in \(\Omega \) and forms a Lipschitz manifold. Since the singular set of M has (Hausdorff) codimension at least two [5], we have \({\mathrm{vol}}(\Omega \backslash \Omega ^\delta )=0.\) Hence, by the strongly local property of the measure \(E^u_p\), we have \(u\in W^{1,p}(\Omega ^\delta , Y)\) and its energy measure is \(E^u_p|_{\Omega ^\delta }\). Since \(\Omega ^\delta \) is a Lipschitz manifold, according to Gregori in [16], we obtain that the energy measure \(E^u_p|_{\Omega ^\delta }\) is absolutely continuous with respect to \({\mathrm{vol}}\). Denote its density by \(|\nabla u|_p\) (we write \(|\nabla u|_p\) instead of \(|\nabla u|^p\) because this quantity does not in general behave like a \(p^{th}\) power, see [33]). Considering the Lebesgue decomposition of \(E^u_p\) with respect to \({\mathrm{vol}}\) on \(\Omega \),

$$\begin{aligned} E^u_p=|\nabla u|_p\cdot {\mathrm{vol}}+\left( E^u_p\right) ^s, \end{aligned}$$

we have that the support of the singular part \((E^u_p)^s\) is contained in \(\Omega \backslash \Omega ^\delta .\)

Clearly, the energy density \(|\nabla u|_p\) is the weak limit (limit as measures) of the approximating energy density \(e^u_{p,\epsilon }\) as \(\epsilon \rightarrow 0\) on \(\Omega ^\delta \). We now show that \(e^u_{p,\epsilon }\) converges to \(|\nabla u|_p\) in \(L^1_{\mathrm{loc}}(\Omega )\) up to an arbitrarily small error, in the following sense.

Lemma 4.3

Let \(p>1\) and \(u\in W^{1,p}(\Omega ,Y)\). Fix any sufficiently small \(\delta >0\) with \(0<\delta <\delta _{n,k}\), with \(\delta _{n,k}\) as in Fact 2.4 in Sect. 2.3. Then, for any open subset \(B\subset \subset \Omega ^\delta \), there exists a constant \({\overline{\epsilon }}={\overline{\epsilon }}(\delta ,B)\) such that, for any \(0<\epsilon <{\overline{\epsilon }}(\delta ,B)\), we have

$$\begin{aligned} \int _B \big |e^u_{p,\epsilon }(x)-|\nabla u|_p(x)\big |d{\mathrm{vol}}(x)\leqslant {\overline{\kappa }}(\delta ), \end{aligned}$$

where \({\overline{\kappa }}(\delta )\) is a positive function (depending only on \(\delta \)) with \(\lim _{\delta \rightarrow 0}{\overline{\kappa }}(\delta )=0\).

Proof

Fix any sufficiently small \(\delta >0\) and any open set B as in the assumption. By applying Lemma 2.6, there exists some neighborhood \(U_\delta \supset {\overline{B}}\) and a smooth Riemannian metric \(g_\delta \) on \(U_\delta \) such that the distance \(d_\delta \) on \(U_\delta \) induced from \(g_\delta \) satisfies

$$\begin{aligned} \bigg |\frac{d_\delta (x,y)}{|xy|}-1\bigg |\leqslant \kappa _1(\delta ) \quad \mathrm{for\ any}\ \ x,y\in U_\delta ,\ x\not =y, \end{aligned}$$

where \(\kappa _1(\delta )\) is a positive function (depending only on \(\delta \)) with \(\lim _{\delta \rightarrow 0}\kappa _1(\delta )=0.\) This implies that

$$\begin{aligned} B^\delta _x\big ( r\cdot (1-\kappa _1(\delta ))\big )\subset B_x(r)\subset B^\delta _x\big (r\cdot (1+\kappa _1(\delta ))\big ) \end{aligned}$$
(4.1)

for any \(x\in U_\delta \) and \(r>0\) with the ball \(B^\delta _x\big ((1+\kappa _1(\delta ))r\big )\subset U_\delta \) and

$$\begin{aligned} 1-\kappa ^n_1(\delta )\leqslant \frac{d{\mathrm{vol}}_\delta (x)}{d{\mathrm{vol}}(x)}\leqslant 1+\kappa ^n_1(\delta )\qquad \forall \ x\in U_\delta , \end{aligned}$$
(4.2)

where \(B_x^\delta (r)\) denotes the geodesic ball with center x and radius r with respect to the metric \(g_\delta \), and \({\mathrm{vol}}_{\delta }\) is the n-dimensional Riemannian volume on \(U_\delta \) induced from the metric \(g_\delta \).

(i). Uniform approximation by the smooth metric \(g_\delta \).

For any \(\epsilon >0\), we denote the energy density and the approximating energy density of u on \((U_\delta ,g_\delta )\) with respect to the smooth Riemannian metric \(g_\delta \) by \(|\nabla u|_{p,g_\delta }\) and \(e^u_{p,\epsilon ,g_\delta }\), respectively.

Sublemma 4.4

We have, for any \(x\in U_\delta \) and any \(\epsilon >0\) with \(B_x(10\epsilon )\subset U_\delta \),

$$\begin{aligned} \big |e^u_{p,\epsilon }(x)-e^u_{p,\epsilon ,g_\delta }(x)\big |&\leqslant \kappa _4(\delta ) \cdot e^u_{p,2\epsilon }(x)+\big |e^u_{p,\epsilon (1+\kappa _1(\delta )),g_\delta }(x)-e^u_{p,\epsilon ,g_\delta }(x)\big | \nonumber \\&\quad +\big |e^u_{p,\epsilon ,g_\delta }(x)-e^u_{p,\epsilon (1-\kappa _1(\delta )),g_\delta }(x)\big |, \end{aligned}$$
(4.3)

where \(\kappa _4(\delta )\) is a positive function (depending only on \(\delta \)) with \(\lim _{\delta \rightarrow 0}\kappa _4(\delta )=0.\)

Proof

For each \(x\in U_\delta \) and \(\epsilon >0\) with \(B_x(10\epsilon )\subset U_\delta \), by applying Eqs. (4.1)–(4.2) and setting

$$\begin{aligned} f(y):=(n+p)\cdot c^{-1}_{n,p}\cdot d^p_Y\big (u(x),u(y)\big ) , \end{aligned}$$

we have, from the definition of approximating energy density,

$$\begin{aligned} \begin{aligned} e^u_{p,\epsilon }(x)&=\int _{B_x(\epsilon )\cap \Omega } \frac{f}{\epsilon ^{n+p}}d{\mathrm{vol}}(y)\\&\leqslant \big (1-\kappa ^n_1(\delta )\big )^{-1}\cdot \int _{B^{\delta }_x\big (\epsilon \cdot (1+\kappa _1(\delta ))\big )}\frac{f}{\epsilon ^{n+p}}d{\mathrm{vol}}_{\delta }(y)\\&= \big (1-\kappa ^n_1(\delta )\big )^{-1}\cdot (1+\kappa _1(\delta ))^{n+p}\cdot e^u_{p,\epsilon \cdot (1+\kappa _1(\delta )),g_\delta }(x)\\&:=\big (1+\kappa _2(\delta )\big )\cdot e^u_{p,\epsilon \cdot (1+\kappa _1(\delta )),g_\delta }(x). \end{aligned} \end{aligned}$$
(4.4)

Similarly, we have

$$\begin{aligned} \begin{aligned} e^u_{p,\epsilon }(x)&\geqslant \big (1+\kappa ^n_1(\delta )\big )^{-1}\cdot (1-\kappa _1(\delta ))^{n+p}\cdot e^u_{p,\epsilon \cdot (1-\kappa _1(\delta )),g_\delta }(x)\\&:=\big (1-\kappa _3(\delta )\big )\cdot e^u_{p,\epsilon \cdot (1-\kappa _1(\delta )),g_\delta }(x). \end{aligned} \end{aligned}$$
(4.5)

Thus

$$\begin{aligned} \begin{aligned}&\big |e^u_{p,\epsilon }(x)-e^u_{p,\epsilon ,g_\delta }(x)\big |\\&\quad \leqslant \max \left\{ \begin{array}{c} \kappa _2(\delta ) \cdot e^u_{p,\epsilon (1+\kappa _1(\delta )),g_\delta }(x)+\left| e^u_{p,\epsilon (1+\kappa _1(\delta )),g_\delta }(x)-e^u_{p,\epsilon ,g_\delta }(x)\right| , \\ \kappa _3(\delta ) \cdot e^u_{p,\epsilon (1-\kappa _1(\delta )),g_\delta }(x)+\left| e^u_{p,\epsilon ,g_\delta }(x)-e^u_{p,\epsilon (1-\kappa _1(\delta )),g_\delta }(x)\right| \end{array}\right\} . \end{aligned}\nonumber \\ \end{aligned}$$
(4.6)

Without loss of generality, we can assume that \(\kappa _1(\delta )<1/3\) for any sufficiently small \(\delta \). Then, from (4.5) and the definition of the approximating energy density,

$$\begin{aligned} \begin{aligned} e^u_{p,\epsilon (1+\kappa _1(\delta )),g_\delta }(x)&\leqslant \big (1-\kappa _3(\delta )\big )^{-1}\cdot e^u_{p,\epsilon (\frac{1+\kappa _1(\delta )}{1-\kappa _1(\delta )})}(x)\\&\leqslant \big (1-\kappa _3(\delta )\big )^{-1}\cdot \Big [2\cdot \frac{1-\kappa _1(\delta )}{1+\kappa _1(\delta )}\Big ]^{n+p}\cdot e^u_{p,2\epsilon }(x)\\&\leqslant \big (1-\kappa _3(\delta )\big )^{-1}\cdot 2^{n+p}\cdot e^u_{p,2\epsilon }(x)\\ \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} e^u_{p,\epsilon (1-\kappa _1(\delta )),g_\delta }(x)&\leqslant \big (1-\kappa _3(\delta )\big )^{-1}\cdot e^u_{p,\epsilon }(x)\\&\leqslant \big (1-\kappa _3(\delta )\big )^{-1}\cdot 2^{n+p}\cdot e^u_{p,2\epsilon }(x). \end{aligned} \end{aligned}$$

By substituting the above two inequalities in Eq. (4.6), we obtain

$$\begin{aligned} \begin{aligned} \big |e^u_{p,\epsilon }(x)-e^u_{p,\epsilon ,g_\delta }(x)\big |&\leqslant \kappa _4(\delta ) \cdot e^u_{p,2\epsilon }(x)+\left| e^u_{p,\epsilon (1+\kappa _1(\delta )),g_\delta }(x)-e^u_{p,\epsilon ,g_\delta }(x)\right| \\&\quad +\left| e^u_{p,\epsilon ,g_\delta }(x)-e^u_{p,\epsilon (1-\kappa _1(\delta )),g_\delta }(x)\right| , \end{aligned} \end{aligned}$$

where the function \(\kappa _4(\delta ):=\big (1-\kappa _3(\delta )\big )^{-1}\cdot 2^{n+p}\cdot \max \{\kappa _2(\delta ),\kappa _3(\delta )\}.\) The proof of the Sublemma is finished. \(\square \)
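Two elementary steps are used implicitly in the proof above; we record them here as a sketch. First, (4.6) follows from (4.4) and (4.5) by adding and subtracting the relevant \(g_\delta \)-densities:

$$\begin{aligned} e^u_{p,\epsilon }(x)-e^u_{p,\epsilon ,g_\delta }(x)&\leqslant \kappa _2(\delta )\cdot e^u_{p,\epsilon (1+\kappa _1(\delta )),g_\delta }(x)+\big (e^u_{p,\epsilon (1+\kappa _1(\delta )),g_\delta }(x)-e^u_{p,\epsilon ,g_\delta }(x)\big ),\\ e^u_{p,\epsilon ,g_\delta }(x)-e^u_{p,\epsilon }(x)&\leqslant \kappa _3(\delta )\cdot e^u_{p,\epsilon (1-\kappa _1(\delta )),g_\delta }(x)+\big (e^u_{p,\epsilon ,g_\delta }(x)-e^u_{p,\epsilon (1-\kappa _1(\delta )),g_\delta }(x)\big ). \end{aligned}$$

Second, since \(B_x(r)\cap \Omega \subset B_x(s)\cap \Omega \) whenever \(0<r\leqslant s\), the definition of the approximating energy density gives the scaling comparison

$$\begin{aligned} e^u_{p,r}(x)=\frac{n+p}{c_{n,p}\cdot r^{n+p}}\int _{B_x(r)\cap \Omega }d^p_Y\big (u(x),u(y)\big )d{\mathrm{vol}}(y)\leqslant \Big (\frac{s}{r}\Big )^{n+p}\cdot e^u_{p,s}(x). \end{aligned}$$

With \(r=\epsilon \cdot \frac{1+\kappa _1(\delta )}{1-\kappa _1(\delta )}\) and \(s=2\epsilon \), the condition \(r\leqslant s\) holds precisely when \(\kappa _1(\delta )\leqslant 1/3\), and the factor is \(\big [2\cdot \frac{1-\kappa _1(\delta )}{1+\kappa _1(\delta )}\big ]^{n+p}\leqslant 2^{n+p}\). With \(r=\epsilon \) and \(s=2\epsilon \), the factor is exactly \(2^{n+p}\).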

(ii). Uniform estimate for the integral

$$\begin{aligned} \int _B\big |e^u_{p,\epsilon }(x)-e^u_{p,\epsilon ,g_\delta }(x)\big |d{\mathrm{vol}}(x). \end{aligned}$$

To deal with this integral, we need to estimate integrals of the right hand side in Eq. (4.3).

Since the metric \(g_\delta \) is smooth on \(U_\delta \), the following assertion applies; it is summarized in [16] and essentially proved in [52]. See the paragraph between Lemma 1 and Lemma 2 on page 3 of [16].

Fact 4.5

The approximating energy densities satisfy

$$\begin{aligned} \lim _{\epsilon \rightarrow 0}e^u_{p,\epsilon ,g_\delta }=|\nabla u|_{p,g_\delta }\qquad \mathrm{in}\quad L^1_{\mathrm{loc}}(U_\delta ,g_\delta ). \end{aligned}$$

Now let us continue the proof of this Lemma.

Since the set \(B\subset \subset U_\delta \), from the above Fact 4.5, there exists a constant \(\epsilon _1=\epsilon _1(\delta ,B)\) such that for any \(0<\epsilon <\epsilon _1\), we have

$$\begin{aligned} \int _B\big ||\nabla u|_{p,g_\delta }(x)-e^u_{p,\epsilon ,g_\delta }(x)\big |d{\mathrm{vol}}_{\delta }\leqslant \delta . \end{aligned}$$

Hence, by using Eq. (4.2),

$$\begin{aligned} \int _B\big ||\nabla u|_{p,g_\delta }(x)-e^u_{p,\epsilon ,g_\delta }(x)\big |d{\mathrm{vol}}\leqslant \delta \cdot \big (1+\kappa _1^n(\delta )\big ):=\kappa _5(\delta ). \end{aligned}$$
(4.7)

The triangle inequality then implies that, for any number \(\epsilon \) with \(0<\epsilon <\frac{\epsilon _1}{1+\kappa _1(\delta )},\)

$$\begin{aligned} \int _B\big |e^u_{p,\epsilon (1+\kappa _1(\delta )),g_\delta }(x)-e^u_{p,\epsilon ,g_\delta }(x)\big |d{\mathrm{vol}}(x) \leqslant 2\kappa _5(\delta ) \end{aligned}$$
(4.8)

and

$$\begin{aligned} \int _B\big |e^u_{p,\epsilon ,g_\delta }(x)-e^u_{p,\epsilon (1-\kappa _1(\delta )),g_\delta }(x)\big |d{\mathrm{vol}}(x) \leqslant 2\kappa _5(\delta ). \end{aligned}$$
(4.9)
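For the reader's convenience, the triangle-inequality step behind (4.8) can be written out as follows; (4.9) is analogous, with \(1+\kappa _1(\delta )\) replaced by \(1-\kappa _1(\delta )\). Since both \(\epsilon \) and \(\epsilon (1+\kappa _1(\delta ))\) are smaller than \(\epsilon _1\), the estimate (4.7) applies at both scales, and hence

$$\begin{aligned} \int _B\big |e^u_{p,\epsilon (1+\kappa _1(\delta )),g_\delta }-e^u_{p,\epsilon ,g_\delta }\big |d{\mathrm{vol}}&\leqslant \int _B\big |e^u_{p,\epsilon (1+\kappa _1(\delta )),g_\delta }-|\nabla u|_{p,g_\delta }\big |d{\mathrm{vol}}+\int _B\big ||\nabla u|_{p,g_\delta }-e^u_{p,\epsilon ,g_\delta }\big |d{\mathrm{vol}}\\&\leqslant \kappa _5(\delta )+\kappa _5(\delta )=2\kappa _5(\delta ). \end{aligned}$$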

By using Lemma 3 in [16] (more precisely, the equation (35) in [16]), for any \(\phi \in C_0(U_\delta )\) and any \( \gamma >0\), there exists a constant \(\epsilon _2=\epsilon _2(\gamma ,\phi )\) such that the following estimate holds for any \(0<\epsilon <\epsilon _2\):

$$\begin{aligned} E^u_{p,\epsilon }(\phi )\leqslant E^u_p(\phi )+C\gamma , \end{aligned}$$

where C is a constant independent of \(\gamma \) and \(\epsilon \). Now, since \(B\subset \subset U_\delta \), there exists \(\varphi \in C_0(U_\delta )\ (\subset C_0(\Omega ))\) with \(\varphi |_B=1\) and \(0\leqslant \varphi \leqslant 1\) on \(U_\delta \). Fix such a function \(\varphi \) and a constant \(\gamma _1>0\) with \(C\gamma _1\leqslant 1\). Then for any \(0<\epsilon <\epsilon _3:=\min \{\epsilon _2(\gamma _1,\varphi )/2,\mathrm{dist}(\mathrm{supp}\,\varphi ,\partial U_\delta )/10\}\), we have

$$\begin{aligned} \begin{aligned} \int _B e^u_{p,2\epsilon }(x)d{\mathrm{vol}}&\leqslant \int _{U_\delta }\varphi (x)e^u_{p,2\epsilon }(x)d{\mathrm{vol}}\leqslant E^u_{p,2\epsilon }(\varphi ) \leqslant E^u_{p}(\varphi )+1\\&\leqslant E^u_p(\Omega )+1. \end{aligned} \end{aligned}$$
(4.10)

By integrating Eq. (4.3) on B with respect to \({\mathrm{vol}}\) and combining with Eqs. (4.8)–(4.10), we obtain that, for any \(0<\epsilon <\min \{\epsilon _3,\epsilon _1/\big (1+\kappa _1(\delta )\big )\}\),

$$\begin{aligned} \int _B\big |e^u_{p,\epsilon }(x)-e^u_{p,\epsilon ,g_\delta }(x)\big |d{\mathrm{vol}}(x)\leqslant \kappa _6(\delta ), \end{aligned}$$
(4.11)

where the positive function \(\kappa _6(\delta )=\kappa _4(\delta )\cdot \big (E^u_p(\Omega )+1\big )+4\kappa _5(\delta ).\)
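The constant \(\kappa _6(\delta )\) arises from the following bookkeeping, given here as a sketch of the integration just described. Integrating (4.3) over B and inserting (4.10), (4.8) and (4.9) for the three terms on the right-hand side, we get

$$\begin{aligned} \int _B\big |e^u_{p,\epsilon }-e^u_{p,\epsilon ,g_\delta }\big |d{\mathrm{vol}}&\leqslant \kappa _4(\delta )\int _Be^u_{p,2\epsilon }d{\mathrm{vol}}+\int _B\big |e^u_{p,\epsilon (1+\kappa _1(\delta )),g_\delta }-e^u_{p,\epsilon ,g_\delta }\big |d{\mathrm{vol}}+\int _B\big |e^u_{p,\epsilon ,g_\delta }-e^u_{p,\epsilon (1-\kappa _1(\delta )),g_\delta }\big |d{\mathrm{vol}}\\&\leqslant \kappa _4(\delta )\cdot \big (E^u_p(\Omega )+1\big )+2\kappa _5(\delta )+2\kappa _5(\delta ). \end{aligned}$$

Since \(E^u_p(\Omega )\) is finite and does not depend on \(\delta \), we have \(\kappa _6(\delta )\rightarrow 0\) as \(\delta \rightarrow 0\).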

(iii). Uniform estimate for the desired integral

$$\begin{aligned} \int _B \big |e^u_{p,\epsilon }(x)-|\nabla u|_p(x)\big |d{\mathrm{vol}}(x). \end{aligned}$$

According to Eqs. (4.7) and (4.11), we have, for any sufficiently small \(\epsilon >0\),

$$\begin{aligned} \begin{aligned}&\int _B\big |e^u_{p,\epsilon }(x)-|\nabla u|_p(x)\big |d{\mathrm{vol}}(x)\\&\quad \leqslant \int _B\big |e^u_{p,\epsilon }(x)-e^u_{p,\epsilon ,g_\delta }(x)\big |d{\mathrm{vol}}(x)\\&\qquad +\int _B\big |e^u_{p,\epsilon ,g_\delta }(x)-|\nabla u|_{p,g_\delta }(x)\big |d{\mathrm{vol}}(x)\\&\qquad +\int _B\big ||\nabla u|_{p,g_\delta }(x)-|\nabla u|_p(x)\big |d{\mathrm{vol}}(x)\\&\quad \leqslant \ \kappa _6(\delta )+\kappa _5(\delta )+\int _B\big ||\nabla u|_{p,g_\delta }(x)-|\nabla u|_p(x)\big |d{\mathrm{vol}}(x). \end{aligned} \end{aligned}$$
(4.12)

To estimate the desired integral, we need only control the last term in the above inequality. This follows from combining the uniform estimate (4.11) with Fact 4.5. We give the argument in detail as follows.

By Eq. (4.2), for any \(\phi \in C_0(U_\delta )\) we have

$$\begin{aligned} \begin{aligned}&\Big |\int _{U_\delta }\phi (x)\cdot \big ( e^u_{p,\epsilon ,g_\delta }-|\nabla u|_{p,g_\delta }\big )d{\mathrm{vol}}(x)\Big |\\&\quad \leqslant \max |\phi |\cdot \int _{W}\big | e^u_{p,\epsilon ,g_\delta }-|\nabla u|_{p,g_\delta }\big |d{\mathrm{vol}}(x)\\&\quad \leqslant \max |\phi |\cdot \int _{W}\big | e^u_{p,\epsilon ,g_\delta }-|\nabla u|_{p,g_\delta }\big |d{\mathrm{vol}}_\delta (x)\cdot \left( 1+\kappa _1^n(\delta )\right) , \end{aligned} \end{aligned}$$

where W is the support of \(\phi .\) Letting \(\epsilon \rightarrow 0\) and using Fact 4.5, we obtain the weak convergence of measures

$$\begin{aligned} e^u_{p,\epsilon ,g_\delta }\cdot {\mathrm{vol}}\overset{w}{\rightharpoonup }|\nabla u|_{p,g_\delta }\cdot {\mathrm{vol}}. \end{aligned}$$

Combining with the fact \( e^u_{p,\epsilon }\cdot {\mathrm{vol}}\overset{w}{\rightharpoonup }|\nabla u|_{p}\cdot {\mathrm{vol}},\) we have

$$\begin{aligned} \big (e^u_{p,\epsilon }-e^u_{p,\epsilon ,g_\delta }\big )\cdot {\mathrm{vol}}\overset{w}{\rightharpoonup }\big (|\nabla u|_{p}-|\nabla u|_{p,g_\delta }\big )\cdot {\mathrm{vol}}. \end{aligned}$$

By applying the estimate (4.11) and the lower semi-continuity of the \(L^1\)-norm with respect to weak convergence of measures, we have

$$\begin{aligned} \int _B\big ||\nabla u|_{p}-|\nabla u|_{p,g_\delta }\big |d{\mathrm{vol}}\leqslant \liminf _{\epsilon \rightarrow 0}\int _B\big |e^u_{p,\epsilon }-e^u_{p,\epsilon ,g_\delta }\big |d{\mathrm{vol}}\leqslant \kappa _6(\delta ). \end{aligned}$$

By substituting the estimate into Eq. (4.12), we get

$$\begin{aligned} \int _B\big |e^u_{p,\epsilon }(x)-|\nabla u|_p(x)\big |d{\mathrm{vol}}(x)\leqslant \kappa _5(\delta )+2\kappa _6(\delta ):={\overline{\kappa }}(\delta ). \end{aligned}$$

This completes the proof of the lemma. \(\square \)
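For orientation, the auxiliary functions introduced in the proof can be collected in one place. The following display is only a summary of definitions already made above:

$$\begin{aligned} \kappa _2(\delta )&:=\big (1-\kappa ^n_1(\delta )\big )^{-1}\cdot (1+\kappa _1(\delta ))^{n+p}-1,\qquad \kappa _3(\delta ):=1-\big (1+\kappa ^n_1(\delta )\big )^{-1}\cdot (1-\kappa _1(\delta ))^{n+p},\\ \kappa _4(\delta )&:=\big (1-\kappa _3(\delta )\big )^{-1}\cdot 2^{n+p}\cdot \max \{\kappa _2(\delta ),\kappa _3(\delta )\},\qquad \kappa _5(\delta ):=\delta \cdot \big (1+\kappa _1^n(\delta )\big ),\\ \kappa _6(\delta )&:=\kappa _4(\delta )\cdot \big (E^u_p(\Omega )+1\big )+4\kappa _5(\delta ),\qquad {\overline{\kappa }}(\delta ):=\kappa _5(\delta )+2\kappa _6(\delta ). \end{aligned}$$

Each of these tends to 0 as \(\delta \rightarrow 0\) because \(\kappa _1(\delta )\rightarrow 0\), which gives \(\lim _{\delta \rightarrow 0}{\overline{\kappa }}(\delta )=0\) as claimed in the lemma.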

Corollary 4.6

Let \(p>1\) and \(u\in W^{1,p}(\Omega ,Y)\). Then, for any sequence of numbers \(\{\epsilon _j\}_{j=1}^\infty \) converging to 0, there exists a subsequence \(\{\varepsilon _j\}_j\subset \{\epsilon _j\}_j\) such that, for almost every \(x\in \Omega \),

$$\begin{aligned} \lim _{\varepsilon _j\rightarrow 0} e^u_{p,\varepsilon _j}(x)=|\nabla u|_p(x). \end{aligned}$$

Proof

Take any sequence \(\{\delta _j\}_j\) going to 0, and let \(\{B_j\}_j\) be a sequence of open sets such that, for each \(j\in {\mathbb {N}}\),

$$\begin{aligned} B_j\subset \subset \Omega ^{\delta _j} \quad \mathrm{and}\quad {\mathrm{vol}}\left( \Omega ^{\delta _j}\backslash B_j\right) \leqslant \delta _j. \end{aligned}$$

Since the sequence \(\{\epsilon _j\}_j\) tends to 0, we can choose a subsequence \(\{\varepsilon _j\}_j\) of \(\{\epsilon _j\}_j\) such that, for each \(j\in {\mathbb {N}}\), \(\varepsilon _j<{\overline{\epsilon }}(\delta _j,B_j)\), which is the constant given in Lemma 4.3. Hence, we have

$$\begin{aligned} \int _{B_j}\big |e^u_{p,\varepsilon _j}-|\nabla u|_p\big |d{\mathrm{vol}}\leqslant {\overline{\kappa }}(\delta _j),\qquad \forall \ j\in {\mathbb {N}}. \end{aligned}$$

For each \(j\in {\mathbb {N}}\), \({\mathrm{vol}}(\Omega \backslash \Omega ^{\delta _j})=0\). So the functions \(e^u_{p,\varepsilon _j}\) are measurable on \(\Omega \) for any \(j\in {\mathbb {N}}\). In the following, we will prove that the sequence

$$\begin{aligned} \{f_j:=e^u_{p,\varepsilon _j}\}_j \end{aligned}$$

converges to \(f:=|\nabla u|_p\) in measure on \(\Omega \). Namely, given any number \(\lambda >0\), we will prove

$$\begin{aligned} \lim _{j\rightarrow \infty }{\mathrm{vol}}\big \{x\in \Omega :\ |f_j(x)-f(x)|\geqslant \lambda \big \}=0. \end{aligned}$$

Fix any \(\lambda >0\) and consider the sets

$$\begin{aligned} A_j(\lambda ):=\big \{x\in \Omega \backslash S_M:\ |f_j(x)-f(x)|\geqslant \lambda \big \}. \end{aligned}$$

Noting that \( S_M\) has zero measure (indeed, it has Hausdorff codimension at least two [5]), we need only to show

$$\begin{aligned} \lim _{j\rightarrow \infty }{\mathrm{vol}}\big (A_j(\lambda )\big )=0. \end{aligned}$$

By Chebyshev's inequality, we get

$$\begin{aligned} \lambda \cdot {\mathrm{vol}}\big (A_j(\lambda )\cap B_j\big )\leqslant \int _{A_j(\lambda )\cap B_j}|f_j-f|d{\mathrm{vol}}\leqslant \int _{B_j}|f_j-f|d{\mathrm{vol}}\leqslant {\overline{\kappa }}(\delta _j) \end{aligned}$$

for any \(j\in {\mathbb {N}}\). Thus, noting that \(A_j(\lambda )\subset \Omega \backslash S_M \subset \Omega ^{\delta _j}\) for each \(j\in {\mathbb {N}}\), we have

$$\begin{aligned} \begin{aligned} {\mathrm{vol}}\big (A_j(\lambda )\big )&\leqslant {\mathrm{vol}}\big (A_j(\lambda )\cap B_j\big )+ {\mathrm{vol}}\big (A_j(\lambda )\backslash B_j\big )\leqslant \frac{{\overline{\kappa }}(\delta _j)}{\lambda }+{\mathrm{vol}}\big (\Omega ^{\delta _j}\backslash B_j\big )\\&\leqslant \frac{{\overline{\kappa }}(\delta _j)}{\lambda }+\delta _j \end{aligned} \end{aligned}$$

for any \(j\in {\mathbb {N}}\). This implies that \(\lim _{j\rightarrow \infty }{\mathrm{vol}}\big (A_j(\lambda )\big )=0\), and hence, that \(\{f_j\}_j\) converges to f in measure.

Lastly, by the F. Riesz theorem, there exists a subsequence of \(\{\varepsilon _j\}_j\), denoted by \(\{\varepsilon _j\}_j\) again, such that the sequence \(\{e^u_{p,\varepsilon _j}\}_j\) converges to \(|\nabla u|_p\) almost everywhere in \(\Omega .\) \(\square \)

The above pointwise convergence yields the following mean value property, which will be used later.

Corollary 4.7

Let \(p>1\) and \(u\in W^{1,p}(\Omega ,Y)\). Then, for any sequence of numbers \(\{\epsilon _j\}_{j=1}^\infty \) converging to 0, there exists a subsequence \(\{\varepsilon _j\}_j\subset \{\epsilon _j\}_j\) such that for almost every \(x_0\in \Omega \), we have the following mean value property:

$$\begin{aligned} \int _{B_{x_0}(\varepsilon _j)}d^p_Y\big (u(x_0),u(x)\big )d{\mathrm{vol}}(x)=\frac{c_{n,p}}{n+p}|\nabla u|_p(x_0)\cdot \varepsilon _j^{n+p}+o\left( \varepsilon _j^{n+p}\right) .\nonumber \\ \end{aligned}$$
(4.13)

Proof

According to the previous Corollary 4.6, there exists a subsequence \(\{\varepsilon _j\}_j\subset \{\epsilon _j\}_j\) such that

$$\begin{aligned} \lim _{\varepsilon _j\rightarrow 0} e^u_{p,\varepsilon _j}(x_0)=|\nabla u|_p(x_0)\quad \mathrm{for\ almost\ all}\ x_0\in \Omega . \end{aligned}$$

Fix such a point \(x_0\). By the definition of approximating energy density, we get

$$\begin{aligned} \frac{n+p}{c_{n,p}\cdot \varepsilon _j^n}\int _{B_{x_0}(\varepsilon _j)}d^p_Y\big (u(x_0),u(x)\big )d{\mathrm{vol}}(x)=|\nabla u|_p(x_0)\cdot \varepsilon _j^p+o\left( \varepsilon _j^p\right) . \end{aligned}$$

Multiplying both sides by \(c_{n,p}\cdot \varepsilon _j^n/(n+p)\) gives (4.13). The proof is finished. \(\square \)

5 Pointwise Lipschitz constants

Let \(\Omega \) be a bounded domain of an Alexandrov space with curvature \(\geqslant k\) for some \(k\leqslant 0\). In this section, we will establish an estimate for the pointwise Lipschitz constants of harmonic maps from \(\Omega \) into a complete, non-positively curved metric space \((Y,d_Y)\).

Let us first review the concept of metric spaces with (global) non-positive curvature in the sense of Alexandrov.

5.1 NPC spaces

Definition 5.1

(see, for example, [3]) A geodesic space \((Y,d_Y)\) is said to have global non-positive curvature in the sense of Alexandrov, denoted by NPC, if the following comparison property holds: given any triangle \(\triangle PQR\subset Y\) and any point \(S\in QR\) with

$$\begin{aligned} d_Y(Q,S)=d_Y(R,S)= \frac{1}{2} d_Y(Q,R), \end{aligned}$$

there exists a comparison triangle \(\triangle {\bar{P}}{\bar{Q}}{\bar{R}}\) in the Euclidean plane \({\mathbb {R}}^2\) and a point \({\bar{S}}\in {\bar{Q}}{\bar{R}}\) with

$$\begin{aligned} |{\bar{Q}}{\bar{S}}|=|{\bar{R}}{\bar{S}}|=\frac{1}{2}|{\bar{Q}}{\bar{R}}| \end{aligned}$$

such that

$$\begin{aligned} d_Y(P,S)\leqslant |{\bar{P}}{\bar{S}}|. \end{aligned}$$

It is also called a CAT(0) space.
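For orientation, the comparison property can be rewritten purely in terms of distances in Y. The following is a standard computation added here as an illustration; it is not part of the definition above. In the Euclidean plane, the median from \({\bar{P}}\) to the midpoint \({\bar{S}}\) of \({\bar{Q}}{\bar{R}}\) satisfies the parallelogram identity, and the comparison triangle has the same side lengths as \(\triangle PQR\), so that

$$\begin{aligned} |{\bar{P}}{\bar{S}}|^2=\frac{1}{2}|{\bar{P}}{\bar{Q}}|^2+\frac{1}{2}|{\bar{P}}{\bar{R}}|^2-\frac{1}{4}|{\bar{Q}}{\bar{R}}|^2 \quad \Longrightarrow \quad d^2_Y(P,S)\leqslant \frac{1}{2}d^2_Y(P,Q)+\frac{1}{2}d^2_Y(P,R)-\frac{1}{4}d^2_Y(Q,R). \end{aligned}$$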

The following lemma is a special case of Corollary 2.1.3 in [33].

Lemma 5.2

Let \((Y,d_Y)\) be an NPC space. Take any ordered sequence \(\{P,Q,R,S\}\subset Y\), and let \(Q_{m}\) be the mid-point of QR. We abbreviate the distance \(d_Y(A,B)\) by \(d_{AB}.\) Then we have

$$\begin{aligned} \left( d_{PS}-d_{QR}\right) \cdot d_{QR}\geqslant \left( d^2_{PQ_{m}}-d^2_{PQ}-d^2_{Q_{m}Q}\right) +\left( d^2_{SQ_{m}}-d^2_{SR}-d^2_{Q_{m}R}\right) .\nonumber \\ \end{aligned}$$
(5.1)

Proof

Taking \(t=1/2\) and \(\alpha =1\) in Equation (2.1v) in Corollary 2.1.3 of [33], we get

$$\begin{aligned} d^2_{PQ_{m}}+d^2_{SQ_{m}}\leqslant d^2_{PQ}+d^2_{RS}-\frac{1}{2}d^2_{QR}+d_{PS}\cdot d_{QR}. \end{aligned}$$

Since

$$\begin{aligned} d_{QR}=2d_{Q_{m}Q}=2d_{Q_{m}R}, \end{aligned}$$

we have

$$\begin{aligned} d_{PS}\cdot d_{QR}-d^2_{QR}\geqslant \left( d^2_{PQ_{m}}-d^2_{PQ}-d^2_{Q_{m}Q}\right) +\left( d^2_{SQ_{m}}-d^2_{SR}-d^2_{Q_{m}R}\right) . \end{aligned}$$

This is Eq. (5.1). \(\square \)
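The passage from the inequality of [33] to (5.1) is a short rearrangement, written out here as a sketch. Since \(d_{Q_{m}Q}=d_{Q_{m}R}=\frac{1}{2}d_{QR}\), one has \(d^2_{Q_{m}Q}+d^2_{Q_{m}R}=\frac{1}{2}d^2_{QR}\), and therefore

$$\begin{aligned} \begin{aligned}&\left( d^2_{PQ_{m}}-d^2_{PQ}-d^2_{Q_{m}Q}\right) +\left( d^2_{SQ_{m}}-d^2_{SR}-d^2_{Q_{m}R}\right) \\&\quad =d^2_{PQ_{m}}+d^2_{SQ_{m}}-d^2_{PQ}-d^2_{SR}-\frac{1}{2}d^2_{QR}\\&\quad \leqslant \left( d^2_{PQ}+d^2_{RS}-\frac{1}{2}d^2_{QR}+d_{PS}\cdot d_{QR}\right) -d^2_{PQ}-d^2_{SR}-\frac{1}{2}d^2_{QR}\\&\quad =d_{PS}\cdot d_{QR}-d^2_{QR}. \end{aligned} \end{aligned}$$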

5.2 Harmonic maps

Let \(\Omega \) be a bounded domain in an Alexandrov space \((M,|\cdot ,\cdot |)\) and let Y be an NPC space. Given any \(\phi \in W^{1,2}(\Omega ,Y)\), we set

$$\begin{aligned} W^{1,2}_\phi (\Omega ,Y):=\big \{u\in W^{1,2}(\Omega ,Y):\ d_Y\big (u(x),\phi (x)\big )\in W^{1,2}_0(\Omega ,{\mathbb {R}})\big \}. \end{aligned}$$

By the variational method of [27, 39] (using the lower semi-continuity of the energy), there exists a unique \(u\in W^{1,2}_\phi (\Omega ,Y)\) which minimizes the energy \(E_2^u\). That is, the energy \(E_2^u:=E^u_2(\Omega )\) of u satisfies

$$\begin{aligned} E_2^u=\inf _w\big \{E_2^w:\ w\in W^{1,2}_\phi (\Omega ,Y)\big \}. \end{aligned}$$

Such an energy minimizing map is called a harmonic map.

Lemma 5.3

(Jost [27], Lin [39]) Let \(\Omega \) be a bounded domain in an Alexandrov space \((M,|\cdot ,\cdot |)\) and let Y be an NPC space. Suppose that u is a harmonic map from \(\Omega \) to Y. Then the following two properties are satisfied:

  1. (i)

    The map u is locally Hölder continuous on \(\Omega \);

  2. (ii)

    (Lemma 5 in [27], see also Lemma 10.2 of [11] for harmonic maps between Riemannian polyhedra) For any \(P\in Y\), the function

    $$\begin{aligned} f_P(x):=d_Y\big (u(x),P\big )\ \ \ \big (\in W^{1,2}(\Omega )\big ) \end{aligned}$$

    satisfies \(f^2_P\in W^{1,2}_{\mathrm{loc}}(\Omega )\) and (see Footnote 5)

    $$\begin{aligned} {\mathscr {L}}_{f^2_P}\geqslant 2E^u_2\geqslant 2 |\nabla u|_2\cdot {\mathrm{vol}}. \end{aligned}$$

According to this lemma, we always assume that a harmonic map from \(\Omega \) into an NPC space is continuous in \(\Omega \).

5.3 Estimates for pointwise Lipschitz constants

Let u be a harmonic map from a bounded domain \(\Omega \) of an Alexandrov space \((M,|\cdot ,\cdot |)\) to an NPC space \((Y,d_Y)\). In this subsection, we will estimate the pointwise Lipschitz constant of u, that is,

$$\begin{aligned} \mathrm{Lip}u(x):=\limsup _{y\rightarrow x}\frac{d_Y\big (u(x),u(y)\big )}{|xy|}=\limsup _{r\rightarrow 0}\sup _{|xy|\leqslant r}\frac{d_Y\big (u(x),u(y)\big )}{r}. \end{aligned}$$

It is convenient to consider the function \(f:\Omega \times \Omega \rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} f(x,y):=d_Y\big (u(x),u(y)\big ), \end{aligned}$$
(5.2)

where \(\Omega \times \Omega \subset M\times M\) is equipped with the product metric defined by

$$\begin{aligned} |(x,y),(z,w)|_{M\times M}^2:=|xz|^2+|yw|^2\qquad \mathrm{for~any}\quad x,y,z,w\in M. \end{aligned}$$

Recall that \((M\times M,|\cdot , \cdot |_{M\times M})\) is also an Alexandrov space. The geodesic balls in \(M\times M\) are denoted by

$$\begin{aligned} B^{M\times M}_{(x,y)}(r):=\{(z,w):\ |(z,w),(x,y)|_{M\times M}<r\}. \end{aligned}$$

Proposition 5.4

Let \(\Omega ,Y\) and u, f be as above. Then the function f is a sub-solution of \({\mathscr {L}}^{(2)}_f=0\) on \(\Omega \times \Omega \), where \({\mathscr {L}}^{(2)}\) is the Laplacian on \(\Omega \times \Omega \) (since \(M\times M\) is also an Alexandrov space, the notion \({\mathscr {L}}^{(2)}\) makes sense).

Proof

We divide the proof into three steps.

(i) For any \(P\in Y\), we first prove that the function \(f_P(x):=d_Y\big (u(x),P\big )\) satisfies \({\mathscr {L}}_{f_P}\geqslant 0\) on \(\Omega \).

Take any \(\epsilon >0\) and set

$$\begin{aligned} f_\epsilon (x):=\sqrt{f^2_P(x)+\epsilon ^2}. \end{aligned}$$

We have

$$\begin{aligned} |\nabla f_\epsilon |=\frac{f_P}{f_\epsilon }\cdot |\nabla f_P|\leqslant |\nabla f_P|. \end{aligned}$$

Thus, we have \(f_\epsilon \in W^{1,2}(\Omega )\), since \(f_P\in W^{1,2}(\Omega )\). We will prove that, for any \(\epsilon >0\), \({\mathscr {L}}_{f_\epsilon }\) forms a nonnegative Radon measure.

From Proposition 4.1 (1) and (5), we get that \(f_P\in W^{1,2}(\Omega )\) and

$$\begin{aligned} E^u_2\geqslant E^{f_P}_2=|\nabla f_P|^2\cdot {\mathrm{vol}}. \end{aligned}$$

By combining with Lemma 5.3 (ii),

$$\begin{aligned} {\mathscr {L}}_{f^2_\epsilon }={\mathscr {L}}_{f^2_P}\geqslant 2E^u_2\geqslant 2|\nabla f_P|^2\cdot {\mathrm{vol}}\geqslant 2|\nabla f_\epsilon |^2\cdot {\mathrm{vol}}. \end{aligned}$$
(5.3)

Take any test function \(\phi \in Lip_0(\Omega )\) with \(\phi \geqslant 0\). By using

$$\begin{aligned} -{\mathscr {L}}_{f^2_\epsilon }(\phi )= & {} \int _\Omega \left<{\nabla f^2_\epsilon },{\nabla \phi }\right>d{\mathrm{vol}}=2\int _\Omega \left<{\nabla f_\epsilon },{\nabla (f_\epsilon \cdot \phi )}\right>d{\mathrm{vol}}\\&-2\int _\Omega \phi \cdot |\nabla f_\epsilon |^2d{\mathrm{vol}}, \end{aligned}$$

and combining with Eq. (5.3), we obtain that the functional

$$\begin{aligned} I_\epsilon (\phi ):= -\int _\Omega \left<{\nabla f_\epsilon },{\nabla (f_\epsilon \cdot \phi )}\right>d{\mathrm{vol}}={\mathscr {L}}_{f_\epsilon }(f_\epsilon \cdot \phi ) \end{aligned}$$

on \(Lip_0(\Omega )\) is nonnegative, that is, \(I_\epsilon (\phi )\geqslant 0\) whenever \(\phi \geqslant 0\). According to Theorem 2.1.7 of [21], there exists a (nonnegative) Radon measure, denoted by \(\nu _\epsilon \), such that

$$\begin{aligned} \nu _\epsilon (\phi )=I_\epsilon (\phi )={\mathscr {L}}_{f_\epsilon }(f_\epsilon \cdot \phi ). \end{aligned}$$

This implies that, for any \(\psi \in Lip_0(\Omega )\) with \(\psi \geqslant 0\),

$$\begin{aligned} {\mathscr {L}}_{f_\epsilon }(\psi )=\nu _\epsilon \left( \frac{\psi }{f_\epsilon }\right) \geqslant 0. \end{aligned}$$

Thus, \({\mathscr {L}}_{f_\epsilon }\) is a nonnegative functional on \(Lip_0(\Omega )\), and hence, by Theorem 2.1.7 of [21] again, it forms a nonnegative Radon measure.
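The nonnegativity of \(I_\epsilon \) used above can be checked as follows; this is a sketch for a test function \(\phi \geqslant 0\). By (5.3), \({\mathscr {L}}_{f^2_\epsilon }(\phi )\geqslant 2\int _\Omega \phi \cdot |\nabla f_\epsilon |^2d{\mathrm{vol}}\). Inserting this into the identity for \(-{\mathscr {L}}_{f^2_\epsilon }(\phi )\) gives

$$\begin{aligned} 2\int _\Omega \left<{\nabla f_\epsilon },{\nabla (f_\epsilon \cdot \phi )}\right>d{\mathrm{vol}}-2\int _\Omega \phi \cdot |\nabla f_\epsilon |^2d{\mathrm{vol}}=-{\mathscr {L}}_{f^2_\epsilon }(\phi )\leqslant -2\int _\Omega \phi \cdot |\nabla f_\epsilon |^2d{\mathrm{vol}}, \end{aligned}$$

so that \(\int _\Omega \left<{\nabla f_\epsilon },{\nabla (f_\epsilon \cdot \phi )}\right>d{\mathrm{vol}}\leqslant 0\), which is exactly \(I_\epsilon (\phi )\geqslant 0\).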

Now let us prove the sub-harmonicity of \(f_P\). Noting that, for any \(\epsilon >0\),

$$\begin{aligned} |\nabla f_\epsilon |\leqslant |\nabla f_P|\qquad \mathrm{and} \quad 0< f_\epsilon \leqslant f_P+\epsilon , \end{aligned}$$

we get that the set \(\{f_\epsilon \}_{\epsilon >0}\) is bounded uniformly in \(W^{1,2}(\Omega ).\) Hence, it is weakly compact. Then there exists a sequence of numbers \(\epsilon _j\rightarrow 0\) such that

$$\begin{aligned} f_{\epsilon _j}\overset{w}{\rightharpoonup }f_P\quad \mathrm{in\ }\ W^{1,2}(\Omega ). \end{aligned}$$

Therefore, the sub-harmonicity of \(f_{\epsilon _j}\) for any \(j\in {\mathbb {N}}\) implies that \(f_P\) is sub-harmonic. This completes the proof of (i).

(ii) We next prove that f is in \(W^{1,2}(\Omega \times \Omega )\).

Let us consider the approximating energy density of f at point \((x,y)\in \Omega \times \Omega \). Fix any positive number \(\epsilon \) with \(B_x(2\epsilon )\subset \Omega \) and \(B_y(2\epsilon )\subset \Omega \). By the definition of approximating energy density, the triangle inequality, and by noting that the ball in \(\Omega \times \Omega \) satisfies

$$\begin{aligned} B^{M\times M}_{(x,y)}(\epsilon )\subset B_x(\epsilon )\times B_y(\epsilon )\subset \Omega \times \Omega , \end{aligned}$$

we have

$$\begin{aligned}&\frac{c_{2n,2}}{2n+2}\cdot e^f_{2,\epsilon }(x,y)\\&\quad =\int _{B^{M\times M}_{(x,y)}(\epsilon )}\frac{|f(x,y)-f(z,w)|^2}{\epsilon ^{2n+2}}d{\mathrm{vol}}(z)d{\mathrm{vol}}(w)\\&\quad \leqslant \int _{B_x(\epsilon )\times B_y(\epsilon )}\frac{\big [d_Y\big (u(x),u(z)\big )+d_Y\big (u(y),u(w)\big )\big ]^2}{\epsilon ^{2n+2}}d{\mathrm{vol}}(z)d{\mathrm{vol}}(w)\\&\quad \leqslant 2\cdot {\mathrm{vol}}\big (B_y(\epsilon )\big )\cdot \int _{B_x(\epsilon )}\frac{d_Y\big (u(x),u(z)\big )^2}{\epsilon ^{2n+2}}d{\mathrm{vol}}(z)\\&\qquad +2\cdot {\mathrm{vol}}\big (B_x(\epsilon )\big )\cdot \int _{B_y(\epsilon )}\frac{d_Y\big (u(y),u(w)\big )^2}{\epsilon ^{2n+2}}d{\mathrm{vol}}(w)\\&\quad \leqslant 2\frac{{\mathrm{vol}}\big (B_y(\epsilon )\big )}{\epsilon ^n}\cdot \frac{c_{n,2}}{n+2}e^u_{2,\epsilon }(x)+2\frac{{\mathrm{vol}}\big (B_x(\epsilon )\big )}{\epsilon ^n}\cdot \frac{c_{n,2}}{n+2}e^u_{2,\epsilon }(y)\\&\quad \leqslant c_{n,k,\mathrm{diameter}(\Omega )} \cdot \big (e^u_{2,\epsilon }(x)+ e^u_{2,\epsilon }(y)\big ). \end{aligned}$$

Then, by the definition of energy functional, it is easy to see that f has finite energy. Hence f is in \(W^{1,2}(\Omega \times \Omega )\).

(iii) We want to prove that f is sub-harmonic on \(\Omega \times \Omega \).

For any \(g\in W^{1,2}(\Omega \times \Omega )\), by Fubini’s Theorem, we conclude that, for almost all \(x\in \Omega \), the functions \(g_x(\cdot ):=g(x,\cdot )\) are in \(W^{1,2}(\Omega )\), and that the same assertion holds for the functions \(g_y(\cdot ):=g(\cdot ,y)\) for almost all \(y\in \Omega \). We denote by \(\nabla _{M\times M}g\) the weak gradient of g. Since the metric on \(M \times M\) is the product metric, we have

$$\begin{aligned} \left<{\nabla _{M\times M} g},{\nabla _{M\times M} h}\right>(x,y)=\left<{\nabla _1 g},{\nabla _1 h}\right>+\left<{\nabla _2 g},{\nabla _2 h}\right>, \end{aligned}$$

for any \(g,h\in W^{1,2}(\Omega \times \Omega )\), where \(\nabla _1g\) is the weak gradient of the function \(g_y(\cdot ):=g(\cdot ,y):\Omega \rightarrow {\mathbb {R}},\) and \(\nabla _2g\) is similar.

Now we are in a position to prove the sub-harmonicity of f. Take any test function \(\varphi (x,y)\in Lip_0(\Omega \times \Omega )\) with \(\varphi (x,y)\geqslant 0\). By Fubini’s Theorem, we have

$$\begin{aligned} \begin{aligned}&\int _{\Omega \times \Omega }\left<{\nabla _{M\times M} f},{\nabla _{M\times M} \varphi }\right>_{(x,y)}d{\mathrm{vol}}(x)d{\mathrm{vol}}(y)\\&\quad =\int _{\Omega }\int _{\Omega }\left<{\nabla _1 f},{\nabla _1 \varphi }\right>d{\mathrm{vol}}(x)d{\mathrm{vol}}(y)\\&\qquad +\int _{\Omega }\int _{\Omega }\left<{\nabla _2 f},{\nabla _2 \varphi }\right>d{\mathrm{vol}}(y)d{\mathrm{vol}}(x). \end{aligned} \end{aligned}$$
(5.4)

Fix \(y\in \Omega \) and note that the function \(\varphi _y(\cdot ):=\varphi (\cdot ,y)\in Lip_0(\Omega )\). According to (i), the function \(f_{u(y)}:=d_Y\big (u(\cdot ),u(y)\big )\) is sub-harmonic on \(\Omega \). Hence, we have

$$\begin{aligned} \int _{\Omega }\left<{\nabla _1 f},{\nabla _1 \varphi }\right>d{\mathrm{vol}}(x)=-{\mathscr {L}}_{f_{u(y)}} (\varphi _y(\cdot ))\leqslant 0. \end{aligned}$$

By the same argument, we get for any fixed \(x\in \Omega \),

$$\begin{aligned} \int _{\Omega }\left<{\nabla _2 f},{\nabla _2 \varphi }\right>d{\mathrm{vol}}(y)\leqslant 0. \end{aligned}$$

Substituting the above two inequalities into Eq. (5.4), we have

$$\begin{aligned} \int _{\Omega \times \Omega }\left<{\nabla _{M\times M} f},{\nabla _{M\times M} \varphi }\right>_{(x,y)}d{\mathrm{vol}}(x)d{\mathrm{vol}}(y)\leqslant 0, \end{aligned}$$

for any nonnegative function \(\varphi \in Lip_0(\Omega \times \Omega )\). This implies that f is sub-harmonic on \(\Omega \times \Omega \). The proof of the proposition is completed. \(\square \)

Now we can establish the following estimates for pointwise Lipschitz constants of harmonic maps.

Theorem 5.5

Let \(\Omega \) be a bounded domain in an n-dimensional Alexandrov space \((M,|\cdot ,\cdot |)\) with curvature \(\geqslant k\) for some \(k\leqslant 0\), and let Y be an NPC space. Suppose that u is a harmonic map from \(\Omega \) to Y. Then, for any ball \(B_q(R)\subset \subset \Omega \), there exists a constant \(C(n,k,R)\), depending only on n, k and R, such that the following estimate holds:

$$\begin{aligned} \mathrm{Lip}^2u(x)\leqslant C(n,k,R)\cdot |\nabla u|_{2}(x)<+\infty \end{aligned}$$
(5.5)

for almost every \(x\in B_q(R/6)\), where \(|\nabla u|_2\) is the density of the absolutely continuous part of the energy measure \(E^u_2\) with respect to \({\mathrm{vol}}\).

Proof

Fix any ball \(B_q(R)\subset \subset \Omega \). Throughout this proof, all constants \(C_1,C_2,\ldots \) depend only on n, k and R.

Note that \(M\times M\) has curvature lower bound \(\min \{k,0\}=k\), and that \(\mathrm{diam}(B_q(R)\times B_q(R))\leqslant 2\sqrt{2} R\). Clearly, on \(B_q(R)\times B_q(R)\), both the measure doubling property and the (weak) Poincaré inequality hold, with the corresponding doubling and Poincaré constants depending only on n, k and R. On the other hand, from Proposition 5.4, the function

$$\begin{aligned} f(x,y):=d_Y\big (u(x),u(y)\big ) \end{aligned}$$

is sub-harmonic on \(B_q(R)\times B_q(R)\). By Theorem 8.2 of [2] (or a Nash–Moser iteration argument), there exists a constant \(C_1\) such that

$$\begin{aligned} \sup _{B^{M\times M}_{(x,y)}(r) }f\leqslant C_1\cdot \bigg (\fint _{B^{M\times M}_{(x,y)}(2r)}f^2d{\mathrm{vol}}_{M\times M}\bigg )^{\frac{1}{2}} \end{aligned}$$

for any \((x,y)\in B_q(R/2)\times B_q(R/2)\) and any \(r>0\) with \(B^{M\times M}_{(x,y)}(2r)\subset \subset B_q(R)\times B_q(R),\) where, for any function \(h\in L^1(E)\) on a measurable set E,

$$\begin{aligned} \fint _Ehd{\mathrm{vol}}:=\frac{1}{{\mathrm{vol}}(E)}\int _Ehd{\mathrm{vol}}. \end{aligned}$$

In particular, for any fixed \(z\in B_q(R/2)\) and any \(r>0\) with \(B_z(2r)\subset B_q(R)\), by noting that

$$\begin{aligned} B_z(r/2)\times B_z(r/2)\subset B^{M\times M}_{(z,z)}(r)\quad \hbox { and } \quad B^{M\times M}_{(z,z)}(2r)\subset B_z(2r)\times B_z(2r), \end{aligned}$$

we have

$$\begin{aligned} \sup _{y\in B_z(r/2)}f^2(y,z)\leqslant & {} \sup _{B_z(r/2)\times B_z(r/2)}f^2\nonumber \\\leqslant & {} \frac{ C^2_1}{{\mathrm{vol}}\big (B^{M\times M}_{(z,z)}(2r)\big )} \int _{B_z(2r)\times B_z(2r)} f^2d{\mathrm{vol}}_{M\times M}.\qquad \end{aligned}$$
(5.6)

From Proposition 4.1 (4), there exists a constant \(C_2\) such that the following holds: for any \(z\in B_q(R/6)\) and any \(0<r<R/4\), we have

$$\begin{aligned} \int _{B_z(2r)}\int _{B_z(2r)}f^2 (x,y)d{\mathrm{vol}}(x)d{\mathrm{vol}}(y)\leqslant C_2r^{n+2}\cdot \int _{B_z(12 r)}dE^u_2. \end{aligned}$$

By combining with Eq. (5.6), we get for any \(z\in B_q(R/6)\)

$$\begin{aligned} \sup _{y\in B_z(r/2)}\frac{f^2(y,z)}{r^2}\leqslant C^2_1\cdot C_2\cdot \frac{r^n\cdot {\mathrm{vol}}\big (B_z(12 r)\big )}{{\mathrm{vol}}\big (B^{M\times M}_{(z,z)}(2r)\big )} \fint _{B_z(12 r)}dE^u_2 \end{aligned}$$
(5.7)

for any \(0<r<R/4\). Noticing that \(B_z(r)\times B_z(r)\subset B^{M\times M}_{(z,z)}(2r)\) again, according to the Bishop–Gromov volume comparison [5], we have

$$\begin{aligned} \frac{r^n\cdot {\mathrm{vol}}\big (B_z(12 r)\big )}{{\mathrm{vol}}\big (B^{M\times M}_{(z,z)}(2r)\big )}\leqslant \frac{r^n}{{\mathrm{vol}}\big (B_{z}(r)\big ) }\cdot \frac{{\mathrm{vol}}\big (B_z(12 r)\big )}{{\mathrm{vol}}\big (B_{z}(r)\big )}\leqslant C_3\cdot \frac{r^n}{{\mathrm{vol}}\big (B_{z}(r)\big ) } \end{aligned}$$

for any \(0<r<R/4\). Hence, by using this and the Eq. (5.7), we obtain that, for any \(z\in B_q(R/6)\),

$$\begin{aligned} \sup _{y\in B_z(r/2)}\frac{f^2(y,z)}{r^2}\leqslant C_4\cdot \frac{r^n}{{\mathrm{vol}}\big (B_{z}(r)\big ) } \cdot \fint _{B_z(12 r)}dE^u_2 \end{aligned}$$

for any \(0<r<R/4\), where \(C_4:=C^2_1\cdot C_2\cdot C_3\). Therefore, we conclude that

$$\begin{aligned} \begin{aligned} \mathrm{Lip}^2u(z)&=\limsup _{r\rightarrow 0}\sup _{|yz|\leqslant r/4}\frac{f^2(y,z)}{(r/4)^2}\leqslant 16\cdot \limsup _{r\rightarrow 0}\sup _{|yz|< r/2}\frac{f^2(y,z)}{r^2}\\&\leqslant 16 C_4\cdot \limsup _{r\rightarrow 0} \frac{r^n}{{\mathrm{vol}}\big (B_{z}(r)\big ) }\cdot \limsup _{r\rightarrow 0} \fint _{B_z(12 r)}dE^u_2 \end{aligned} \end{aligned}$$
(5.8)

for any \(z\in B_q(R/6)\). According to the Lebesgue decomposition theorem (see, for example, Section 1.6 in [12]), we know that, for almost every \(x\in B_q(R/6)\), the limit \(\lim _{r\rightarrow 0} \fint _{B_x(r)}dE^u_2\) exists and

$$\begin{aligned} \lim _{r\rightarrow 0} \fint _{B_x(r)}dE^u_2=|\nabla u|_2(x). \end{aligned}$$
(5.9)

On the other hand, from [5], we know that

$$\begin{aligned} \lim _{r\rightarrow 0} \frac{r^n}{{\mathrm{vol}}(B_{x}(r)) }=n/\omega _{n-1} \end{aligned}$$
(5.10)

for any regular point \(x\in B_q(R/6)\) and that the set of regular points in an Alexandrov space has full measure. Thus, (5.10) holds for almost all \(x\in B_q(R/6)\). Combining (5.8)–(5.10), we get the estimate (5.5). \(\square \)
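As a consistency check of the constant in (5.10), consider the model case \(M={\mathbb {R}}^n\), assuming that \(\omega _{n-1}\) denotes the area of the unit sphere \({\mathbb {S}}^{n-1}\) (this normalization is an assumption of this remark; it agrees with the expansion of \({\overline{A}}(r)\) used in the proof of Corollary 5.6). Then

$$\begin{aligned} {\mathrm{vol}}\big (B_x(r)\big )=\int _0^r\omega _{n-1}s^{n-1}ds=\frac{\omega _{n-1}}{n}\cdot r^n,\qquad \mathrm{so\ that}\quad \frac{r^n}{{\mathrm{vol}}\big (B_x(r)\big )}=\frac{n}{\omega _{n-1}}. \end{aligned}$$

Inserting (5.9) and (5.10) into (5.8) then gives, for almost every \(x\in B_q(R/6)\),

$$\begin{aligned} \mathrm{Lip}^2u(x)\leqslant 16C_4\cdot \frac{n}{\omega _{n-1}}\cdot |\nabla u|_2(x), \end{aligned}$$

so one admissible choice of the constant in (5.5) is \(C(n,k,R)=16nC_4/\omega _{n-1}\).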

Consequently, we have the following mean value inequality.

Corollary 5.6

Let \(\Omega \) be a bounded domain in an n-dimensional Alexandrov space \((M,|\cdot ,\cdot |)\) and let Y be an NPC space. Suppose that u is a harmonic map from \(\Omega \) to Y. Then, for almost every \(x_0\in \Omega \), the following holds:

$$\begin{aligned} \int _{B_{x_0}(R)}\Big [ d^2_Y\big (P,u(x_0)\big )-d^2_Y\big (P,u(x)\big )\Big ] d{\mathrm{vol}}(x)\leqslant & {} -\frac{|\nabla u|_2(x_0)\cdot \omega _{n-1}}{n(n+2)}\cdot R^{n+2}\\&+o(R^{n+2}) \end{aligned}$$

for every \(P\in Y\).

Proof

We define a subset of \(\Omega \) as

$$\begin{aligned} A:= & {} \big \{x\in \Omega |\ x\ \ \text {is smooth},\ \text {Lip}u(x)<+\infty ,\\&\text { and } x\ \text {is a Lebesgue point of } |\nabla u|_2 \big \}. \end{aligned}$$

According to the above Theorem 5.5 and [45], we have \({\mathrm{vol}}(\Omega \backslash A)=0\).

Fix any point \(x_0\in A.\) For any \(P\in Y\), we consider the function on \(\Omega \)

$$\begin{aligned} g_{x_0,P}(x):=d^2_Y\big (P,u(x_0)\big )-d^2_Y\big (P,u(x)\big ). \end{aligned}$$

Then, from Lemma 5.3 (ii), we have

$$\begin{aligned} {\mathscr {L}}_{g_{x_0,P}}\leqslant -2E_2^u\leqslant -2|\nabla u|_2\cdot {\mathrm{vol}}. \end{aligned}$$

Since \(x_0\) is a Lebesgue point of the function \(-2|\nabla u|_2\), by applying Proposition 3.2 (note that \(-2|\nabla u|_2\leqslant 0\)) to the nonnegative function

$$\begin{aligned} g_{x_0,P}(x)-\inf _{B_{x_0}(R)} g_{x_0,P}(x), \end{aligned}$$

we obtain

$$\begin{aligned} \begin{aligned}&\frac{1}{H^{n-1}(\partial B_o(R)\subset T^k_{x_0})}\int _{\partial B_{x_0}(R)}\Big [g_{x_0,P}(x)-\inf _{B_{x_0}(R)} g_{x_0,P}(x)\Big ]d{\mathrm{vol}}\\&\quad \leqslant \left[ g_{x_0,P}(x_0)-\inf _{B_{x_0}(R)} g_{x_0,P}(x)\right] -\frac{2|\nabla u|_2(x_0)}{2n}\cdot R^{2}+o(R^{2}). \end{aligned} \end{aligned}$$

Denote by

$$\begin{aligned} A(R):={\mathrm{vol}}\big (\partial B_{x_0}(R)\subset M\big )\ \hbox { and } \quad \overline{A}(R):=H^{n-1}\big (\partial B_o(R)\subset T^k_{x_0}\big ). \end{aligned}$$

Noting that \(g_{x_0,P}(x_0)=0\), we have

$$\begin{aligned} \int _{\partial B_{x_0}(R)}g_{x_0,P}(x)d{\mathrm{vol}}\leqslant & {} -\inf _{B_{x_0}(R)}g_{x_0,P}(x)\cdot \Big ({\overline{A}}(R)-A(R)\Big )\nonumber \\&-\Big (\frac{|\nabla u|_2(x_0)}{n}\cdot R^{2}+o(R^{2})\Big )\cdot {\overline{A}}(R).\quad \end{aligned}$$
(5.11)
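The step leading to (5.11) can be unpacked as follows. Write \(m_R:=\inf _{B_{x_0}(R)}g_{x_0,P}\), a shorthand introduced only for this remark. Multiplying the preceding inequality by \({\overline{A}}(R)\), and using \(g_{x_0,P}(x_0)=0\) together with \(\int _{\partial B_{x_0}(R)}m_Rd{\mathrm{vol}}=m_R\cdot A(R)\), one gets

$$\begin{aligned} \int _{\partial B_{x_0}(R)}g_{x_0,P}(x)d{\mathrm{vol}}-m_R\cdot A(R)\leqslant -m_R\cdot {\overline{A}}(R)-\Big (\frac{|\nabla u|_2(x_0)}{n}\cdot R^{2}+o(R^{2})\Big )\cdot {\overline{A}}(R), \end{aligned}$$

which is (5.11) after moving the term \(m_R\cdot A(R)\) to the right-hand side.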

By the co-area formula, integrating both sides of Eq. (5.11) over (0, R), we have

$$\begin{aligned} \int _{ B_{x_0}(R)} g_{x_0,P}(x)d{\mathrm{vol}}= & {} \int _0^R \int _{\partial B_{x_0}(r)} g_{x_0,P}(x)d{\mathrm{vol}}\,dr\nonumber \\\leqslant & {} -\int _0^R\inf _{B_{x_0}(r)}g_{x_0,P}(x)\cdot \Big ({\overline{A}}(r)-A(r)\Big )dr\nonumber \\&\ -\int _0^R\Big (\frac{|\nabla u|_2(x_0)}{n}\cdot r^{2}+o(r^{2})\Big )\cdot {\overline{A}}(r)dr\nonumber \\=: & {} I(R)+II(R). \end{aligned}$$
(5.12)

Since M has curvature \(\geqslant k\), the Bishop–Gromov inequality states that \(A(r)\leqslant {\overline{A}}(r)\) for any \(r>0\). Hence we have

$$\begin{aligned} \inf _{B_{x_0}(r)}g_{x_0,P}(x)\cdot \Big ({\overline{A}}(r)-A(r)\Big )\geqslant \inf _{B_{x_0}(R)}g_{x_0,P}(x)\cdot \Big ({\overline{A}}(r)-A(r)\Big ) \end{aligned}$$

for any \(0\leqslant r\leqslant R\). So we obtain

$$\begin{aligned} I(R)\leqslant & {} -\inf _{B_{x_0}(R)}g_{x_0,P}(x)\cdot \int _0^R\Big ({\overline{A}}(r)-A(r)\Big )dr\nonumber \\= & {} -\inf _{B_{x_0}(R)}g_{x_0,P}(x)\cdot \Big (H^{n}\big ( B_o(R)\subset T^k_{x_0}\big )\!-\!{\mathrm{vol}}\big ( B_{x_0}(R)\big )\Big ).\nonumber \\ \end{aligned}$$
(5.13)

By \(\mathrm{Lip}u(x_0)<+\infty \) and the triangle inequality, we have

$$\begin{aligned}&|g_{x_0,P}(x)|\nonumber \\&\quad =\Big (d_Y\big (P,u(x_0)\big )+d_Y\big (P,u(x)\big )\Big )\cdot \big |d_Y\big (P,u(x_0)\big )-d_Y\big (P,u(x)\big )\big |\nonumber \\&\quad \leqslant \Big (2d_Y\big (P,u(x_0)\big )+d_Y\big (u(x_0),u(x)\big )\Big )\cdot d_Y\big (u(x),u(x_0)\big ) \nonumber \\&\quad \leqslant \Big (2d_Y\big (P,u(x_0)\big )+\mathrm{Lip}u(x_0)\cdot R+o(R)\Big )\cdot \big (\mathrm{Lip}u(x_0)\cdot R+o(R)\big )\nonumber \\&\quad =O(R). \end{aligned}$$
(5.14)

Since \(x_0\) is a smooth point, from Lemma 2.5, we have

$$\begin{aligned} \big |H^{n}\big ( B_o(R) \subset T_{x_0}\big )-{\mathrm{vol}}\big ( B_{x_0}(R)\big )\big |\leqslant o(R)\cdot H^{n}\big ( B_o(R) \subset T_{x_0}\big )=o(R^{n+1}). \end{aligned}$$

Using again the fact that \(x_0\) is smooth, and hence that \(T^k_{x_0}\) is isometric to \({\mathbb {M}}^n_k\), we have

$$\begin{aligned} \begin{aligned} \big |H^{n}\big ( B_o(R) \subset T^k_{x_0}\big )-H^{n}\big (B_o(R) \subset T_{x_0}\big )\big |&=\big |H^{n}\big ( B_o(R) \subset {\mathbb {M}}^n_k\big )\\&\quad -H^{n}\big ( B_o(R) \subset {\mathbb {R}}^n\big )\big |\\&=O(R^{n+2}). \end{aligned} \end{aligned}$$

By substituting the above two estimates and (5.14) into (5.13), we obtain

$$\begin{aligned} I(R)\leqslant o(R^{n+2}). \end{aligned}$$
(5.15)
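The orders of magnitude behind (5.15) combine as follows (a sketch of the bookkeeping). By the two volume estimates above and the triangle inequality,

$$\begin{aligned} \big |H^{n}\big ( B_o(R) \subset T^k_{x_0}\big )-{\mathrm{vol}}\big ( B_{x_0}(R)\big )\big |\leqslant O(R^{n+2})+o(R^{n+1})=o(R^{n+1}), \end{aligned}$$

while (5.14) gives \(\big |\inf _{B_{x_0}(R)}g_{x_0,P}\big |=O(R)\). Hence, by (5.13),

$$\begin{aligned} I(R)\leqslant O(R)\cdot o(R^{n+1})=o(R^{n+2}). \end{aligned}$$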

Now let us estimate II(R). Note that \(x_0\) is a smooth point. In particular, it is a regular point. Hence

$$\begin{aligned} {\overline{A}}(r)={\mathrm{vol}}(\Sigma _{x_0})\cdot s_k^{n-1}(r)=\omega _{n-1}r^{n-1}+o(r^{n-1}). \end{aligned}$$

We have

$$\begin{aligned} \begin{aligned} II(R)&=-\int _0^R\left( \frac{|\nabla u|_2(x_0)}{n}\cdot r^{2}+o(r^{2})\right) \cdot {\overline{A}}(r)dr\\&=-\frac{|\nabla u|_2(x_0)\cdot \omega _{n-1}}{n}\int _0^R\big ( r^{n+1}+o(r^{n+1})\big ) dr\\&=-\frac{|\nabla u|_2(x_0)\cdot \omega _{n-1}}{n(n+2)}\cdot R^{n+2}+o(R^{n+2}). \end{aligned} \end{aligned}$$
(5.16)

Combining Eqs. (5.12) and (5.15)–(5.16), we have

$$\begin{aligned} \int _{ B_{x_0}(R)} g_{x_0,P}(x)d{\mathrm{vol}}\leqslant -\frac{|\nabla u|_2(x_0)\cdot \omega _{n-1}}{n(n+2)}\cdot R^{n+2}+o(R^{n+2}). \end{aligned}$$

This is the desired estimate, and the proof is complete. \(\square \)

6 Lipschitz regularity

We will prove the main Theorem 1.4 in this section. The proof is split into two steps, which are contained in the following two subsections. In the first subsection, we will construct a family of auxiliary functions \(f_t(x,\lambda )\) and prove that they are super-solutions of the heat equation (see Proposition 6.13). In the second subsection, we will complete the proof.

Let \(\Omega \) be a bounded domain in an n-dimensional Alexandrov space \((M,|\cdot ,\cdot |)\) with curvature \(\geqslant k\) for some number \(k\leqslant 0\), and let \((Y,d_Y)\) be a complete NPC metric space. In this section, we always assume that \(u:\Omega \rightarrow Y\) is an (energy minimizing) harmonic map. From Lemma 5.3, we can assume that u is continuous on \(\Omega \).

6.1 A family of auxiliary functions with two parameters

Fix any domain \(\Omega '\subset \subset \Omega \). For any \(t>0\) and any \(0\leqslant \lambda \leqslant 1\), we define the following auxiliary function \(f_t(x,\lambda )\) on \(\Omega '\) by:

$$\begin{aligned} f_t(x,\lambda ):=\inf _{y\in \Omega '}\left\{ e^{-2nk\lambda }\cdot \frac{|xy|^2}{2t}-d_Y\big (u(x),u(y)\big )\right\} ,\qquad x\in \Omega '. \end{aligned}$$
(6.1)

We denote by \(S_t(x,\lambda )\) the set of all points at which the infimum in (6.1) is achieved, i.e.,

$$\begin{aligned} S_t(x,\lambda ):=\left\{ y\in \Omega '\ |\ f_t(x,\lambda )=e^{-2nk\lambda }\cdot \frac{|xy|^2}{2t}-d_Y\big (u(x),u(y)\big ) \right\} . \end{aligned}$$

It is clear that (by setting \(y=x\))

$$\begin{aligned} 0\geqslant f_t(x,\lambda )\geqslant -\mathrm{osc}_{\overline{\Omega '}}u :=-\max _{x,y\in \overline{\Omega '}}d_Y\big (u(x),u(y)\big ). \end{aligned}$$
(6.2)
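Both bounds in (6.2) come directly from the definition (6.1); the following lines spell this out. Choosing \(y=x\) in the infimum gives the upper bound, and dropping the nonnegative quadratic term gives the lower bound:

$$\begin{aligned} \begin{aligned} f_t(x,\lambda )&\leqslant e^{-2nk\lambda }\cdot \frac{|xx|^2}{2t}-d_Y\big (u(x),u(x)\big )=0,\\ e^{-2nk\lambda }\cdot \frac{|xy|^2}{2t}-d_Y\big (u(x),u(y)\big )&\geqslant -d_Y\big (u(x),u(y)\big )\geqslant -\mathrm{osc}_{\overline{\Omega '}}u\qquad \mathrm{for\ all}\ \ y\in \Omega '. \end{aligned} \end{aligned}$$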

Given a function \(g(x,\lambda )\) defined on \(\Omega \times {\mathbb {R}}\), we always denote by \(g(\cdot ,\lambda )\) the function \(x\mapsto g(x,\lambda )\) on \(\Omega \). The notations \(g(x,\cdot )\) and \(g(\cdot ,\cdot )\) are analogous.

Lemma 6.1

Fix any domain \(\Omega ''\subset \subset \Omega '\) and denote by

$$\begin{aligned} C_*:=2\mathrm{osc}_{\overline{\Omega '}}u+2\quad \mathrm{and}\quad t_0:=\frac{\mathrm{dist}^2(\Omega '',\partial \Omega ')}{4C_*}. \end{aligned}$$

For each \(t\in (0,t_0)\), we have

  1. (i)

    for each \(\lambda \in [0,1]\) and \(x\in \Omega ''\), the set \(S_t(x,\lambda )\not =\varnothing \) and it is closed, and

    $$\begin{aligned} f_t(x,\lambda )=\min _{y\in \overline{B_x(\sqrt{C_*t})}}\left\{ e^{-2nk\lambda }\cdot \frac{|xy|^2}{2t}-d_Y\big (u(x),u(y)\big )\right\} ; \end{aligned}$$
  2. (ii)

    for each \(\lambda \in [0,1]\), the function \(f_t(\cdot ,\lambda )\) is in \(C(\Omega '')\cap W^{1,2}(\Omega '')\), and

    $$\begin{aligned} \int _{\Omega ''}|\nabla f_t(x,\lambda )|^2d{\mathrm{vol}}(x)\leqslant 2\cdot e^{-4nk}\cdot \frac{\mathrm{diam}^2(\Omega ')}{t^2}\cdot {\mathrm{vol}}(\Omega '')+2E^u_{2}(\Omega '');\nonumber \\ \end{aligned}$$
    (6.3)
  3. (iii)

    for each \(x\in \Omega ''\), the function \(f_t(x,\cdot )\) is Lipschitz continuous on [0, 1], and

    $$\begin{aligned} |f_t(x,\lambda )-f_t(x,\lambda ')|\leqslant e^{-2nk}\cdot C_*\cdot |\lambda -\lambda '|,\qquad \forall \lambda ,\lambda '\in [0,1].\qquad \end{aligned}$$
    (6.4)
  4. (iv)

    the function \((x,\lambda )\mapsto f_t(x,\lambda )\) is in \(C\big (\Omega ''\times [0,1]\big )\cap W^{1,2}(\Omega ''\times (0,1))\) with respect to the product measure \(\underline{\nu }:={\mathrm{vol}}\times {\mathcal {L}}^1\), where \({\mathcal {L}}^1\) is the Lebesgue measure on [0, 1].

Proof

(i) Let \(x\in \Omega ''\), \(t\in (0,t_0)\) and \(\lambda \in [0,1]\). The definition of \(C_*\) and \(t_0\) implies that \(B_x(\sqrt{C_*t})\subset \subset \Omega '\). Take any minimizing sequence \(\{y_j\}_j\) of (6.1). We claim that

$$\begin{aligned} |xy_j|^2\leqslant C_*t \end{aligned}$$
(6.5)

for all sufficiently large \(j\in {\mathbb {N}}\). Indeed, from \(f_t(x,\lambda )\leqslant 0\), we get that

$$\begin{aligned} e^{-2nk\lambda }\cdot \frac{|xy_j|^2}{2t}-d_Y\big (u(x),u(y_j)\big )\leqslant 1 \end{aligned}$$

for all sufficiently large \(j\in {\mathbb {N}}.\) Thus,

$$\begin{aligned} |xy_j|^2\leqslant 2t\big (1+d_Y\big (u(x),u(y_j)\big )\big )\leqslant 2t( 1+\mathrm{osc}_{\overline{\Omega '}}u)\leqslant C_*t \end{aligned}$$

for all \(j\in {\mathbb {N}}\) large enough, where we have used that \(k\leqslant 0\) (so that \(e^{-2nk\lambda }\geqslant 1\)) and the definition of \(C_*.\) This proves (6.5). Assertion (i) follows from (6.5) and the continuity of u.
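In more detail, the two facts used in the proof of (i) are the following (written out as a sketch). Since \(k\leqslant 0\) and \(\lambda \geqslant 0\), the weight satisfies \(e^{-2nk\lambda }\geqslant 1\), so for all sufficiently large j

$$\begin{aligned} \frac{|xy_j|^2}{2t}\leqslant e^{-2nk\lambda }\cdot \frac{|xy_j|^2}{2t}\leqslant 1+d_Y\big (u(x),u(y_j)\big )\leqslant 1+\mathrm{osc}_{\overline{\Omega '}}u=\frac{C_*}{2}, \end{aligned}$$

which is (6.5). Moreover, for \(t<t_0\),

$$\begin{aligned} \sqrt{C_*t}<\sqrt{C_*t_0}=\frac{\mathrm{dist}(\Omega '',\partial \Omega ')}{2}, \end{aligned}$$

so the closed ball of radius \(\sqrt{C_*t}\) around any \(x\in \Omega ''\) stays inside \(\Omega '\), and the infimum in (6.1) may be taken over this compact ball.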

(ii) Let \(t\in (0,t_0)\) and \(\lambda \in [0,1]\) be fixed. Take any \(x,y\in \Omega ''\) and let point \(z\in \Omega '\) achieve the minimum in the definition of \(f_t(y,\lambda )\). We have, by the triangle inequality,

$$\begin{aligned} \begin{aligned} f_t(x,\lambda )-f_t(y,\lambda )&\leqslant e^{-2nk\lambda }\cdot \frac{|xz|^2}{2t}-d_Y\big (u(x),u(z)\big )-e^{-2nk\lambda }\cdot \frac{|yz|^2}{2t}\\&\quad +d_Y\big (u(y),u(z)\big )\\&\leqslant e^{-2nk\lambda }\cdot \frac{(|xz|-|yz|)\cdot (|xz|+|yz|)}{2t}+d_Y\big (u(x),u(y)\big ) \\&\leqslant e^{-2nk\lambda }\cdot \frac{\mathrm{diam}(\Omega ')}{t}\cdot |xy|+d_Y\big (u(x),u(y)\big ). \end{aligned} \end{aligned}$$

By the symmetry of x and y, we have

$$\begin{aligned} |f_t(x,\lambda )-f_t(y,\lambda )|\leqslant e^{-2nk\lambda }\cdot \frac{\mathrm{diam}(\Omega ')}{t}\cdot |xy|+d_Y\big (u(x),u(y)\big ). \end{aligned}$$

This inequality implies the following assertions:

  • \(f_t(\cdot ,\lambda )\) is continuous on \(\Omega ''\), since u is continuous;

  • for any \(\epsilon >0\), the approximating energy density of \(f_t(\cdot ,\lambda )\) satisfies (since \(e^{-2nk\lambda }\leqslant e^{-2nk}\))

    $$\begin{aligned} e^{f_t(\cdot ,\lambda )}_{2,\epsilon }(x)\leqslant 2 e^{-4nk}\cdot \mathrm{diam}^2(\Omega ')/t^2+2e^u_{2,\epsilon }(x),\qquad x\in \Omega ''. \end{aligned}$$

This implies (6.3), and hence (ii).

(iii) Fix any \(x\in \Omega ''\) and take any \(\lambda ,\mu \in [0,1]\). Let \(z\in S_t(x,\mu )\), that is, z achieves the minimum in the definition of \(f_t(x,\mu )\). By the definition of \(f_t(x,\lambda )\) as an infimum, we get

$$\begin{aligned} \begin{aligned} f_t(x,\lambda )-f_t(x,\mu )&\leqslant e^{-2nk\lambda }\cdot \frac{|xz|^2}{2t}-d_Y\big (u(x),u(z)\big )-e^{-2nk\mu }\cdot \frac{|xz|^2}{2t}\\&\quad +d_Y\big (u(x),u(z)\big )\\&\leqslant \frac{ (e^{-2nk\lambda }-e^{-2nk\mu })\cdot |xz|^2}{2t} \\&\leqslant |\lambda -\mu |\cdot e^{-2nk}\cdot \frac{C_*t}{2t}\leqslant e^{-2nk}\cdot C_*\cdot |\lambda -\mu |, \end{aligned} \end{aligned}$$

where we have used \(\lambda ,\mu \leqslant 1\) and \(|xz|\leqslant \sqrt{C_*t}\ \) (by (i)). By the symmetry of \(\lambda \) and \(\mu \), we have

$$\begin{aligned} |f_t(x,\lambda )-f_t(x,\mu )|\leqslant e^{-2nk}\cdot C_*\cdot |\lambda -\mu |. \end{aligned}$$

This completes (iii).

(iv) is a consequence of the combination of Eqs. (6.3) and (6.4), and that \(f_t\) is bounded on \(\Omega ''\times [0,1]\). \(\square \)

Fix any domain \(\Omega ''\subset \subset \Omega '\) and let \(t_0\) be given in Lemma 6.1. For each \(t\in (0,t_0)\) and each \(\lambda \in [0,1]\), the set \(S_t(x,\lambda )\) is closed for all \(x\in \Omega ''\), by Lemma 6.1(i). We define a function \(L_{t,\lambda }(x)\) on \(\Omega ''\) by

$$\begin{aligned} L_{t,\lambda }(x):=\mathrm{dist}\big (x, S_t(x,\lambda )\big )=\min _{y\in S_t(x,\lambda )}|xy|,\quad \ x\in \Omega ''. \end{aligned}$$
(6.6)

Lemma 6.2

Fix any domain \(\Omega ''\subset \subset \Omega '\). For each \(t\in (0,t_0)\), we have:

  1. (i)

    the function \((x,\lambda )\mapsto L_{t,\lambda }(x)\) is lower semi-continuous in \(\Omega ''\times [0,1]\);

  2. (ii)

    for each \(\lambda \in [0,1]\),

    $$\begin{aligned} \Vert L_{t,\lambda }\Vert _{L^\infty (\Omega '')}\leqslant \sqrt{C_*t}, \end{aligned}$$
    (6.7)

    where the constant \(C_*\) is given in Lemma 6.1.

Proof

Let \(x\in \Omega ''\) and \(\lambda \in [0,1]\). Take a sequence \(\{(x_j,\lambda _j)\}_j\subset \Omega ''\times [0,1]\) with \((x_j,\lambda _j)\rightarrow (x,\lambda )\), as \(j\rightarrow \infty \), such that

$$\begin{aligned} \lim _{j\rightarrow \infty } L_{t,\lambda _j}(x_j)=\liminf _{z\rightarrow x,\ \mu \rightarrow \lambda }L_{t,\mu }(z). \end{aligned}$$

For each j, let \(y_j\in S_t(x_j,\lambda _j)\) be such that \(L_{t,\lambda _j}(x_j)=|x_jy_j|.\) Since \(\mathrm{dist}(y_j,\Omega '')\leqslant \sqrt{C_*t_0}= \mathrm{dist}(\Omega '',\partial \Omega ')/2\) for all \(j\in {\mathbb {N}}\) (by Lemma 6.1(i)), there exists a subsequence, say \(\{y_{j_l}\}_l\), converging to some \(y\in \Omega '\). By the continuity of u and \(f_t(\cdot ,\lambda )\) (see Lemma 6.1(iv)), we get

$$\begin{aligned} f_t(x,\lambda )=e^{-2nk\lambda }\cdot \frac{|xy|^2}{2t}-d_Y\big (u(x),u(y)\big ). \end{aligned}$$

This implies \(y\in S_t(x,\lambda )\). From the definition of \(L_{t,\lambda }(x)\), we have

$$\begin{aligned} L_{t,\lambda }(x)\leqslant |xy|=\lim _{l\rightarrow \infty }|x_{j_l}y_{j_l}|=\lim _{l\rightarrow \infty } L_{t,\lambda _{j_l}}(x_{j_l})=\liminf _{z\rightarrow x,\ \mu \rightarrow \lambda }L_{t,\mu }(z). \end{aligned}$$

Therefore, the function \((x,\lambda )\mapsto L_{t,\lambda }(x)\) is lower semi-continuous on \(\Omega ''\times [0,1]\). The proof of (i) is complete.

For each \(t\in (0,t_0)\) and each \(\lambda \in [0,1]\), the function \( L_{t,\lambda }(\cdot )\) is lower semi-continuous, and hence it is measurable, on \(\Omega ''\). By Lemma 6.1(i) and the definition of \( L_{t,\lambda }\), we have \(0\leqslant L_{t,\lambda }(x)\leqslant \sqrt{C_*t}\) for all \(x\in \Omega ''\). Hence, the estimate (6.7) holds. This completes the proof of the lemma. \(\square \)

Lemma 6.3

Fix any domain \(\Omega ''\subset \subset \Omega '\). For each \(t\in (0,t_0)\), we have

$$\begin{aligned} \liminf _{\mu \rightarrow 0^+}\frac{f_t(x,\lambda +\mu )-f_t(x,\lambda )}{\mu }\geqslant -e^{-2nk\lambda }\cdot \frac{nk}{t}\cdot L^2_{t,\lambda }(x) \end{aligned}$$

for any \(\lambda \in [0,1)\) and \(x\in \Omega ''\).

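Consequently, since \(\lambda \mapsto f_t(x,\lambda )\) is Lipschitz continuous by Lemma 6.1(iii) and hence differentiable almost everywhere, we have, for each \(x\in \Omega ''\),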

$$\begin{aligned} \frac{\partial f_t(x,\lambda )}{\partial \lambda }\geqslant -e^{-2nk\lambda }\cdot \frac{nk}{t}\cdot L^2_{t,\lambda }(x)\qquad {\mathcal {L}}^1\mathrm{-}a.e.\ \ \ \lambda \in (0,1). \end{aligned}$$
(6.8)

Proof

Let \(t\in (0,t_0)\), \(\lambda \in [0,1)\) and \(x\in \Omega ''.\) For each \(0<\mu <1-\lambda \), we take a point \(y_{\lambda +\mu }\in S_t(x,\lambda +\mu )\). By the definition of \(f_t(x,\lambda )\) and \(S_t(x,\lambda )\), we have

$$\begin{aligned} \begin{aligned}&f_t(x,\lambda +\mu )-f_t(x,\lambda )\\&\quad =e^{-2nk(\lambda +\mu )} \frac{|xy_{\lambda +\mu }|^2}{2t}-d_Y\big (u(x),u(y_{\lambda +\mu })\big )\\&\qquad -\inf _{z}\left\{ e^{-2nk\lambda } \frac{|xz|^2}{2t}-d_Y\big (u(x),u(z)\big )\right\} \\&\quad \geqslant \left( e^{-2nk(\lambda +\mu )}-e^{-2nk\lambda }\right) \cdot \frac{|xy_{\lambda +\mu }|^2}{2t}\\&\quad \geqslant \left( e^{-2nk(\lambda +\mu )}-e^{-2nk\lambda }\right) \cdot \frac{L^2_{t,\lambda +\mu }(x)}{2t}, \end{aligned} \end{aligned}$$

where we have used \(k\leqslant 0.\) By the lower semi-continuity of \(L_{t,\lambda }\), we have

$$\begin{aligned} \liminf _{\mu \rightarrow 0^+}\frac{f_t(x,\lambda +\mu )-f_t(x,\lambda )}{\mu }\geqslant e^{-2nk\lambda }\cdot (-nk)\cdot \frac{L^2_{t,\lambda }(x)}{t}. \end{aligned}$$
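In more detail, the limit step can be sketched as follows (this expansion is ours and is meant only as a reading aid). Dividing the preceding estimate by \(\mu >0\), the first factor converges to the non-negative number \(-2nk\cdot e^{-2nk\lambda }\) because \(k\leqslant 0\), while Lemma 6.2(i) gives \(\liminf _{\mu \rightarrow 0^+}L_{t,\lambda +\mu }(x)\geqslant L_{t,\lambda }(x)\). Since both factors are non-negative, we get

$$\begin{aligned} \begin{aligned} \liminf _{\mu \rightarrow 0^+}\frac{f_t(x,\lambda +\mu )-f_t(x,\lambda )}{\mu }&\geqslant \lim _{\mu \rightarrow 0^+}\frac{e^{-2nk(\lambda +\mu )}-e^{-2nk\lambda }}{\mu }\cdot \liminf _{\mu \rightarrow 0^+}\frac{L^2_{t,\lambda +\mu }(x)}{2t}\\&\geqslant -2nk\cdot e^{-2nk\lambda }\cdot \frac{L^2_{t,\lambda }(x)}{2t}. \end{aligned} \end{aligned}$$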

This proves the lemma.\(\square \)

We need a mean value inequality.

Lemma 6.4

Given any \(z\in \Omega \) and \(P\in Y\), we define a function \(w_{z,P}\) by

$$\begin{aligned} w_{z,P}(\cdot ):= d^2_Y\big (u(\cdot ),u(z)\big ) -d^2_Y\big (u(\cdot ),P\big )+ d^2_Y\big (P,u(z)\big ). \end{aligned}$$

Then, there exists a sequence \(\{\varepsilon _j\}_j\) converging to 0 and a set \({\mathscr {N}}\) with \({\mathrm{vol}}({\mathscr {N}})=0\) such that the following property holds: given any \(x_0\in \Omega \backslash {\mathscr {N}}\) and any \(P\in Y\), the following mean value inequalities

$$\begin{aligned} \int _{B_o(\varepsilon _j)\cap {\mathscr {W}}}w_{x_0,P}\big (\exp _{x_0}(\eta )\big )d\eta \leqslant o\left( \varepsilon _j^{n+2}\right) \end{aligned}$$
(6.9)

hold for any set \({\mathscr {W}}\subset {\mathscr {W}}_{x_0}\) satisfying

$$\begin{aligned} \frac{H^n\big ({\mathscr {W}}\cap B_o(\varepsilon _j)\big )}{H^n\big (B_o(\varepsilon _j)\subset T_{x_0}\big )} \geqslant 1-o(\varepsilon _j). \end{aligned}$$
(6.10)

Proof

We first show that there exists a sequence \(\{\varepsilon _j\}_j\) converging to 0 and a set \({\mathscr {N}}\) with \({\mathrm{vol}}({\mathscr {N}})=0\) such that the following property holds: for any \(x_0\in \Omega \backslash {\mathscr {N}}\) and any \(P\in Y\), we have

$$\begin{aligned} \int _{B_{x_0}(\varepsilon _j)}w_{x_0,P}(x)d{\mathrm{vol}}(x)\leqslant o\left( \varepsilon _j^{n+2}\right) . \end{aligned}$$
(6.11)

This comes from the combination of Corollaries 4.7 and 5.6. Indeed, on the one hand, by applying Corollary 4.7 with \(p=2\) to the sequence \(\{\epsilon _j=j^{-1}\}_{j=1}^\infty \), we conclude that there exists a subsequence \(\{\varepsilon _j\}_j\subset \{\epsilon _j\}_j\) and a set \(N_1\) with \({\mathrm{vol}}(N_1)=0\) such that for any point \(x_0\in \Omega \backslash N_1\), we have

$$\begin{aligned} \begin{aligned}&\int _{B_{x_0}(\varepsilon _j)}d^2_Y\big (u(x_0),u(x)\big )d{\mathrm{vol}}(x)\\&\quad =\frac{\omega _{n-1}}{n(n+2)}|\nabla u|_2(x_0)\cdot \varepsilon _j^{n+2}+o\left( \varepsilon _j^{n+2}\right) , \end{aligned} \end{aligned}$$
(6.12)

where we have used \(c_{n,2}=\omega _{n-1}/n\). On the other hand, from Corollary 5.6, there exists a set \(N_2\) with \({\mathrm{vol}}(N_2)=0\) such that, for all \(x_0\in \Omega \backslash N_2\), we have

$$\begin{aligned} \begin{aligned}&\int _{B_{x_0}(\varepsilon _j)}\Big [ d^2_Y\big (P,u(x_0)\big )-d^2_Y\big (P,u(x)\big )\Big ] d{\mathrm{vol}}(x)\\&\quad \leqslant - \frac{|\nabla u|_2(x_0)\cdot \omega _{n-1}}{n(n+2)}\cdot \varepsilon _j^{n+2}+o\left( \varepsilon _j^{n+2}\right) \end{aligned} \end{aligned}$$
(6.13)

for every \(P\in Y\). Now set \({\mathscr {N}}=N_1\cup N_2\). The estimate (6.11) follows by combining the definition of the function \(w_{x_0,P}\) with (6.12)–(6.13).
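Spelled out, this combination reads as follows (a sketch of the bookkeeping only, using the symmetry of \(d_Y\)). For \(x_0\in \Omega \backslash (N_1\cup N_2)\), the leading terms of (6.12) and (6.13) cancel:

$$\begin{aligned} \begin{aligned} \int _{B_{x_0}(\varepsilon _j)}w_{x_0,P}(x)d{\mathrm{vol}}(x)&=\int _{B_{x_0}(\varepsilon _j)}d^2_Y\big (u(x_0),u(x)\big )d{\mathrm{vol}}(x)\\&\quad +\int _{B_{x_0}(\varepsilon _j)}\Big [ d^2_Y\big (P,u(x_0)\big )-d^2_Y\big (P,u(x)\big )\Big ] d{\mathrm{vol}}(x)\\&\leqslant \frac{\omega _{n-1}}{n(n+2)}|\nabla u|_2(x_0)\cdot \varepsilon _j^{n+2}-\frac{\omega _{n-1}}{n(n+2)}|\nabla u|_2(x_0)\cdot \varepsilon _j^{n+2}+o\left( \varepsilon _j^{n+2}\right) \\&=o\left( \varepsilon _j^{n+2}\right) . \end{aligned} \end{aligned}$$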

According to [45], the set of smooth points has full measure in M. Hence, without loss of generality, we can assume that \(x_0\) is smooth. By Theorem 5.5, we can also assume that \(\mathrm{Lip}u(x_0)<+\infty \).

Since the point \(x_0\) is smooth, by using Lemma 2.5, we have

$$\begin{aligned} \begin{aligned}&\int _{B_{o}(\varepsilon _j)\cap {\mathscr {W}}_{x_0}}w_{x_0,P}\big (\exp _{x_0}(\eta )\big )dH^n(\eta )\\&\quad =\int _{B_{x_0}(\varepsilon _j)\cap W_{x_0}}w_{x_0,P}(x)\cdot \big (1+o(\varepsilon _j)\big )d{\mathrm{vol}}(x)\\&\quad \leqslant \int _{B_{x_0}(\varepsilon _j)}w_{x_0,P}(x)d{\mathrm{vol}}(x)+\int _{B_{x_0}(\varepsilon _j)}|w_{x_0,P}(x)|\cdot o(\varepsilon _j)d{\mathrm{vol}}(x). \end{aligned} \end{aligned}$$
(6.14)

Here we have used that \(W_{x_0}\) has full measure in M [43]. Since \(\mathrm{Lip}u(x_0)<+\infty \), we have, for \(x\in B_{x_0}(\varepsilon _j)\),

$$\begin{aligned} d^2_Y\big (u(x),u(x_0)\big )\leqslant \mathrm{Lip}^2u(x_0)\cdot \varepsilon _j^2+o\left( \varepsilon _j^2\right) . \end{aligned}$$

By combining with the definition of function \(w_{x_0,P}\) and (5.14), we get

$$\begin{aligned} |w_{x_0,P}(x)|\leqslant O(\varepsilon _j), \qquad \forall \ x\in B_{x_0}(\varepsilon _j). \end{aligned}$$
(6.15)

The combination of (6.11), (6.14) and (6.15) implies that

$$\begin{aligned} \int _{B_{o}(\varepsilon _j)\cap {\mathscr {W}}_{x_0}}w_{x_0,P}\big (\exp _{x_0}(\eta )\big )dH^n(\eta )&\leqslant o\left( \varepsilon _j^{n+2}\right) \nonumber \\&\quad +O(\varepsilon _j)\cdot o(\varepsilon _j)\cdot {\mathrm{vol}}(B_{x_0}(\varepsilon _j))\nonumber \\&= o\left( \varepsilon _j^{n+2}\right) . \end{aligned}$$
(6.16)

Given any set \({\mathscr {W}}\subset {\mathscr {W}}_{x_0}\) satisfying Eq. (6.10), we obtain

$$\begin{aligned} \begin{aligned}&\Big |\int _{B_{o}(\varepsilon _j)\cap ({\mathscr {W}}_{x_0}\backslash {\mathscr {W}})}w_{x_0,P}\big (\exp _{x_0}(\eta )\big )dH^n(\eta )\Big |\\&\quad \overset{(6.15)}{\leqslant } O(\varepsilon _j)\cdot H^n\big ( B_{o}(\varepsilon _j)\cap ({\mathscr {W}}_{x_0}\backslash {\mathscr {W}})\big )\\&\quad \ \leqslant O(\varepsilon _j)\cdot H^n\big ( B_{o}(\varepsilon _j)\backslash {\mathscr {W}}\big )\\&\quad \overset{(6.10)}{\leqslant } O(\varepsilon _j)\cdot o(\varepsilon _j)\cdot H^n\big ( B_{o}(\varepsilon _j)\big )\\&\quad \ =o\left( \varepsilon ^{n+2}_j\right) . \end{aligned} \end{aligned}$$
(6.17)

Combining (6.16) and (6.17) yields (6.9). This completes the proof. \(\square \)

The following two lemmas were stated by Petrunin [50], and their detailed proofs were given in [58].

Lemma 6.5

(Petrunin [50], see also Lemma 4.15 in [58]) Let h be Perelman’s concave function given in Proposition 2.7 on a neighborhood \(U\subset M\). Assume that f is a semi-concave function defined on U, and suppose that \(u\in W^{1,2}(U )\cap C(U )\) satisfies \({\mathscr {L}}_u\leqslant \lambda \cdot {\mathrm{vol}}\) on U for some constant \(\lambda \in {\mathbb {R}}\).

If a point \(x^*\in U\) is a minimum point of the function \(u+f+h\), then \(x^*\) is regular.

The second lemma is Petrunin’s perturbation in [50]. We need some notation. Let \(u\in W^{1,2}(D)\cap C(\overline{D})\) satisfy \({\mathscr {L}}_u\leqslant \lambda \cdot {\mathrm{vol}}\) on a bounded domain D. Suppose that \(x_0\) is the unique minimum point of u on D and

$$\begin{aligned} u(x_0)<\min _{x\in \partial D}u. \end{aligned}$$

Suppose also that \(x_0\) is regular and \(g=(g_1,\ g_2, \ldots g_n): D\rightarrow {\mathbb {R}}^n\) is a coordinate system around \(x_0\) such that g satisfies the following:

  1. (i)

    g is an almost isometry from D to \(g(D)\subset {\mathbb {R}}^n\) (see [5]). Namely, there exists a sufficiently small number \(\delta _0>0\) such that

    $$\begin{aligned} \Big |\frac{\Vert g(x)-g(y)\Vert }{|xy|}-1\Big |\le \delta _0,\qquad \mathrm{for\ all}\quad x,y \in D, \ x\not =y; \end{aligned}$$
  2. (ii)

    all of the coordinate functions \(g_j,\ 1\leqslant j\leqslant n,\) are concave [44].

    Then there exists \(\epsilon _0>0\) such that, for each vector \(V=(v^1,v^2,\ldots ,v^n)\in {\mathbb {R}}^{n}\) with \(|v^j|\leqslant \epsilon _0\) for all \(1\leqslant j\leqslant n\), the function

    $$\begin{aligned} G(V,x):=u(x)+V\cdot g(x) \end{aligned}$$

    has a minimum point in the interior of D, where \(\cdot \) is the Euclidean inner product of \({\mathbb {R}}^n\) and \(V\cdot g(x)=\sum _{j=1}^nv^{j}g_j(x)\).

Let

$$\begin{aligned} {\mathscr {U}}=\{V\in {\mathbb {R}}^{n}:\ |v^j|<\epsilon _0,\ 1\leqslant j\leqslant n\} \subset {\mathbb {R}}^n. \end{aligned}$$

We define \( \rho : {\mathscr {U}}\rightarrow D\) by setting

$$\begin{aligned} \rho (V)\hbox { to be one of the minimum points of the function }x\mapsto G(V,x). \end{aligned}$$

Note that the map \(\rho \) might not be uniquely defined.

Lemma 6.6

(Petrunin [50], see also Lemma 4.16 in [58]) Let \(u,\ x_0,\) \(\{g_j\}_{j=1}^n\) and \(\rho \) be as above. There exists some \(\epsilon \in (0,\epsilon _0)\) such that for arbitrary \(\epsilon '\in (0,\epsilon )\), the image \(\rho ({\mathscr {U}}^+_{\epsilon '})\) has nonzero Hausdorff measure, where

$$\begin{aligned} {\mathscr {U}}_{\epsilon '}^+:=\{ V=(v^1,v^2,\ldots ,v^n)\in {\mathbb {R}}^n: \ 0<v^j<\epsilon ' \ \ \hbox {for all}\ \ 1\leqslant j\leqslant n\}. \end{aligned}$$

Consequently, given any set \(A\subset D\) with full measure and any \(\epsilon '<\epsilon \), there exists \(V\in {\mathscr {U}}_{\epsilon '}^+\) such that the function \(u(x)+V\cdot g(x)\) has a minimum point in A.

Proof

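The first assertion is the result of Lemma 4.16 in [58]. The second assertion follows immediately from the first: if no such V existed, the image \(\rho ({\mathscr {U}}^+_{\epsilon '})\) would lie in \(D\backslash A\), which has measure zero. \(\square \)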

The following lemma is the key for us to prove that \(f_t(x,\lambda )\) is a super-solution of the heat equation.

Lemma 6.7

Given any point \(p\in \Omega '\), there exists a neighborhood \(U_p (=B_p(R_p))\) of p and a constant \(t_p>0\) such that, for each \(t\in (0,t_p)\) and each \(\lambda \in [0,1]\), the function \(x\mapsto f_t(x,\lambda )\) is a super-solution of the Poisson equation

$$\begin{aligned} {\mathscr {L}}_{f_t(x,\lambda )}= -e^{-2nk\lambda }\cdot \frac{nk}{t}L^2_{t,\lambda }(x)\cdot {\mathrm{vol}}\end{aligned}$$
(6.18)

on \(U_p.\)

Proof

Let \(U_p=B_p(R_p)\subset \subset \Omega '\) be a neighborhood of p such that \(U=B_p(2R_p)\) supports a Perelman’s concave function h (see Proposition 2.7). Set \(t_p= R_p^2/(2C_*)\), where \(C_*\) is given in Lemma 6.1. Then, for each \(t\in (0,t_p)\), we have \(\varnothing \not =S_t(x,\lambda )\subset \subset U\) for any \((x,\lambda )\in U_p\times [0,1]\), by Lemma 6.1(i).
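The inclusion \(S_t(x,\lambda )\subset \subset U\) can be checked directly from the choice of \(t_p\). The following is a sketch, assuming that the bound \(|xz|\leqslant \sqrt{C_*t}\) of Lemma 6.1(i) is available for all \(x\in U_p\), \(z\in S_t(x,\lambda )\) and \(t\in (0,t_p)\):

$$\begin{aligned} |pz|\leqslant |px|+|xz|< R_p+\sqrt{C_*t_p}=R_p+\frac{R_p}{\sqrt{2}}=\Big (1+\frac{1}{\sqrt{2}}\Big )\cdot R_p<2R_p. \end{aligned}$$

So every such z lies in a ball around p whose radius is strictly smaller than \(2R_p\).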

To prove the lemma, it suffices to prove the following claim.

Claim

For each \(t\in (0,t_p)\) and each \(\lambda \in [0,1]\), the function \(x\mapsto f_t(x,\lambda )\) is a super-solution of the Poisson equation

$$\begin{aligned} {\mathscr {L}}_{f_t(x,\lambda )}= \Big (-e^{-2nk\lambda }\cdot \frac{nk}{t}L^2_{t,\lambda }(x)+\theta \Big )\cdot {\mathrm{vol}}\quad on \ \ U_p \end{aligned}$$

for any \(\theta >0\).

We will divide the argument into four steps, as we did in the proof of Proposition 5.3 in [58]. However, the method used in the crucial fourth step there is not available for our auxiliary functions \(f_t(x,\lambda )\) in this paper. Here we will use a new idea in the fourth step, based on the mean value inequalities given in Lemma 6.4.

Step 1. Setting up a contradiction argument.

Suppose that the Claim fails for some \(t\in (0,t_p)\), \(\lambda \in [0,1]\) and some \(\theta _0>0\). According to Corollary 3.5, there exists a domain \(B\subset \subset U_p\) such that the function \(f_{t}(\cdot ,\lambda )-v(\cdot )\) satisfies

$$\begin{aligned} \min _{x\in B}\big (f_{t}(x,\lambda )-v(x)\big )<0=\min _{x\in \partial B}\big (f_{t}(x,\lambda )-v(x)\big ), \end{aligned}$$

where v is the (unique) solution of the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} {\mathscr {L}}_{v}&{}=\Big (-e^{-2nk\lambda }\cdot \frac{nk}{t}L^2_{t,\lambda }+\theta _0\Big )\cdot {\mathrm{vol}}\quad \mathrm{in}\ \ B\\ v&{}=f_{t}(\cdot ,\lambda ) \quad \mathrm{on}\ \ \partial B. \end{array}\right. } \end{aligned}$$

In this case we say that \(f_{t}(\cdot ,\lambda )-v(\cdot )\) has a strict minimum in the interior of B.

Let us define a function \(H(x,y)\) on \(B\times U\), similarly to [50, 58], by

$$\begin{aligned} H(x,y):=\frac{e^{-2nk\lambda }}{2t}\cdot |xy|^2-d_Y\big (u(x),u(y)\big )-v(x). \end{aligned}$$

Let \({\overline{x}}\in B\) be a minimum point of \(f_{t}(\cdot ,\lambda )-v\) on B, and let \({\overline{y}}\in S_t({\overline{x}},\lambda )\) \((\subset \subset U)\) be such that

$$\begin{aligned} |{\overline{x}}{\overline{y}}|=L_{t,\lambda }({\overline{x}}). \end{aligned}$$
(6.19)

By the definition of \(S_t({\overline{x}},\lambda )\), \(H(x,y)\) has a minimum at \(({\overline{x}},{\overline{y}})\).

Let us fix a real number \(\delta _0\) with

$$\begin{aligned} 0<\delta _0\leqslant \frac{\theta _0}{8n(1+\sqrt{-k}\cdot \mathrm{diam}U)}, \end{aligned}$$
(6.20)

and consider the function

$$\begin{aligned} H_0(x,y):=H(x,y)+\delta _0 |{\overline{x}}x|^2+\delta _0|{\overline{y}}y|^2,\quad (x,y)\in B\times U. \end{aligned}$$

Since \(({\overline{x}},{\overline{y}})\) is one of the minimal points of \(H(x,y)\), we conclude that it is the unique minimal point of \(H_0(x,y).\)
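The uniqueness can be read off from a one-line comparison (a sketch, using only that H attains its minimum over \(B\times U\) at \(({\overline{x}},{\overline{y}})\) and that \(\delta _0>0\)). For any \((x,y)\in B\times U\) with \((x,y)\not =({\overline{x}},{\overline{y}})\),

$$\begin{aligned} H_0(x,y)=H(x,y)+\delta _0 \big (|{\overline{x}}x|^2+|{\overline{y}}y|^2\big )\geqslant H({\overline{x}},{\overline{y}})+\delta _0 \big (|{\overline{x}}x|^2+|{\overline{y}}y|^2\big )>H({\overline{x}},{\overline{y}})=H_0({\overline{x}},{\overline{y}}). \end{aligned}$$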

Step 2. Petrunin’s argument of perturbation.

In this step, we will perturb the above function \(H_0\) to achieve some minimum at a smooth point.

Recall that Perelman’s concave function h is 2-Lipschitz on U (see Proposition 2.7). Then, for any sufficiently small number \(\delta _1>0\), the function

$$\begin{aligned} H_1(x,y):=H_0(x,y)+\delta _1h(x)+\delta _1h(y) \end{aligned}$$

also achieves a strict minimum in the interior of \(B\times U\). Let \((x^*,y^*)\) denote one of the minimal points of \(H_1(x,y)\).

(i) We first claim that both points \(x^*\) and \(y^*\) are regular.

To justify this, we consider the function on B

$$\begin{aligned} \begin{aligned} H_1(x,y^*)&=H_0(x,y^*)+\delta _1h(x)+\delta _1h(y^*)\\&=e^{-2nk\lambda }\cdot \frac{|xy^*|^2}{2t}-d_Y\big (u(x),u(y^*)\big )-v(x)+\delta _0 |{\overline{x}}x|^2\\&\quad +\delta _0|{\overline{y}}y^*|^2 +\delta _1h(x)+\delta _1h(y^*). \end{aligned} \end{aligned}$$

From the first paragraph of the proof of Proposition 5.4, we have

$$\begin{aligned} {\mathscr {L}}_{d_Y\big (u(x),u(y^*)\big ) }\geqslant 0. \end{aligned}$$

Notice that \({\mathscr {L}}_v=-nk\cdot e^{-2nk\lambda }\cdot L^2_{t,\lambda }/t+\theta _0\in L^\infty (B)\) (by Lemma 6.2(ii)) and that \(|{\overline{x}} x|^2\) and \(|xy^*|^2/(2t)\) are semi-concave on B. Notice also that \(x^*\) is a minimum point of \(H_1(\cdot ,y^*)\). Hence Lemma 6.5 implies that \(x^*\) is regular. Applying the same argument to the function \(H_1(x^*,\cdot )\), we see that \(y^*\) is also regular.

Consider the function

$$\begin{aligned} H_2(x,y):=H_1(x,y)+ \delta _1\cdot |xx^*|^2+ \delta _1\cdot |yy^*|^2 \end{aligned}$$

on \(B\times U\). It has the unique minimal point at \((x^*,y^*)\).

(ii) We will use Lemma 6.6 to perturb the function \(H_2\) to achieve some minimum at a smooth point.

Firstly, we want to show that

$$\begin{aligned} {\mathscr {L}}^{(2)}_{H_2}\leqslant C(M,t,\lambda ,\delta _1,\delta _0,\Vert L_{t,\lambda }\Vert _{L^\infty (B)}) \end{aligned}$$
(6.21)

for some constant \(C(M,t,\lambda ,\delta _1,\delta _0,\Vert L_{t,\lambda }\Vert _{L^\infty (B)})\), where \({\mathscr {L}}^{(2)}\) is the Laplacian on \(B\times U.\)

Note that

$$\begin{aligned} |xy|^2=2\cdot \mathrm{dist}^2_{D_M}(x,y), \end{aligned}$$

where \(\mathrm{dist}_{D_M}(\cdot )\) is the distance function from the diagonal set \(D_M :=\{(x,x):\ x\in M\}\) on \(M\times M.\) Thus we know that \(|xy|^2\) is a semi-concave function on \(M\times M\). The function \(|{\overline{x}}x|^2+|{\overline{y}}y|^2\) is also semi-concave on \(M\times M\), because

$$\begin{aligned} |{\overline{x}}x|^2+|{\overline{y}}y|^2=|(x,y)({\overline{x}},{\overline{y}})|^2_{M\times M}. \end{aligned}$$

The function \(|xx^*|^2+|yy^*|^2\) is semi-concave on \(M\times M\) too. By combining these with the concavity of \(h(x)+h(y)\) on \(U\times U\), the sub-harmonicity of \(d_Y\big (u(x),u(y)\big ) \) on \(U\times U\) (see Proposition 5.4), and the fact that \({\mathscr {L}}_v=-nk\cdot e^{-2nk\lambda }\cdot L^2_{t,\lambda }/t+\theta _0\in L^\infty (B)\) (by Lemma 6.2(ii)), we obtain (6.21).

Since \((x^*,y^*)\) is regular in \(M\times M\), by [5] and [45], we can choose a nearly orthogonal coordinate system near \(x^*\) by concave functions \(g_1,g_2,\ldots , g_n\) and another nearly orthogonal coordinate system near \(y^*\) by concave functions \(g_{n+1},g_{n+2},\ldots , g_{2n}.\) Now, the point \((x^*,y^*)\), the function \(H_2\) and the system \(\{g_i\}_{1\leqslant i\leqslant 2n}\) meet all the conditions of Lemma 6.6.

Meanwhile, according to Lemma 6.4, there exists a sequence \(\{\varepsilon _j\}_j\) converging to 0 and a set \({\mathscr {N}}\) with \({\mathrm{vol}}({\mathscr {N}})=0\) such that for all points \((x_0,y_0)\in (\Omega \backslash {\mathscr {N}})\times (\Omega \backslash {\mathscr {N}})\), the mean value inequalities (6.9) hold for functions \(w_{x_0,P}\) and \(w_{y_0,Q}\) for any \(P, Q\in Y\) and any corresponding sets satisfying (6.10) (see Lemma 6.4 for the definition of functions \(w_{x_0,P}\) and \(w_{y_0,Q}\)). From now on, we fix such a sequence \(\{\varepsilon _j\}_j\).

Hence, by applying Lemma 6.6, there exist arbitrarily small positive numbers \(b_1,b_2,\ldots ,b_{2n}\) such that the function

$$\begin{aligned} H_3(x,y):=H_2(x,y)+\sum ^n_{i=1}b_ig_i(x)+\sum ^{2n}_{i=n+1}b_ig_i(y) \end{aligned}$$

achieves a minimal point \((x^o,y^o)\in B\times U\), which satisfies the following properties:

  1. \(x^o\not =y^o\);

  2. both \(x^o\) and \(y^o\) are smooth;

  3. geodesic \(x^oy^o\) can be extended beyond \(x^o\) and \(y^o\);

  4. point \(x^o\) is a Lebesgue point of \( e^{-2nk\lambda }\cdot \frac{-nk}{t}L^2_{t,\lambda }+\theta _0\);

  5. the mean value inequalities (6.9) hold for functions \(w_{x^o,P}\) and \(w_{y^o,Q}\) for any \(P,Q\in Y\) and any corresponding sets satisfying (6.10).

Indeed, according to Lemma 6.4 and noting that the set of smooth points has full measure, it is clear that the set of points satisfying the above (1)–(5) has full measure on \(B\times U.\)

Step 3. Second variation of arc-length.

In this step, we will study the second variation of the length of geodesics near the geodesic \(x^oy^o\).

Since M has curvature \(\geqslant k\) and the geodesic \(x^oy^o\) can be extended beyond \(x^o\) and \(y^o\), by Petrunin’s second variation formula (Proposition 2.3), there exists an isometry \(T: T_{x^o}\rightarrow T_{y^o}\) and a subsequence of \(\{\varepsilon _j\}_j\) given in Step 2, denoted by \(\{\varepsilon _j\}_j\) again, such that

$$\begin{aligned} {\mathscr {F}}_j(\eta )\leqslant -k|\eta |^2\cdot |x^oy^o|^2+ o(1) \end{aligned}$$
(6.22)

for any \(\eta \in T_{x^o}\), where the function \({\mathscr {F}}_j\) is defined by

$$\begin{aligned} {\mathscr {F}}_j(\eta ):= \frac{|\exp _{x^o}(\varepsilon _j\cdot \eta )\ \exp _{y^o}(\varepsilon _j\cdot T\eta )|^2-|x^oy^o|^2}{\varepsilon _j^2} \end{aligned}$$

if \(\eta \in T_{x^o}\) such that \(\varepsilon _j\cdot \eta \in {\mathscr {W}}_{x^o}\) and \(\varepsilon _j\cdot T\eta \in {\mathscr {W}}_{y^o}\), and \( {\mathscr {F}}_j(\eta ):=0\) if otherwise.

Now we claim that

$$\begin{aligned} \int _{B_o(1)}{\mathscr {F}}_j(\eta )dH^n(\eta )\leqslant \frac{-k\cdot \omega _{n-1}}{n+2}\cdot |x^oy^o|^2+ o(1). \end{aligned}$$
(6.23)

Indeed, letting z be the mid-point of \(x^o\) and \(y^o\) and using the semi-concavity of the distance function \(\mathrm{dist}_z\), we conclude

$$\begin{aligned} |z\exp _{x^o}(\varepsilon _j\cdot \eta )|\leqslant |zx^o|+\left<{\uparrow _{x^o}^z},{\eta }\right>\cdot \varepsilon _j+\sigma _1\cdot |\eta |^2\cdot \varepsilon ^2_j \end{aligned}$$

and

$$\begin{aligned} |z\exp _{y^o}(\varepsilon _j\cdot T\eta )|\leqslant |zy^o|+\left<{\uparrow _{y^o}^z},{T\eta }\right>\cdot \varepsilon _j+\sigma _2\cdot |\eta |^2\cdot \varepsilon ^2_j \end{aligned}$$

for any \(\eta \in T_{x^o}\) such that \(\varepsilon _j\cdot \eta \in {\mathscr {W}}_{x^o}\) and \(\varepsilon _j\cdot T\eta \in {\mathscr {W}}_{y^o}\), where \(\sigma _1,\sigma _2\) are some positive constants depending only on \(|x^oz|,|y^oz|\) and k. By applying the triangle inequality and \(\uparrow _{y^o}^z=-T(\uparrow _{x^o}^z),\) we get (note that \(|x^oz|=|y^oz|=|x^oy^o|/2\)),

$$\begin{aligned} \begin{aligned} {\mathscr {F}}_j(\eta )&\leqslant \frac{\big (|z\exp _{x^o}(\varepsilon _j\cdot \eta )|+ |z\exp _{y^o}(\varepsilon _j\cdot T\eta )|\big )^2-|x^oy^o|^2}{\varepsilon ^2_j}\\&\leqslant 2(\sigma _1+\sigma _2)\cdot |\eta |^2\cdot |x^oy^o| +(\sigma _1+\sigma _2)^2\cdot |\eta |^4\cdot \varepsilon _j^2\\&\leqslant \sigma _3 \end{aligned} \end{aligned}$$

for any \(\eta \in B_o(1)\subset T_{x^o}\), where \(\sigma _3\) is some positive constant depending only on \(|x^oz|,|y^oz|\) and k. That is, \({\mathscr {F}}_j\) is uniformly bounded from above in \(B_o(1)\). Hence Fatou’s Lemma, applied to \(\sigma _3-{\mathscr {F}}_j\geqslant 0\), together with (6.22) implies

$$\begin{aligned} \limsup _{j\rightarrow \infty }\int _{B_o(1)}{\mathscr {F}}_j(\eta )dH^n(\eta )\leqslant (-k)\int _{B_o(1)}|x^oy^o|^2|\eta |^2dH^n(\eta )=\frac{-k\cdot \omega _{n-1}}{n+2}\cdot |x^oy^o|^2. \end{aligned}$$

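This is the desired (6.23). Therefore, by the definition of the function \({\mathscr {F}}_j\) and the change of variables \(\eta \mapsto \varepsilon _j\cdot \eta \), we have

$$\begin{aligned} \int _{B_o(\varepsilon _j)\cap {\mathscr {W}}}\Big (|\exp _{x^o}( \eta )\ \exp _{y^o}( T\eta )|^2-|x^oy^o|^2\Big )dH^n(\eta )\leqslant \frac{-k\cdot \omega _{n-1}}{n+2}\cdot |x^oy^o|^2\cdot \varepsilon _j^{n+2}+o\left( \varepsilon _j^{n+2}\right) , \end{aligned}$$
(6.24)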

where \({\mathscr {W}}:={\mathscr {W}}_{x^o}\cap T^{-1}({\mathscr {W}}_{y^o})=\big \{v\in T_{x^o}:\ v\in {\mathscr {W}}_{x^o}\ \ \mathrm{and}\ \ Tv \in {\mathscr {W}}_{y^o}\big \}.\)
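The constant \(\omega _{n-1}/(n+2)\) in (6.23) can be checked by a polar-coordinate computation. This is a sketch under the assumptions that \(T_{x^o}\) is isometric to \({\mathbb {R}}^n\) (as \(x^o\) is smooth) and that \(\omega _{n-1}\) denotes the \((n-1)\)-dimensional volume of the unit sphere:

$$\begin{aligned} \int _{B_o(1)}|\eta |^2dH^n(\eta )=\omega _{n-1}\int _0^1r^2\cdot r^{n-1}dr=\frac{\omega _{n-1}}{n+2}. \end{aligned}$$

Moreover, substituting \(\eta =\varepsilon _j\cdot \xi \) and recalling that \({\mathscr {F}}_j(\xi )=0\) whenever \(\varepsilon _j\cdot \xi \not \in {\mathscr {W}}\), one gets

$$\begin{aligned} \int _{B_o(\varepsilon _j)\cap {\mathscr {W}}}\Big (|\exp _{x^o}( \eta )\ \exp _{y^o}( T\eta )|^2-|x^oy^o|^2\Big )dH^n(\eta )=\varepsilon _j^{n+2}\int _{B_o(1)}{\mathscr {F}}_j(\xi )dH^n(\xi ), \end{aligned}$$

which is how a bound on \(\int _{B_o(1)}{\mathscr {F}}_j\) transfers to the balls \(B_o(\varepsilon _j)\).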

Step 4. Maximum principle via mean value inequalities.

Let us fix the sequence of numbers \(\{\varepsilon _j\}_j\) as in the above Step 2 and Step 3, and fix the isometry \(T: T_{x^o}\rightarrow T_{y^o}\) and the set \({\mathscr {W}}:={\mathscr {W}}_{x^o}\cap T^{-1}({\mathscr {W}}_{y^o})\) as in Step 3.

Recall that in Step 2, we have proved that the function

$$\begin{aligned} H_3(x,y)=\frac{e^{-2nk\lambda }}{2t}\cdot |xy|^2-d_Y\big (u(x),u(y)\big )-v(x)+{\widetilde{\gamma }}_1(x)+{\widetilde{\gamma }}_2(y) \end{aligned}$$

has a minimal point \((x^o,y^o)\) in the interior of \( B\times U\), where both \(x^o\) and \(y^o\) are smooth points, and the functions

$$\begin{aligned} \begin{aligned} {\widetilde{\gamma }}_1(x)&:=\delta _0\cdot |{\overline{x}}x|^2+\delta _1\cdot h(x)+\frac{\delta _1}{8}|x^*x|^2+ \sum _{i=1}^nb_i\cdot g_i(x),\\ \mathrm{and}\quad \qquad {\widetilde{\gamma }}_2(y)&:=\delta _0\cdot |{\overline{y}}y|^2+\delta _1\cdot h(y)+\frac{\delta _1}{8}|y^*y|^2+ \sum _{i=n+1}^{2n}b_i\cdot g_i(y).\qquad \qquad \end{aligned} \end{aligned}$$

Consider the mean value

$$\begin{aligned} I(\varepsilon _j):&=\int _{B_o(\varepsilon _j)\cap {\mathscr {W}}}\Big [H_3\big (\exp _{x^o}(\eta ),\exp _{y^o}(T\eta )\big )-H_3(x^o,y^o)\Big ]dH^n(\eta )\nonumber \\&=I_1(\varepsilon _j)-I_2(\varepsilon _j)-I_3(\varepsilon _j)+I_4(\varepsilon _j)+I_5(\varepsilon _j), \end{aligned}$$
(6.25)

where

$$\begin{aligned} \begin{aligned} I_1(\varepsilon _j)&:=\frac{e^{-2nk\lambda }}{2t}\cdot \int _{B_o(\varepsilon _j)\cap {\mathscr {W}}}\Big (|\exp _{x^o}( \eta )\ \exp _{y^o}( T\eta )|^2-|x^oy^o|^2\Big )dH^n(\eta ),\\ I_2(\varepsilon _j)&:= \int _{B_o(\varepsilon _j)\cap {\mathscr {W}}}\Big (d_Y\big (u(\exp _{x^o}( \eta )),u(\exp _{y^o}( T\eta ))\big )\\&\quad -d_Y\big (u(x^o),u(y^o)\big )\Big )dH^n(\eta ),\\ I_3(\varepsilon _j)&:= \int _{B_o(\varepsilon _j)\cap {\mathscr {W}}}\Big (v(\exp _{x^o}( \eta ))-v(x^o)\Big )dH^n(\eta ),\\ I_4(\varepsilon _j)&:= \int _{B_o(\varepsilon _j)\cap {\mathscr {W}}}\Big ({\widetilde{\gamma }}_1(\exp _{x^o}( \eta ))-{\widetilde{\gamma }}_1(x^o)\Big )dH^n(\eta ),\\ I_5(\varepsilon _j)&:= \int _{B_o(\varepsilon _j)\cap {\mathscr {W}}}\Big ({\widetilde{\gamma }}_2(\exp _{y^o}(T \eta ))-{\widetilde{\gamma }}_2(y^o)\Big )dH^n(\eta ). \end{aligned} \end{aligned}$$

The minimal property of point \((x^o,y^o)\) implies that

$$\begin{aligned} I(\varepsilon _j)\geqslant 0. \end{aligned}$$
(6.26)

We need to estimate \(I_1,I_2,I_3, I_4\) and \(I_5\). Recall that the integral \(I_1\) has been estimated by (6.24).

(i) The estimate of \(I_2\).

By applying Lemma 5.2 for points

$$\begin{aligned} P=u\left( \exp _{x^o}(\eta )\right) ,\ \ Q=u(x^o),\ \ R=u(y^o)\ \ \mathrm{and}\ \ S=u\left( \exp _{y^o}(T\eta )\right) , \end{aligned}$$

we get

$$\begin{aligned} \begin{aligned}&\Big (d_Y\big (u(\exp _{x^o}( \eta )),u(\exp _{y^o}( T\eta ))\big )-d_Y\big (u(x^o),u(y^o)\big )\Big )\cdot d_Y\big (u(x^o),u(y^o)\big )\\&\quad \geqslant \big (d^2_{PQ_m}-d^2_{PQ}-d^2_{Q_mQ}\big )+\big (d^2_{SQ_m}-d^2_{SR}-d^2_{Q_mR}\big )\\&\quad =-w_{x^o,Q_m}\big (\exp _{x^o}(\eta )\big )-w_{y^o,Q_m}\big (\exp _{y^o}(T\eta )\big ), \end{aligned} \end{aligned}$$
(6.27)

where \(Q_m\) is the mid-point of \(u(x^o)\) and \(u(y^o)\), and the function \(w_{z,Q_m}\) is defined in Lemma 6.4, namely,

$$\begin{aligned} w_{z,Q_m}(\cdot ):= d^2_Y\big (u(\cdot ),u(z)\big ) -d^2_Y\big (u(\cdot ),Q_m\big )+ d^2_Y\big (Q_m,u(z)\big ). \end{aligned}$$

Now we want to show that the set \({\mathscr {W}}:={\mathscr {W}}_{x^o}\cap T^{-1}({\mathscr {W}}_{y^o})\) satisfies (6.10). Since both points \(x^o\) and \(y^o\) are smooth, by (2.3) in Lemma 2.5, we have

$$\begin{aligned} \frac{H^n\big ({\mathscr {W}}_{x^o}\cap B_o(s)\big )}{H^n\big (B_o(s)\subset T_{x^o}\big )}\geqslant 1-o(s)\quad \mathrm{and}\quad \frac{H^n\big ({\mathscr {W}}_{y^o}\cap B_o(s)\big )}{H^n\big (B_o(s)\subset T_{y^o}\big )} \geqslant 1-o(s). \end{aligned}$$

Note that \(T: T_{x^o}\rightarrow T_{y^o}\) is an isometry (with \(T(o)=o\)). We can get

$$\begin{aligned} \frac{H^n\big ({\mathscr {W}} \cap B_o(s)\big )}{H^n\big (B_o(s)\subset T_{x^o}\big )}=\frac{H^n\big ({\mathscr {W}}_{x^o}\cap T^{-1}({\mathscr {W}}_{y^o})\cap B_o(s)\big )}{H^n\big (B_o(s)\subset T_{x^o}\big )}\geqslant 1-o(s).\qquad \quad \end{aligned}$$
(6.28)

In particular, by taking \(s=\varepsilon _j\), we have that the set \({\mathscr {W}}\) satisfies (6.10).
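The estimate (6.28) amounts to a union bound on the complements. The following sketch uses only that T is an isometry with \(T(o)=o\), so that \(T\big (B_o(s)\big )=B_o(s)\subset T_{y^o}\) and T preserves \(H^n\):

$$\begin{aligned} \begin{aligned} H^n\big (B_o(s)\backslash {\mathscr {W}}\big )&\leqslant H^n\big (B_o(s)\backslash {\mathscr {W}}_{x^o}\big )+H^n\big (B_o(s)\backslash T^{-1}({\mathscr {W}}_{y^o})\big )\\&=H^n\big (B_o(s)\backslash {\mathscr {W}}_{x^o}\big )+H^n\big (B_o(s)\backslash {\mathscr {W}}_{y^o}\big )\\&\leqslant o(s)\cdot H^n\big (B_o(s)\subset T_{x^o}\big ), \end{aligned} \end{aligned}$$

where the ball in the second term of the middle line is taken in \(T_{y^o}\), and the two tangent-space balls of radius s have the same \(H^n\)-measure.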

Now by integrating Eq. (6.27) on \(B_o(\varepsilon _j)\cap {\mathscr {W}}\) and using Lemma 6.4, we have

$$\begin{aligned} \begin{aligned} d_Y\big (u(x^o),u(y^o)\big )\cdot I_2(\varepsilon _j)&\geqslant -\int _{B_o(\varepsilon _j)\cap {\mathscr {W}}}w_{x^o,Q_m}\big (\exp _{x^o}(\eta )\big )dH^n(\eta )\\&\quad -\int _{B_o(\varepsilon _j)\cap {\mathscr {W}}}w_{y^o,Q_m}\big (\exp _{y^o}(T\eta )\big )dH^n(\eta )\\&\geqslant -o\left( \varepsilon ^{n+2}_j\right) . \end{aligned} \end{aligned}$$

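Here the last inequality comes from (6.9), applied to \(w_{x^o,Q_m}\) on \({\mathscr {W}}\) and to \(w_{y^o,Q_m}\) on \(T({\mathscr {W}})\subset {\mathscr {W}}_{y^o}\), which also satisfies (6.10) because T is an isometry. If \(d_Y\big (u(x^o),u(y^o)\big )\not =0\), then this inequality implies that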

$$\begin{aligned} I_2(\varepsilon _j)\geqslant -o\left( \varepsilon ^{n+2}_j\right) . \end{aligned}$$
(6.29)

If \(d_Y\big (u(x^o),u(y^o)\big )=0\), then the integrand in the definition of \(I_2\) is non-negative, so \(I_2(\varepsilon _j)\geqslant 0\) for all \(j\in {\mathbb {N}}\). Hence, the estimate (6.29) holds in either case.

(ii) The estimate of \(I_3\).

By setting the function

$$\begin{aligned} g(x):=v(x^o)-v(x) \end{aligned}$$

on B, we have \(g(x^o)=0\) and

$$\begin{aligned} {\mathscr {L}}_g=-{\mathscr {L}}_v=\Big (e^{-2nk\lambda }\cdot \frac{nk}{t}L^2_{t,\lambda }-\theta _0\Big )\cdot {\mathrm{vol}}\quad \ \mathrm{on}\ \ B. \end{aligned}$$

Recall \(L_{t,\lambda }\in L^\infty (B)\) (see Lemma 6.2(ii)). By Lemma 3.1, we know that g is locally Lipschitz on B. Fix some \(r_0>0\) such that \(B_{x^o}(r_0)\subset \subset B\), and denote by \(c_0\) the Lipschitz constant of g on \(B_{x^o}(r_0)\).

Take any \(s<r_0.\) Noticing that \(g(x^o)=0\), we have that \(g(x)+c_0s\geqslant 0\) in \( B_{x^o}(s)\). By using Proposition 3.2, we have

$$\begin{aligned} \begin{aligned}&\frac{1}{H^{n-1}\left( \partial B_o(s)\subset T^{k}_{x^o}\right) }\int _{\partial B_{x^o}(s)}\big (g(x)+c_0s\big )d{\mathrm{vol}}\\&\quad \leqslant \big (g(x^o)+c_0s \big ) +\frac{e^{-2nk\lambda }\cdot \frac{nk}{t}L^2_{t,\lambda }(x^o)-\theta _0}{2n}s^2+o(s^{2}). \end{aligned} \end{aligned}$$

So, we get (notice that \(g(x^o)=0\) )

$$\begin{aligned} \begin{aligned} \int _{\partial B_{x^o}(s)} g(x) d{\mathrm{vol}}&\leqslant c_0s\cdot \Big ( H^{n-1}\Big (\partial B_o(s)\subset T^k_{x^o}\Big )-{\mathrm{vol}}(\partial B_{x^o}(s))\Big )\\&\quad +\Big (e^{-2nk\lambda }\cdot \frac{k}{2t}L^2_{t,\lambda }(x^o)-\frac{\theta _0}{2n}\Big )s^2\cdot H^{n-1}\left( \partial B_o(s)\subset T^k_{x^o}\right) \\&\quad +o(s^{n+1}). \end{aligned} \end{aligned}$$

Notice that the Bishop volume comparison theorem implies \({\mathrm{vol}}(\partial B_{x^o}(s))\leqslant H^{n-1}(\partial B_o(s)\subset T^k_{x^o})\). We can use the co-area formula to obtain

$$\begin{aligned} \begin{aligned} \int _{B_{x^o}(s)} g(x) d{\mathrm{vol}}&\leqslant c_0s\cdot \Big ( H^{n}\Big (B_o(s)\subset T^k_{x^o}\Big )-{\mathrm{vol}}\Big ( B_{x^o}(s)\Big )\Big )\\&\quad +\Big (e^{-2nk\lambda }\cdot \frac{k}{2t}L^2_{t,\lambda }(x^o)-\frac{\theta _0}{2n}\Big ) \int ^s_0\tau ^2\\&\quad \cdot H^{n-1}\big (\partial B_o(\tau )\subset T^k_{x^o}\big )d\tau +o(s^{n+2}). \end{aligned} \end{aligned}$$
(6.30)

Because \(x^o\) is a smooth point, we can apply Lemma 2.5 to conclude

$$\begin{aligned} \big |H^{n}\big ( B_o(s) \subset T_{x^o}\big )-{\mathrm{vol}}\big ( B_{x^o}(s)\big )\big |\leqslant o(s)\cdot H^{n}\big ( B_o(s) \subset T_{x^o}\big )=o(s^{n+1}). \end{aligned}$$
(6.31)

On the other hand, the fact that \(x^o\) is smooth also implies that \(T^k_{x^o}\) is isometric to \({\mathbb {M}}_k^n\), and hence

$$\begin{aligned} \big |H^{n}\big ( B_o(s) \subset T^k_{x^o}\big )- H^{n}\big ( B_o(s) \subset T_{x^o}\big )\big |=O(s^{n+2}) \end{aligned}$$

and

$$\begin{aligned} H^{n-1}\big (\partial B_o(\tau )\subset T^k_{x^o}\big )= & {} \omega _{n-1}\cdot \Big (\frac{\sinh (\sqrt{-k}\tau )}{\sqrt{-k}}\Big )^{n-1}\\= & {} \omega _{n-1}\cdot \tau ^{n-1}+O(\tau ^{n+1}). \end{aligned}$$

Thus, by substituting this and (6.31) into (6.30), we can get

$$\begin{aligned} \int _{B_{x^o}(s)} g(x) d{\mathrm{vol}}\leqslant \Big (e^{-2nk\lambda }\cdot \frac{k}{2t}L^2_{t,\lambda }(x^o)-\frac{\theta _0}{2n}\Big )\cdot \frac{\omega _{n-1}}{n+2}\cdot s^{n+2} +o(s^{n+2}).\nonumber \\ \end{aligned}$$
(6.32)
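
For the reader's convenience, the substitution leading to (6.32) can be spelled out as follows. This is only a sketch of the bookkeeping; the implied constants in the O-terms are allowed to depend on n and k. By (6.31) and the comparison between \(T^k_{x^o}\) and \(T_{x^o}\), the first term on the right-hand side of (6.30) satisfies

$$\begin{aligned} c_0s\cdot \Big ( H^{n}\big (B_o(s)\subset T^k_{x^o}\big )-{\mathrm{vol}}\big ( B_{x^o}(s)\big )\Big )\leqslant c_0s\cdot \big (O(s^{n+2})+o(s^{n+1})\big )=o(s^{n+2}), \end{aligned}$$

while the expansion of \(H^{n-1}\big (\partial B_o(\tau )\subset T^k_{x^o}\big )\) gives

$$\begin{aligned} \int ^s_0\tau ^2\cdot H^{n-1}\big (\partial B_o(\tau )\subset T^k_{x^o}\big )d\tau =\int ^s_0\big (\omega _{n-1}\tau ^{n+1}+O(\tau ^{n+3})\big )d\tau =\frac{\omega _{n-1}}{n+2}\cdot s^{n+2}+O(s^{n+4}). \end{aligned}$$

Both remainders are absorbed into the term \(o(s^{n+2})\) in (6.32).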

Next we want to show that

$$\begin{aligned} \begin{aligned}&\int _{B_{o}(s)\cap {\mathscr {W}}}g(\exp _{x^o}(\eta ))dH^n(\eta )\leqslant \int _{B_{x^o}(s)}g(x)d{\mathrm{vol}}(x)+ o(s^{n+2}) \end{aligned} \end{aligned}$$
(6.33)

for all \(0<s<r_0\).

Since \(x^o\) is a smooth point, we can use Lemma 2.5 to obtain

$$\begin{aligned} \begin{aligned}&\int _{B_{o}(s)\cap {\mathscr {W}}_{x^o}}g(\exp _{x^o}(\eta ))dH^n(\eta )\\&\quad =\int _{B_{x^o}(s)\cap W_{x^o}}g(x)(1+o(s))d{\mathrm{vol}}(x) \\&\quad \leqslant \int _{B_{x^o}(s)}g(x)d{\mathrm{vol}}(x)+\int _{B_{x^o}(s)}|g(x)|\cdot o(s)d{\mathrm{vol}}(x)\\&\quad \leqslant \int _{B_{x^o}(s)}g(x)d{\mathrm{vol}}(x)+\int _{B_{x^o}(s)}O(s)\cdot o(s)d{\mathrm{vol}}(x)\\&\qquad (\mathrm{since}\ g(x) \hbox {is Lipschitz continuous in } B_{x^o}(s)\ \mathrm{and} \ g(x^o)=0).\\&\quad =\int _{B_{x^o}(s)}g(x)d{\mathrm{vol}}(x)+ o(s^{n+2}) \end{aligned} \end{aligned}$$
(6.34)

for all \(0<s<r_0\), where we have used that \(W_{x^o}\) has full measure (see §2.2). Moreover, we have

$$\begin{aligned} \begin{aligned}&\int _{B_{o}(s)\cap {\mathscr {W}}}g(\exp _{x^o}(\eta ))dH^n(\eta )- \int _{B_{o}(s)\cap {\mathscr {W}}_{x^o}}g(\exp _{x^o}(\eta ))dH^n(\eta )\\&\quad \leqslant \int _{B_{o}(s)\cap ({\mathscr {W}}_{x^o}\backslash {\mathscr {W}})}|g(\exp _{x^o}(\eta ))|dH^n(\eta )\\&\quad \leqslant O(s)\cdot {\mathrm{vol}}\big (B_{o}(s)\cap ({\mathscr {W}}_{x^o}\backslash {\mathscr {W}})\big ) \end{aligned} \end{aligned}$$
(6.35)

for all \(0<s<r_0\). Here we have used the fact that g is Lipschitz continuous in \( B_{x^o}(s)\) and \(g(x^o)=0\) again. Recall (6.28) in the previous estimate for \(I_2\). We have

$$\begin{aligned} \begin{aligned} {\mathrm{vol}}\big (B_{o}(s)\cap ({\mathscr {W}}_{x^o}\backslash {\mathscr {W}})\big )&\leqslant {\mathrm{vol}}\big (B_{o}(s) \backslash {\mathscr {W}}\big ) \overset{(6.28)}{\leqslant } o(s)\cdot {\mathrm{vol}}\big (B_{o}(s)\subset T_{x^o}\big )\\&\leqslant o(s^{n+1}). \end{aligned} \end{aligned}$$

By combining this with (6.34)–(6.35), we conclude the desired estimate (6.33).

By taking \(s=\varepsilon _j\) and using (6.32)–(6.33), we obtain the estimate of \(I_3\)

$$\begin{aligned} \begin{aligned} -I_3(\varepsilon _j)&= \int _{B_{o}(\varepsilon _j)\cap {\mathscr {W}}}g(\exp _{x^o}(\eta ))dH^n(\eta )\\&\leqslant \Big (e^{-2nk\lambda }\cdot \frac{k}{2t}L^2_{t,\lambda }(x^o)-\frac{\theta _0}{2n}\Big )\cdot \frac{\omega _{n-1}}{n+2} \cdot \varepsilon _j^{n+2} \\&\quad -o\left( \varepsilon _j^{n+2}\right) , \quad \forall j\in {\mathbb {N}}. \end{aligned} \end{aligned}$$
(6.36)

(iii) The estimate of \(I_4\) and \(I_5\).

Because all of the integrands in \(I_4\) and \(I_5\) are semi-concave, we consider the following sublemma.

Sublemma 6.8

Let \(\sigma \in {\mathbb {R}}\) and let f be a \(\sigma \)-concave function near a smooth point z. Then

$$\begin{aligned} \int _{(B_o(s)\cap {\mathscr {W}}_1)\subset T_z}\big (f(\exp _z(\eta ))-f(z)\big )dH^n(\eta )\leqslant \frac{\omega _{n-1}}{2(n+2)}\cdot \sigma \cdot s^{n+2} + o(s^{n+2}) \end{aligned}$$

for any subset \({\mathscr {W}}_1\subset {\mathscr {W}}_z\subset T_z\) with \(H^n(B_o(s)\backslash {\mathscr {W}}_1 )\leqslant o(s^{n+1})\).

Proof

Since f is \(\sigma \)-concave near z, we have

$$\begin{aligned} f(\exp _z(\eta ))-f(z)\leqslant d_zf(\eta )+\frac{\sigma }{2}|\eta |^2 \end{aligned}$$

for all \(\eta \in {\mathscr {W}}_z\). The integration on \(B_o(s)\cap {\mathscr {W}}_1\) tells us

$$\begin{aligned} \int _{B_o(s)\cap {\mathscr {W}}_1 } \big (f(\exp _z(\eta ))-f(z)\big )dH^n \leqslant \int _{B_o(s)\cap {\mathscr {W}}_1 } \big ( d_zf(\eta )+\frac{\sigma }{2}|\eta |^2\big )dH^n.\nonumber \\ \end{aligned}$$
(6.37)

Because f is a semi-concave function, we have \(\int _{B_o(s)}d_zf(\eta )dH^n\leqslant 0\) (see Proposition 3.1 of [58]). Thus,

$$\begin{aligned} \begin{aligned} \int _{B_o(s)\cap {\mathscr {W}}_1 } d_zf(\eta )dH^n&\leqslant -\int _{B_o(s)\backslash {\mathscr {W}}_1 } d_zf(\eta )dH^n\\&\leqslant \max _{B_o(s)}|d_zf(\eta )|\cdot H^n(B_o(s)\backslash {\mathscr {W}}_1)\\&\leqslant O(s)\cdot o(s^{n+1})=o(s^{n+2}). \end{aligned} \end{aligned}$$

Similarly, we have

$$\begin{aligned} \begin{aligned} \int _{B_o(s)\cap {\mathscr {W}}_1 } |\eta |^2dH^n&=\int _{B_o(s)} |\eta |^2dH^n-\int _{B_o(s)\backslash {\mathscr {W}} _1} |\eta |^2dH^n\\&=\int _0^s t^2\cdot \omega _{n-1}\cdot t^{n-1}dt-\int _{B_o(s)\backslash {\mathscr {W}}_1 } |\eta |^2dH^n\\&\qquad \qquad \quad (\mathrm{because}\ \ z \ \ \mathrm{is\ smooth} )\\&=\frac{\omega _{n-1}\cdot s^{n+2}}{n+2}+O(s^2)\cdot o(s^{n+1})\\&\qquad \qquad \quad \left( \mathrm{because}\ \ 0\leqslant H^n(B_o(s)\backslash {\mathscr {W}}_1 )\leqslant o(s^{n+1}) \right) . \end{aligned} \end{aligned}$$

Substituting the above two inequalities into Eq. (6.37), we have

$$\begin{aligned} \int _{B_o(s)\cap {\mathscr {W}}_1 } \big (f(\exp _z(\eta ))-f(z)\big )dH^n \leqslant \frac{\omega _{n-1}\cdot \sigma }{2(n+2)}\cdot s^{n+2} + o(s^{n+2}). \end{aligned}$$

This completes the proof of the sublemma. \(\square \)
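
As a sanity check on the constant, which is not part of the proof, consider the model case (an assumption made only for illustration) in which a neighborhood of z is flat, so that \(\exp _z\) is an isometry near the origin, \({\mathscr {W}}_1=B_o(s)\), and \(f(\exp _z(\eta ))=f(z)+\frac{\sigma }{2}|\eta |^2\). Then f is \(\sigma \)-concave near z and

$$\begin{aligned} \int _{B_o(s)}\big (f(\exp _z(\eta ))-f(z)\big )dH^n(\eta )=\frac{\sigma }{2}\int _0^s\tau ^2\cdot \omega _{n-1}\cdot \tau ^{n-1}d\tau =\frac{\omega _{n-1}}{2(n+2)}\cdot \sigma \cdot s^{n+2}, \end{aligned}$$

so the leading term in Sublemma 6.8 is attained in this case.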

Now let us use the sublemma to estimate \(I_4\) and \(I_5\).

Note that the assumption that M has curvature \(\geqslant k\) implies that the function \(\mathrm{dist}^2_q(x):=|qx|^2\) is \(2(\sqrt{-k}|qx|\cdot \coth (\sqrt{-k}|qx|))\)-concave for all \(q\in M\). For all \(q,x\in U\), we have

$$\begin{aligned} 2\sqrt{- k}|qx|\cdot \coth (\sqrt{-k}|qx|)\leqslant & {} 2(1+\sqrt{-k}|qx|)\\\leqslant & {} 2+2\sqrt{-k}\cdot \mathrm{diam}(U):=C_{k,U}. \end{aligned}$$
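
The first inequality above is elementary. For completeness, a short verification, with the shorthand \(\xi :=\sqrt{-k}|qx|\geqslant 0\) introduced only for this remark, reads

$$\begin{aligned} \xi \cdot \coth \xi =\xi \cdot \frac{e^{2\xi }+1}{e^{2\xi }-1}=\xi +\frac{2\xi }{e^{2\xi }-1}\leqslant \xi +1, \end{aligned}$$

since \(e^{2\xi }-1\geqslant 2\xi \) for all \(\xi \geqslant 0\). For \(\xi =0\) the left-hand side is understood as its limit 1.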

Combining this with the facts that h is (\({-}\)1)-concave and that \(g_i(x)\) is concave for any \(1\leqslant i\leqslant n\), we know that the function \({\widetilde{\gamma }}_1\) is \((\delta _0\cdot C_{k,U}-\delta _1+\delta _1\cdot C_{k,U}/8)\)-concave. Recall that (6.28) implies

$$\begin{aligned} H^n(B_o(s)\backslash {\mathscr {W}})\leqslant o(s)\cdot {\mathrm{vol}}\big (B_{o}(s)\subset T^k_{x^o}\big )=o(s^{n+1}). \end{aligned}$$

According to Sublemma 6.8, we obtain (by setting \(s=\varepsilon _j\))

$$\begin{aligned} I_4(\varepsilon _j)\leqslant \kappa (\delta _0,\delta _1)\cdot \frac{\omega _{n-1}}{2(n+2)}\cdot \varepsilon _j^{n+2}+ o\left( \varepsilon _j^{n+2}\right) , \quad \forall j\in {\mathbb {N}}, \end{aligned}$$
(6.38)

where

$$\begin{aligned} \kappa (\delta _0,\delta _1):=(\delta _0\cdot C_{k,U}-\delta _1+\delta _1\cdot C_{k,U}). \end{aligned}$$

Since the map T is an isometry, the same estimate holds for \(I_5\). Namely,

$$\begin{aligned} I_5(\varepsilon _j)\leqslant \kappa (\delta _0,\delta _1)\cdot \frac{\omega _{n-1}}{2(n+2)}\cdot \varepsilon _j^{n+2}+ o\left( \varepsilon _j^{n+2}\right) , \quad \forall j\in {\mathbb {N}}. \end{aligned}$$
(6.39)

Let us recall (6.25) and (6.26), and combine the estimates for \(I_1\) through \(I_5\), that is, (6.24), (6.29), (6.36), (6.38) and (6.39). We obtain

$$\begin{aligned} \begin{aligned} 0&\leqslant \bigg [\frac{-k\cdot e^{-2nk\lambda }}{t} |x^oy^o|^2+\frac{e^{-2nk\lambda }\cdot k}{t}L^2_{t,\lambda }(x^o)-\frac{\theta _0}{n}\\&\quad +2\kappa (\delta _0,\delta _1) \bigg ] \frac{\omega _{n-1}}{2(n+2)}\cdot \varepsilon _j^{n+2}\\&\quad + o(\varepsilon _j^{n+2}). \end{aligned} \end{aligned}$$

Thus,

$$\begin{aligned} \frac{-k\cdot e^{-2nk\lambda }}{t} \Big (|x^oy^o|^2-L^2_{t,\lambda }(x^o)\Big )-\frac{\theta _0}{n}+2\kappa (\delta _0,\delta _1)\geqslant 0. \end{aligned}$$
(6.40)
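
The passage from the preceding inequality to (6.40) is the usual division by the leading power. As a sketch, write \(\Lambda \) for the expression in the square bracket (a shorthand used only here), which does not depend on j. Assuming, as in the choice of the sequence, that \(\varepsilon _j\rightarrow 0^+\), we divide by \(\frac{\omega _{n-1}}{2(n+2)}\cdot \varepsilon _j^{n+2}>0\) to get

$$\begin{aligned} 0\leqslant \Lambda +o(1)\quad \mathrm{as}\ \ j\rightarrow \infty ,\qquad \Lambda =\frac{-k\cdot e^{-2nk\lambda }}{t}\Big (|x^oy^o|^2-L^2_{t,\lambda }(x^o)\Big )-\frac{\theta _0}{n}+2\kappa (\delta _0,\delta _1), \end{aligned}$$

and letting \(j\rightarrow \infty \) yields \(\Lambda \geqslant 0\).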

Recall that in Step 2, \(H_3(x,y)\) converges to \(H_0(x,y)\) as \(\delta _1\) and \(b_i\) tend to \(0^+\), \(1\leqslant i\leqslant 2n\). Since the point \(({\overline{x}},{\overline{y}})\) is the unique minimum of \(H_0\), we conclude that \((x^o,y^o)\) converges to \(({\overline{x}},{\overline{y}})\) as \(\delta _1\rightarrow 0^+\) and \(b_i\rightarrow 0^+\), \(1\leqslant i\leqslant 2n\). Hence, letting \(\delta _1\rightarrow 0^+\) and \(b_i\rightarrow 0^+\), \(1\leqslant i\leqslant 2n\), in (6.40), we obtain

$$\begin{aligned} \frac{-k\cdot e^{-2nk\lambda }}{t}\Big (|{\overline{x}}{\overline{y}}|^2- \liminf _{\delta _1\rightarrow 0^+,\ b_i\rightarrow 0^+}L^2_{t,\lambda }(x^o)\Big )-\frac{\theta _0}{n}+2\cdot \delta _0\cdot C_{k,U}\geqslant 0.\nonumber \\ \end{aligned}$$
(6.41)

On the other hand, by the lower semi-continuity of \(L_{t,\lambda }\) (from Lemma 6.2(i)), we have

$$\begin{aligned} \liminf _{\delta _1\rightarrow 0^+,\ b_i\rightarrow 0^+}L_{t,\lambda }(x^o)\geqslant L_{t,\lambda }({\overline{x}}). \end{aligned}$$

Therefore, by combining with (6.41), (6.19) and the fact \(-k\geqslant 0\), we have

$$\begin{aligned} 0\leqslant -\frac{\theta _0}{n}+2\cdot \delta _0\cdot C_{k,U}=-\frac{\theta _0}{n}+4\cdot \delta _0\cdot \big (1+\sqrt{-k}\cdot \mathrm{diam}(U)\big ). \end{aligned}$$

This contradicts (6.20) and completes the proof of the Claim, and hence that of the lemma. \(\square \)

Corollary 6.9

Given any domain \(\Omega ''\subset \subset \Omega '\), there exists a constant \(t_1>0\) such that, for each \(t\in (0,t_1)\) and each \(\lambda \in [0,1]\), the function \(x\mapsto f_t(x,\lambda )\) is a super-solution of the Poisson Eq. (6.18) on \(\Omega ''\).

Proof

For any \(p\in \Omega '\), by Lemma 6.7, there exists a neighborhood \(B_p(R_p)\) and a number \(t_p>0\) such that the function \(f_t(\cdot ,\lambda )\) is a super-solution of the Poisson Eq. (6.18) on \(B_p(R_p)\), for each \(t\in (0,t_p)\) and \(\lambda \in [0,1]\).

Given any \(\Omega '' \subset \subset \Omega '\), we have \(\overline{\Omega ''}\subset \cup _{p\in \Omega '} B_p(R_p/2).\) Since \(\overline{\Omega ''}\) is compact, there exist finitely many points \(p_1,p_2,\ldots , p_N\) such that \(\overline{\Omega ''}\subset \cup _{1\leqslant j\leqslant N} B_{p_j}(R_{p_j}/2).\) By the standard construction of a partition of unity, there exist Lipschitz functions \(0\leqslant \chi _j\leqslant 1\) on \(\Omega '\) with \(\mathrm{supp}\chi _j\subset B_{p_j}(R_{p_j})\) for each \(j=1,2,\ldots , N\) and \(\sum _{j=1}^N\chi _j(x)=1\) on \(\Omega ''\).

Take any nonnegative \(\phi \in Lip_0(\Omega '')\). Then \(\chi _j\phi \in Lip_0(B_{p_j}(R_{p_j}))\) for each \(j=1,2,\ldots , N.\) We thus obtain

$$\begin{aligned} \begin{aligned} -\int _{\Omega ''}\left<{\nabla f_t(\cdot ,\lambda )},{\nabla \phi }\right>d{\mathrm{vol}}&={\mathscr {L}}_{f_t(\cdot ,\lambda )}\left( \sum _{j=1}^N\chi _j\cdot \phi \right) =\sum _{j=1}^N{\mathscr {L}}_{f_t(\cdot ,\lambda )}(\chi _j\cdot \phi )\\&\leqslant \sum _{j=1}^N\int _{U_{p_j}}e^{-2nk\lambda }\cdot \frac{-nk}{t}L^2_{t,\lambda }\cdot (\chi _j\cdot \phi )d{\mathrm{vol}}\\&=\int _{\Omega ''}e^{-2nk\lambda }\cdot \frac{-nk}{t}L^2_{t,\lambda } \cdot \phi d{\mathrm{vol}}. \end{aligned} \end{aligned}$$

This completes the proof of the corollary. \(\square \)

In the following we want to show that the function \(f_t(\cdot ,\cdot )\) satisfies a parabolic differential inequality \({\mathscr {L}}_{f_t(x,\lambda )}\leqslant \partial f_t/\partial \lambda \).

Given a domain \(G\subset M\) and an interval \(I=(a,b)\), the product \(Q=G\times I\) is called a parabolic cylinder in space–time \(M\times {\mathbb {R}}\). We equip a parabolic cylinder Q with the product measure

$$\begin{aligned} \underline{\nu }:={\mathrm{vol}}\times {\mathcal {L}}^1. \end{aligned}$$

When \(G=B_{x_0}(r)\) and \(I=I_{\lambda _0}(r^2):=(\lambda _0-r^2,\lambda _0+r^2)\), we denote the corresponding cylinder by

$$\begin{aligned} Q_r(x_0,\lambda _0):= B_{x_0}(r)\times I_{\lambda _0}(r^2). \end{aligned}$$

If no confusion arises, we shall write it as \(Q_r\).

The theory of local weak solutions of the heat equation on metric spaces has been developed by Sturm in [56] and, recently, by Kinnunen–Masson [32] and Marola–Masson [41]. According to Lemma 6.1(iv), our auxiliary functions \(f_t(x,\lambda )\) are in \(W^{1,2}(\Omega ''\times (0,1))\). So we consider only weak solutions in \(W^{1,2}_{\mathrm{loc}}(Q)\). In such a case, the definition of a weak solution of the heat equation can be simplified as follows.

Definition 6.10

Let \(Q=G\times I\) be a cylinder. A function \(g(x,\lambda )\in W^{1,2}_{\mathrm{loc}}(Q)\) is said to be a (weak) super-solution of the heat equation

$$\begin{aligned} {\mathscr {L}}_{g}= \frac{\partial g}{\partial \lambda }\quad \mathrm{on}\ \ Q, \end{aligned}$$
(6.42)

if it satisfies

$$\begin{aligned} -\int _{Q}\left<{\nabla g},{\nabla \phi }\right>d\underline{\nu }(x,\lambda )\leqslant \int _{Q}\frac{\partial g}{\partial \lambda }\cdot \phi d\underline{\nu }(x,\lambda ) \end{aligned}$$

for all nonnegative functions \(\phi \in Lip_0(Q)\).

A function \(g(x,\lambda )\) is said to be a sub-solution of the Eq. (6.42) on Q if \(-g(x,\lambda )\) is a super-solution on Q. A function \(g(x,\lambda )\) is said to be a local weak solution of the Eq. (6.42) on Q if it is both a sub-solution and a super-solution on Q.

Remark 6.11

The test functions \(\phi \) in Definition 6.10 can also be chosen in Lip(Q) such that, for each \(\lambda \in I\), the function \( \phi (\cdot ,\lambda )\) is in \(Lip_0(G).\) That is, it vanishes only on the lateral boundary \(\partial G\times I\).

Lemma 6.12

Let \(Q=G\times I\) be a cylinder and let \(g(x,\lambda )\in W^{1,2}_{\mathrm{loc}}(Q)\). If, for almost all \(\lambda \in I\), the function \(x\mapsto g(x,\lambda )\) is a super-solution of the Poisson equation

$$\begin{aligned} {\mathscr {L}}_{g}=\frac{\partial g}{\partial \lambda }\cdot {\mathrm{vol}}\quad \mathrm{on}\ \ G, \end{aligned}$$
(6.43)

then \(g(x,\lambda )\) is a super-solution of the heat equation

$$\begin{aligned} {\mathscr {L}}_{g}=\frac{\partial g}{\partial \lambda } \quad \mathrm{on}\ \ Q. \end{aligned}$$

Proof

Take any nonnegative function \(\phi (x,\lambda )\in Lip_0(Q)\). Then, for each \(\lambda \in I\), the function \(\phi (\cdot ,\lambda )\) is in \(Lip_0(G).\) For almost all \(\lambda \in I\), since the function \(g(\cdot ,\lambda )\) is a super-solution of the Poisson Eq. (6.43) on G, we have

$$\begin{aligned} -\int _{G}\left<{\nabla g},{\nabla \phi }\right>d{\mathrm{vol}}=\int _{G}\phi d{\mathscr {L}}_{g} \leqslant \int _{G}\phi \cdot \frac{\partial g}{\partial \lambda }d{\mathrm{vol}}. \end{aligned}$$
(6.44)

Since \(g(x,\lambda )\in W_{\mathrm{loc}}^{1,2}(Q)\) and \(\phi (x,\lambda )\in Lip_0(Q)\), we know that \(|\left<{\nabla g},{\nabla \phi }\right>|\in L^2(Q)\) and that \(\phi \cdot \frac{\partial g}{\partial \lambda }\in L^2(Q).\) By Fubini’s theorem, we obtain

$$\begin{aligned} \begin{aligned} -\int _{G\times I}\left<{\nabla g(x,\lambda )},{\nabla \phi (x,\lambda )}\right>d\underline{\nu }(x,\lambda )&=-\int _I\int _{G}\left<{\nabla g},{\nabla \phi }\right>d{\mathrm{vol}}d\lambda \\ \overset{(6.44)}{\leqslant }\int _I \int _{G}\phi \cdot \frac{\partial g}{\partial \lambda } d{\mathrm{vol}}d\lambda&=\int _{G\times I}\phi \cdot \frac{\partial g}{\partial \lambda } d\underline{\nu }(x,\lambda ). \end{aligned} \end{aligned}$$

Thus, \(g(x,\lambda )\) is a super-solution of the heat equation \({\mathscr {L}}_{g}=\frac{\partial g}{\partial \lambda }\) on Q. \(\square \)

Now we are ready to show that the function \((x,\lambda )\mapsto f_t(x,\lambda )\) is a super-solution of the heat equation.

Proposition 6.13

Given any \(\Omega ''\subset \subset \Omega '\), let \(t_*:=\min \{t_0,t_1\}\), where \(t_0\) is given in Lemma 6.1, and \(t_1\) is given in Corollary 6.9. Then, for each \(t\in (0,t_*)\), the function \((x,\lambda )\mapsto f_t(x,\lambda )\) is a super-solution of

$$\begin{aligned} {\mathscr {L}}_{f_t(x,\lambda )} = \frac{\partial f_t(x,\lambda )}{\partial \lambda } \end{aligned}$$
(6.45)

on the cylinder \(\Omega ''\times (0,1)\).

Proof

From Lemma 6.1(iv), we know that \(f_t(x,\lambda )\in W^{1,2}(\Omega ''\times (0,1))\) for all \(t\in (0,t_*).\) According to Corollary 6.9, for each \(\lambda \in [0,1]\), the function \(f_{t}(\cdot ,\lambda )\) is a super-solution of the Poisson equation

$$\begin{aligned} {\mathscr {L}}_{f_t(\cdot ,\lambda )}=-e^{-2nk\lambda }\cdot \frac{nk}{t}L^2_{t,\lambda }\cdot {\mathrm{vol}}\qquad \mathrm{on }\ \ \Omega ''. \end{aligned}$$

On the other hand, by Lemma 6.3, we have

$$\begin{aligned} \frac{\partial f_t(x,\lambda )}{\partial \lambda }\geqslant - e^{-2nk\lambda }\cdot \frac{nk}{t}L^2_{t,\lambda }(x)\qquad \qquad \end{aligned}$$
(6.46)

for \(\underline{\nu }\)-a.e. \((x,\lambda )\in \Omega ''\times (0,1)\). We know that \(\frac{\partial f_t}{\partial \lambda }\in L^2(\Omega ''\times (0,1))\) from Lemma 6.1(iv). By Fubini’s theorem, we get that, for almost all \(\lambda \in (0,1)\), the inequality (6.46) holds for almost all \(x\in \Omega ''.\) Hence, for almost all \(\lambda \in (0,1)\), we have

$$\begin{aligned} {\mathscr {L}}_{f_t(\cdot ,\lambda )}\leqslant \frac{\partial f_t(x,\lambda )}{\partial \lambda }\cdot {\mathrm{vol}}\qquad \mathrm{on }\ \ \Omega ''. \end{aligned}$$

Therefore, the proposition follows from Lemma 6.12. \(\square \)

6.2 Lipschitz continuity of harmonic maps

In this subsection, we will prove our main Theorem 1.4.

We need the following weak Harnack inequality for sub-solutions of the heat equation (see Theorem 2.1 of [56] or Lemma 4.2 of [41]).

Lemma 6.14

[41, 56] Let \(G\times I\) be a parabolic cylinder in \(M\times {\mathbb {R}}\), and let \(g(x,\lambda )\) be a nonnegative, locally bounded sub-solution of the heat equation \({\mathscr {L}}_{g}=\frac{\partial g}{\partial \lambda }\) on \(Q_r\subset G\times I\). Then there exists a constant \(C=C(n,k,\mathrm{diam}G)\), depending only on n, k and \(\mathrm{diam}G\), such that

$$\begin{aligned} \mathrm{ess}\sup _{Q_{r/2}}g\leqslant \frac{C}{r^2\cdot {\mathrm{vol}}\big (B_x(r)\big )}\int _{Q_{r}}gd\underline{\nu }. \end{aligned}$$
(6.47)

Fix any domain \(\Omega '\subset \subset \Omega \). For any \(t>0\) and any \(0\leqslant \lambda \leqslant 1\), the function \(f_t(x,\lambda )\) is given in (6.1). Notice that

$$\begin{aligned} 0\leqslant -f_t(x,\lambda )\leqslant \mathrm{osc}_{\overline{\Omega '}}u. \end{aligned}$$
(6.48)

The following lemma is essentially a consequence of the above weak Harnack inequality.

Lemma 6.15

Let \(R\leqslant 1\) and let \(B_q(2R)\subset \subset \Omega '\) be a ball. Suppose that \(t_*\) is given in Proposition 6.13 for \(\Omega ''=B_q(2R)\). For each \(t\in (0,t_*)\) and \(\lambda \in (0,1)\), we define the function \(x\mapsto |\nabla ^-f_t(x,\lambda )|\) on \(B_q(2R)\) by

$$\begin{aligned} |\nabla ^-f_t(x,\lambda )|:=\limsup _{r\rightarrow 0}\sup _{y\in B_x(r)}\frac{\big (f_t(x,\lambda )-f_t(y,\lambda )\big )_+}{r}\qquad \forall x\in B_q(2R),\nonumber \\ \end{aligned}$$
(6.49)

where \(a_+=\max \{a,0\}.\)

Then, there exists a constant \(C_1(n,k,R)\) such that

$$\begin{aligned} \frac{1}{{\mathrm{vol}}\big (B_q(R)\big )}\int _{B_q(R)\times (\frac{1}{4},\frac{3}{4})}|\nabla ^-f_t(x,\lambda )|^2d\underline{\nu }\leqslant C_1(n,k,R)\cdot \mathrm{osc}_{\overline{\Omega '}}^2 u\nonumber \\ \end{aligned}$$
(6.50)

holds for all \(t\in (0,t_*)\).

Proof

1. First, let us consider an arbitrary function \(h\in W^{1,2}\big (B_q(R)\big )\). Take any \(\Omega _1\subset \subset B_q(R)\). According to Theorem 3.2 of [18], there exists a constant \({\overline{C}}={\overline{C}}(\Omega _1,B_q(R))\) such that for almost all \(x,y\in \Omega _1\) with \(|xy|\leqslant \mathrm{dist}(\Omega _1,\partial B_q(R))/{\overline{C}}\), we have

$$\begin{aligned} |h(x)-h(y)|\leqslant |xy|\cdot \Big (M(|\nabla h|)(x)+M(|\nabla h|)(y)\Big ), \end{aligned}$$

where Mw is the Hardy–Littlewood maximal function for the function \(w\in L^1_{\mathrm{loc}}(B_q(R))\)

$$\begin{aligned} Mw(x)=\sup _{s>0}\frac{1}{{\mathrm{vol}}(B_x(s))}\int _{B_x(s)\cap B_q(R)}|w|d{\mathrm{vol}}. \end{aligned}$$

Hence, for almost all \(x\in \Omega _1\), we have

$$\begin{aligned} \begin{aligned}&\fint _{B_{x}(r)}|h(x)-h(y)|d{\mathrm{vol}}(y)\\&\quad \leqslant r \cdot \fint _{B_{x}(r)}\Big (M(|\nabla h|)(x)+M(|\nabla h|)(y)\Big )d{\mathrm{vol}}(y)\\&\quad \leqslant r \cdot \Big (M(|\nabla h|)(x)+M[M(|\nabla h|)](x)\Big ) \end{aligned} \end{aligned}$$
(6.51)

for any \(r<\mathrm{dist}(\Omega _1,\partial B_q(R))/{\overline{C}}\).

2. Fix any \(t\in (0,t_*)\). We first introduce a function \(F(x,\lambda )\) on \( B_q(R)\times (0,1)\) as

$$\begin{aligned} F(x,\lambda ):=\limsup _{r\rightarrow 0}\frac{1}{r}\cdot \fint _{I_\lambda (r^2)}\fint _{B_x(r)}\big |f_t(x,\lambda )-f_t(x',\lambda ')\big |d{\mathrm{vol}}(x')d\lambda ' \end{aligned}$$

for any \((x,\lambda )\in \ B_q(R)\times (0,1)\), where \(I_\lambda (r^2)=(\lambda -r^2,\lambda +r^2)\). We claim that there exists a constant \(C_2(n,k,R)\) such that

$$\begin{aligned} \int _{ B_q(R)}F^2(x,\lambda )d{\mathrm{vol}}(x)\leqslant C_2(n,k,R)\cdot \int _{ B_q(R)}|\nabla f_t(x,\lambda )|^2d{\mathrm{vol}}(x)\nonumber \\ \end{aligned}$$
(6.52)

holds for all \(\lambda \in (0,1)\).

To justify this, let us fix any \(\lambda \in (0,1)\). According to Lemma 6.1(ii), we have \(f_t(\cdot ,\lambda ) \in W^{1,2}\big (B_q(R)\big ).\) Take any \(\Omega _1\subset \subset B_q(R)\). By applying (6.51) to the function \(f_t(\cdot ,\lambda ) \), we obtain that, for almost all \(x\in \Omega _1\),

$$\begin{aligned}&\fint _{B_{x}(r)}|f_t(x,\lambda ) -f_t(x',\lambda )| d{\mathrm{vol}}(x') \nonumber \\&\quad \leqslant r \cdot \Big (M(|\nabla f_t(\cdot ,\lambda ) |)(x)+M[M(|\nabla f_t(\cdot ,\lambda ) |)](x)\Big ) \end{aligned}$$
(6.53)

for all \(r<\mathrm{dist}(\Omega _1,\partial B_q(R))/{\overline{C}}(\Omega _1, B_q(R))\). Thus, for almost all \(x\in \Omega _1\), we can use Lemma 6.1(iii) to conclude

$$\begin{aligned} \begin{aligned} G_r(x,\lambda ):&= \frac{1}{r}\cdot \fint _{I_\lambda (r^2)}\fint _{B_x(r)}\big |f_t(x,\lambda )-f_t(x',\lambda ')\big |d{\mathrm{vol}}(x')d\lambda '\\&\leqslant \frac{1}{r}\cdot \fint _{I_\lambda (r^2)}\fint _{B_x(r)}\Big (\big |f_t(x',\lambda ')-f_t(x',\lambda )\big |+\big |f_t(x',\lambda )-f_t(x,\lambda )\big |\Big )d{\mathrm{vol}}(x')d\lambda '\\&\overset{(6.4)}{\leqslant } \frac{ e^{-2nk}\cdot C_*}{r}\cdot \fint _{I_\lambda (r^2)}|\lambda -\lambda '|d\lambda '\\&\quad + \frac{1}{r} \fint _{B_x(r)}\fint _{I_\lambda (r^2)}\big |f_t(x',\lambda )-f_t(x,\lambda )\big |d\lambda 'd{\mathrm{vol}}(x')\\&\overset{(6.53)}{\leqslant } e^{-2nk}\cdot C_*\cdot r + M(|\nabla f_t(\cdot ,\lambda ) |)(x)+M[M(|\nabla f_t(\cdot ,\lambda ) |)](x), \end{aligned} \end{aligned}$$

for all sufficiently small \(r>0,\) where we have used \(|\lambda '-\lambda |\leqslant r^2\). By the definition of \(F(x,\lambda )\), we have

$$\begin{aligned} F(x,\lambda )=\limsup _{r\rightarrow 0}G_r(x,\lambda )\leqslant M(|\nabla f_t(\cdot ,\lambda ) |)(x)+M[M(|\nabla f_t(\cdot ,\lambda ) |)](x) \end{aligned}$$
(6.54)

for almost all \(x\in \Omega _1\). By the arbitrariness of \(\Omega _1\subset \subset B_q(R)\), we know that (6.54) holds for almost all \(x\in B_q(R).\) Now the desired estimate (6.52) is implied by the \(L^2\)-boundedness of the maximal operator (see, for example, Theorem 14.13 in [18]). Notice that the norm \(\Vert M\Vert _{L^2\rightarrow L^2}\) of the maximal operator depends only on the doubling constant of \(B_q(R)\), and hence only on n, k and R.

According to Proposition 6.13, the function \((x,\lambda )\mapsto -f_t(x,\lambda )\) is a nonnegative sub-solution of the heat equation on \(B_q(2R)\times (0,1)\). By using the parabolic version of the Caccioppoli inequality (Lemma 4.1 in [41]), we can get

$$\begin{aligned}&\sup _{\frac{1}{4}\leqslant \lambda \leqslant \frac{3}{4}}\int _{B_q(R)}f_t^2(\cdot ,\lambda )d{\mathrm{vol}}+\int _{B_q(R)\times (\frac{1}{4}, \frac{3}{4})}|\nabla f_t|^2d\underline{\nu }\\&\quad \leqslant C_3(n,k,R)\cdot \int _{B_q(2R)\times (0,1)}f_t^2d\underline{\nu }, \end{aligned}$$

where we have used that \(R\leqslant 1\). In particular, by combining with (6.48), we have

$$\begin{aligned} \int _{B_q(R)\times (\frac{1}{4}, \frac{3}{4})}|\nabla f_t|^2d\underline{\nu }\leqslant C_3(n,k,R)\cdot {\mathrm{vol}}\big (B_q(2R)\big )\cdot \mathrm{osc}_{\overline{\Omega '}}^2u. \end{aligned}$$
(6.55)

On the other hand, fix any \((x,\lambda )\in B_q(R)\times (0, 1)\). From Proposition 6.13, we know that the function \(\big ( f_t(x,\lambda )-f_t(\cdot ,\cdot )\big )_+\) is a sub-solution of the heat equation on \(B_q(R)\times (0,1).\) According to Lemma 6.14 (noticing that \(f_t\) is continuous), there exists a constant \(C_4(n,k,R)\) such that

$$\begin{aligned}&\sup _{Q_{r/2}(x,\lambda )}\big (f_t(x,\lambda )-f_t(x',\lambda ')\big )_+\\&\quad \leqslant \frac{C_4(n,k,R)}{r^2\cdot {\mathrm{vol}}\big (B_x(r)\big )}\int _{Q_r(x,\lambda )}\big |f_t(x,\lambda )-f_t(x',\lambda ')\big |d\underline{\nu }(x',\lambda ') \end{aligned}$$

for all \(Q_r(x,\lambda )=B_x(r)\times I_\lambda (r^2)\subset \subset B_q(R)\times (0,1).\) Hence, by the definition of \(|\nabla ^-f_t|\) and F, we have

$$\begin{aligned} |\nabla ^-f_t(x,\lambda )|\leqslant 2C_4(n,k,R)\cdot F(x,\lambda ),\qquad \forall (x,\lambda )\in B_q(R)\times (0,1).\nonumber \\ \end{aligned}$$
(6.56)

By integrating (6.56) on \(B_q(R)\times (\frac{1}{4},\frac{3}{4})\) and combining with (6.52), (6.55), we have

$$\begin{aligned} \int _{B_q(R)\times (\frac{1}{4},\frac{3}{4})}|\nabla ^-f_t(x,\lambda )|^2d\underline{\nu }\leqslant 4C^2_4\cdot C_2\cdot C_3\cdot {\mathrm{vol}}\big (B_q(2R)\big )\cdot \mathrm{osc}_{\overline{\Omega '}}^2 u. \end{aligned}$$

By combining this with \({\mathrm{vol}}\big (B_q(2R)\big )\leqslant C_5(n,k,R)\cdot {\mathrm{vol}}\big (B_q(R)\big )\), we get the desired estimate (6.50). \(\square \)
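
The constant \(C_1\) can be traced through the argument as follows; this is only a bookkeeping sketch. Squaring (6.56), integrating (6.52) over \(\lambda \in (\frac{1}{4},\frac{3}{4})\), and then using (6.55), we have

$$\begin{aligned} \begin{aligned} \int _{B_q(R)\times (\frac{1}{4},\frac{3}{4})}|\nabla ^-f_t|^2d\underline{\nu }&\leqslant 4C_4^2\int _{\frac{1}{4}}^{\frac{3}{4}}\int _{B_q(R)}F^2(x,\lambda )d{\mathrm{vol}}(x)d\lambda \\&\leqslant 4C_4^2\cdot C_2\int _{B_q(R)\times (\frac{1}{4},\frac{3}{4})}|\nabla f_t|^2d\underline{\nu }\\&\leqslant 4C_4^2\cdot C_2\cdot C_3\cdot {\mathrm{vol}}\big (B_q(2R)\big )\cdot \mathrm{osc}_{\overline{\Omega '}}^2u, \end{aligned} \end{aligned}$$

so one admissible choice is \(C_1=4C_4^2\cdot C_2\cdot C_3\cdot C_5\).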

Now we are in a position to prove the main theorem.

Proof of Theorem 1.4

Let us fix a ball \(B_q(R)\) with \(B_q(2R)\subset \Omega \) and set \(\Omega '=B_q(R)\). Let \({\overline{t}}=\min \{t_*,R^2/(64+64\mathrm{osc}_{\overline{\Omega '}}u)\}\), where \(t_*\) is given in Proposition 6.13 for \(\Omega ''=B_q(R/2)\). Set

$$\begin{aligned} v(t,x,\lambda ):=-f_t(x,\lambda ),\qquad (t,x,\lambda )\in (0,{\overline{t}})\times B_q(R/2)\times [0,1]. \end{aligned}$$

According to Proposition 6.13, for each \(t\in (0,{\overline{t}})\), the function \(v(t,\cdot ,\cdot )\) is a sub-solution of the heat equation on the cylinder \(B_q(R/2)\times (0,1)\).

Next, we want to estimate \(\frac{\partial ^+}{\partial t}v(t,x,\lambda )\).

Sublemma 6.16

For any \(t\in (0,{\overline{t}})\) and any \((x,\lambda )\in B_q(R/4)\times (0,1)\), we have

$$\begin{aligned} \begin{aligned} \frac{\partial ^+}{\partial t}v(t,x,\lambda ):&=\limsup _{s\rightarrow 0^+}\frac{v(t+s,x,\lambda )-v(t,x,\lambda )}{s}\\&\leqslant \mathrm{Lip}^2u(x)+|\nabla ^-f_t(x,\lambda )|^2 \end{aligned} \end{aligned}$$
(6.57)

Proof

For convenience, we write

$$\begin{aligned} \rho (x,y):=d_Y\big (u(x),u(y)\big ) \end{aligned}$$

in the proof of this Sublemma.

Fix any \((x,\lambda )\in B_q(R/4)\times [0,1]\) and \(t+s\leqslant {\overline{t}}\). We can apply Lemma 6.1(i) to conclude

$$\begin{aligned} v(t+s,x,\lambda )= \sup _{y\in B_q(R/2)}\left\{ \rho (x,y)-e^{-2nk\lambda }\cdot \frac{|xy|^2}{2(t+s)}\right\} . \end{aligned}$$

We claim firstly that

$$\begin{aligned} \frac{|xy|^2}{2(t+s)}=\inf _{z\in \Omega '}\left\{ \frac{|xz|^2}{2s}+\frac{|yz|^2}{2t}\right\} . \end{aligned}$$

To justify this, we notice that, by the triangle inequality, any minimal geodesic \(\gamma \) between x and y lies in \( B_q(R)\). By taking \(z\in \gamma \) with \(|xz|=\frac{s}{s+t}|xy|\), we conclude that the left-hand side of the above is greater than or equal to the right-hand side. The converse inequality follows from the triangle inequality.
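
Both inequalities in the claim can be checked by a short computation, sketched here. If z lies on \(\gamma \) with \(|xz|=\frac{s}{s+t}|xy|\), then \(|yz|=\frac{t}{s+t}|xy|\) and

$$\begin{aligned} \frac{|xz|^2}{2s}+\frac{|yz|^2}{2t}=\frac{s\cdot |xy|^2}{2(s+t)^2}+\frac{t\cdot |xy|^2}{2(s+t)^2}=\frac{|xy|^2}{2(t+s)}. \end{aligned}$$

Conversely, for any \(z\in \Omega '\), the triangle inequality and the Cauchy–Schwarz inequality give

$$\begin{aligned} |xy|^2\leqslant \big (|xz|+|yz|\big )^2=\Big (\sqrt{s}\cdot \frac{|xz|}{\sqrt{s}}+\sqrt{t}\cdot \frac{|yz|}{\sqrt{t}}\Big )^2\leqslant (s+t)\cdot \Big (\frac{|xz|^2}{s}+\frac{|yz|^2}{t}\Big ), \end{aligned}$$

and dividing by \(2(s+t)\) shows that \(\frac{|xy|^2}{2(t+s)}\) is a lower bound for the quantity inside the infimum.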

Thus, we have

$$\begin{aligned} \begin{aligned} v(t+s,x,\lambda )&=\sup _{y\in B_q(R/2)}\left\{ \rho (x,y)-e^{-2nk\lambda }\cdot \inf _{z\in \Omega '}\left\{ \frac{|xz|^2}{2s}+\frac{|yz|^2}{2t}\right\} \right\} \\&=\sup _{y\in B_q(R/2)}\sup _{z\in \Omega '}\left\{ \rho (x,y)-e^{-2nk\lambda }\cdot \frac{|xz|^2}{2s}-e^{-2nk\lambda }\cdot \frac{|yz|^2}{2t}\right\} \\&\leqslant \sup _{z\in \Omega '}\sup _{y\in \Omega '}\left\{ \rho (x,z)+\rho (y,z)-e^{-2nk\lambda }\cdot \frac{|xz|^2}{2s}-e^{-2nk\lambda }\cdot \frac{|yz|^2}{2t}\right\} \\&\quad (\hbox {by the triangle inequality})\\&= \sup _{z\in \Omega '}\left\{ \rho (x,z)-e^{-2nk\lambda }\cdot \frac{|xz|^2}{2s}+v(t,z,\lambda )\right\} . \end{aligned} \end{aligned}$$

Hence, we can get

$$\begin{aligned} \begin{aligned}&\frac{v(t+s,x,\lambda )-v(t,x,\lambda )}{s}\\&\quad \leqslant \sup _{z\in \Omega '}\left\{ \frac{\rho (x,z)+v(t,z,\lambda )-v(t,x,\lambda )}{s}-e^{-2nk\lambda }\cdot \frac{|xz|^2}{2s^2}\right\} \\&\quad \leqslant \sup _{z\in \Omega '}\left\{ \frac{\rho (x,z)+v(t,z,\lambda )-v(t,x,\lambda )}{s}- \frac{|xz|^2}{2s^2}\right\} :=RHS, \end{aligned} \end{aligned}$$
(6.58)

where we have used that \(k\leqslant 0\). It is clear that \(RHS\geqslant 0\) (by taking \(z=x\)). On the other hand, if \(|xz|\geqslant s^{1/4},\) then

$$\begin{aligned} \frac{\rho (x,z)+v(t,z,\lambda )-v(t,x,\lambda )}{s}- \frac{|xz|^2}{2s^2}\leqslant & {} \frac{3\cdot \mathrm{osc}_{\overline{\Omega '}}u}{s}-\frac{s^{2/4}}{2s^2}\\\leqslant & {} \frac{6\mathrm{osc}_{\overline{\Omega '}}u-s^{-1/2}}{2s}<0 \end{aligned}$$

for any \(0<s< (6\mathrm{osc}_{\overline{\Omega '}}u)^{-2}\). Hence,

$$\begin{aligned} RHS=\sup _{|xz|<s^{1/4}}\left\{ \frac{\rho (x,z)+v(t,z,\lambda )-v(t,x,\lambda )}{s}-\frac{|xz|^2}{2s^2}\right\} \end{aligned}$$

for all sufficiently small \(s>0\). Now let us continue the calculation of (6.58). By using the Cauchy–Schwarz inequality, we have

$$\begin{aligned} \begin{aligned}&\frac{v(t+s,x,\lambda )-v(t,x,\lambda )}{s}\\&\quad \leqslant \sup _{|xz|<s^{1/4}}\left\{ \frac{\rho (x,z)+v(t,z,\lambda )-v(t,x,\lambda )}{s}-\frac{|xz|^2}{2s^2}\right\} \\&\quad \leqslant \sup _{|xz|<s^{1/4}}\left\{ \left( \frac{\rho (x,z)}{|xz|}+\frac{[v(t,z,\lambda )-v(t,x,\lambda )]_+}{|xz|}\right) \cdot \frac{|xz|}{s}-\frac{|xz|^2}{2s^2}\right\} \\&\quad \leqslant \frac{1}{2}\sup _{|xz|<s^{1/4}}\left( \frac{\rho (x,z)}{|xz|}+\frac{[f_t(x,\lambda )-f_t(z,\lambda )]_+}{|xz|}\right) ^2 \end{aligned} \end{aligned}$$

for all sufficiently small \(s>0\). Letting \(s\rightarrow 0^+\), we get the desired estimate (6.57). This completes the proof of the sublemma. \(\square \)
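
The last two inequalities in the final display of the proof rest on two elementary facts, recorded here as a sketch. For \(z\ne x\), we use the shorthand \(\alpha :=\rho (x,z)/|xz|\), \(\beta :=[f_t(x,\lambda )-f_t(z,\lambda )]_+/|xz|\) and \(\tau :=|xz|/s\), notation introduced only for this remark. Since \(v=-f_t\), the quantity inside the supremum is at most \((\alpha +\beta )\tau -\frac{\tau ^2}{2}\), and completing the square gives

$$\begin{aligned} (\alpha +\beta )\tau -\frac{\tau ^2}{2}=\frac{(\alpha +\beta )^2}{2}-\frac{\big (\tau -(\alpha +\beta )\big )^2}{2}\leqslant \frac{(\alpha +\beta )^2}{2}\leqslant \alpha ^2+\beta ^2. \end{aligned}$$

The two terms on the right are the ones that produce \(\mathrm{Lip}^2u(x)\) and \(|\nabla ^-f_t(x,\lambda )|^2\) in (6.57) after letting \(s\rightarrow 0^+\).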

Sublemma 6.17

We define a function \({\mathscr {H}}(t)\) on \((0,{\overline{t}})\) by

$$\begin{aligned} {\mathscr {H}}(t):=\frac{1}{{\mathrm{vol}}\big (B_q(R/4)\big )}\int _{B_q(R/4)\times (\frac{1}{4}, \frac{3}{4})}v(t,x,\lambda )d\underline{\nu }(x,\lambda ),\qquad t\in (0,{\overline{t}}). \end{aligned}$$

Then \({\mathscr {H}}(t)\) is locally Lipschitz in \((0,{\overline{t}})\).

Proof

For convenience, we continue to write \(\rho (x,y):=d_Y\big (u(x),u(y)\big )\) in the proof of this sublemma. Given any interval \([a,b]\subset (0,{\overline{t}})\), we have to show that \({\mathscr {H}}(t)\) is Lipschitz continuous on \([a,b]\).

Let us fix any \(t,t'\in [a,b]\). Take any \((x,\lambda )\in B_q(R/4)\times (0,1)\) and let \(y\in \Omega '\) achieve the maximum in the definition of \(v(t',x,\lambda )\). Then we have

$$\begin{aligned} \begin{aligned} v(t',x,\lambda )-v(t,x,\lambda )&=\rho (x,y)-e^{-2nk\lambda }\frac{|xy|^2}{2t'}-\sup _{z\in \Omega '}\left\{ \rho (x,z)\right. \\&\quad \left. -e^{-2nk\lambda }\frac{|xz|^2}{2t}\right\} \\&\leqslant e^{-2nk\lambda }\cdot \frac{|xy|^2}{2} \cdot \left( \frac{1}{t}-\frac{1}{t'}\right) \\&\leqslant e^{-2nk}\cdot \frac{\mathrm{diam}^2(\Omega ')}{2}\cdot \frac{|t'-t|}{a^2}, \end{aligned} \end{aligned}$$

where we have used that \(k\leqslant 0\), \(\lambda \leqslant 1\) and \(t',t\geqslant a\). By the symmetry of t and \(t'\), we have

$$\begin{aligned} |v(t',x,\lambda )-v(t,x,\lambda )|\leqslant e^{-2nk}\cdot \frac{\mathrm{diam}^2(\Omega ')}{2a^2}\cdot |t'-t|. \end{aligned}$$

Integrating this estimate over \(B_q(R/4)\times (\frac{1}{4},\frac{3}{4})\) yields the Lipschitz continuity of \({\mathscr {H}}(t)\) on \([a,b]\). Therefore, the proof of the sublemma is complete. \(\square \)
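
The factor \(\frac{|t'-t|}{a^2}\) in the estimate above comes from a one-line computation, recorded here as a sketch for \(t,t'\in [a,b]\):

$$\begin{aligned} \left| \frac{1}{t}-\frac{1}{t'}\right| =\frac{|t'-t|}{t\cdot t'}\leqslant \frac{|t'-t|}{a^2}. \end{aligned}$$

Consequently, after integration, a Lipschitz constant of \({\mathscr {H}}\) on \([a,b]\) is at most \(e^{-2nk}\cdot \frac{\mathrm{diam}^2(\Omega ')}{2a^2}\cdot \frac{\underline{\nu }\big (B_q(R/4)\times (\frac{1}{4},\frac{3}{4})\big )}{{\mathrm{vol}}\big (B_q(R/4)\big )}\).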

Now let us continue the proof of Theorem 1.4.

Fix any \(t\in (0,{\overline{t}})\). By Sublemma 6.16 and Sublemma 6.17, we can apply the dominated convergence theorem to conclude

$$\begin{aligned} \begin{aligned} \frac{d^+}{dt}{\mathscr {H}}(t)&=\limsup _{s\rightarrow 0^+}\frac{1}{{\mathrm{vol}}\big (B_q(R/4)\big )}\int _{B_q(R/4)\times (\frac{1}{4}, \frac{3}{4})}\frac{v(t+s,x,\lambda )-v(t,x,\lambda )}{s}d\underline{\nu } \\&\leqslant \frac{1}{{\mathrm{vol}}\big (B_q(R/4)\big )}\int _{B_q(R/4)\times (\frac{1}{4}, \frac{3}{4})}\limsup _{s\rightarrow 0^+}\frac{v(t+s,x,\lambda )-v(t,x,\lambda )}{s}d\underline{\nu }\qquad \quad \\&\leqslant \frac{1}{{\mathrm{vol}}\big (B_q(R/4)\big )}\int _{B_q(R/4)\times (\frac{1}{4}, \frac{3}{4})}\Big ( \mathrm{Lip}^2u(x)+|\nabla ^- f_t(x,\lambda )|^2(x)\Big )d\underline{\nu }. \end{aligned} \end{aligned}$$
(6.59)

Since \(B_q(3R/2)\subset \subset \Omega \), we can use Theorem 5.5 to obtain

$$\begin{aligned}&\int _{B_q(R/4)}\mathrm{Lip}^2u(x)d{\mathrm{vol}}(x)\\&\quad \leqslant C_1\cdot \int _{B_q(R/4)}|\nabla u|_2(x)d{\mathrm{vol}}(x)\leqslant C_1\cdot E^u_2\big (B_q(R/4)\big ). \end{aligned}$$

Here and in the rest of the proof, all constants \(C_1,C_2,\ldots \) depend only on \(n\), \(k\) and \(R\). By combining with Lemma 6.15 and (6.59), we have

$$\begin{aligned} \frac{d^+}{dt}{\mathscr {H}}(t)\leqslant & {} \frac{C_1}{2}\cdot \frac{ E^u_2\big (B_q(R/4)\big )}{{\mathrm{vol}}\big (B_q(R/4)\big )}+ C_2 \cdot \mathrm{osc}_{\overline{\Omega '}}^2u\\\leqslant & {} C_3\bigg (\frac{ E^u_2\big (B_q(R)\big )}{{\mathrm{vol}}\big (B_q(R)\big )}+ \mathrm{osc}_{\overline{\Omega '}}^2u\bigg ), \end{aligned}$$

where we have used that \({\mathrm{vol}}\big (B_q(R)\big )\leqslant C(n,k,R)\cdot {\mathrm{vol}}\big (B_q(R/4)\big )\). Setting

$$\begin{aligned} {\mathscr {A}}_{u,R}:=\bigg (\frac{ E^u_2\big (B_q(R)\big )}{{\mathrm{vol}}\big (B_q(R)\big )}\bigg )^{\frac{1}{2}}+\mathrm{osc}_{\overline{B_q(R)}}u, \end{aligned}$$

we have \(\frac{d^+}{dt}{\mathscr {H}}(t) \leqslant 2C_3\cdot {\mathscr {A}}^2_{u,R}.\)

We notice that \(\lim _{t\rightarrow 0^+}v(t,x,\lambda )=0\) for each given \((x,\lambda )\in B_q(R/4)\times (0,1)\). Indeed, from Lemma 6.1(i),

$$\begin{aligned} v(t,x,\lambda )= & {} \max _{\overline{B_x(\sqrt{C_*t})}}\left\{ d_Y(u(x),u(y))-e^{-2nk\lambda }\frac{|xy|^2}{2t}\right\} \\\leqslant & {} \max _{\overline{B_x(\sqrt{C_*t})}} d_Y(u(x),u(y)). \end{aligned}$$

By combining this with the continuity of u, we deduce that \(\lim _{t\rightarrow 0^+}v(t,x,\lambda )=0\). Since \(v(t,\cdot ,\cdot )\) is bounded, by (6.48), we can use the dominated convergence theorem to conclude that \(\lim _{t\rightarrow 0^+}{\mathscr {H}}(t)=0\). By combining this with Sublemma 6.17 and \(\frac{d^+}{dt}{\mathscr {H}}(t) \leqslant 2C_3\cdot {\mathscr {A}}^2_{u,R}\), we have

$$\begin{aligned} {\mathscr {H}}(t)\leqslant 2C_3\cdot t\cdot {\mathscr {A}}^2_{u,R}. \end{aligned}$$
(6.60)

for any \(t\in (0,{\overline{t}})\).
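
A sketch of how (6.60) follows from the three ingredients just listed. The symbols \(\varepsilon \), \(\tau \) and \(K:=2C_3\cdot {\mathscr {A}}^2_{u,R}\) are auxiliary notation introduced only for this remark. Since \({\mathscr {H}}\) is locally Lipschitz, it is absolutely continuous on every \([\varepsilon ,t]\subset (0,{\overline{t}})\), so it is differentiable almost everywhere, and at each point of differentiability \({\mathscr {H}}'=\frac{d^+}{dt}{\mathscr {H}}\leqslant K\). Hence

$$\begin{aligned} {\mathscr {H}}(t)-{\mathscr {H}}(\varepsilon )=\int _\varepsilon ^t{\mathscr {H}}'(\tau )d\tau \leqslant K\cdot (t-\varepsilon ), \end{aligned}$$

and letting \(\varepsilon \rightarrow 0^+\), together with \(\lim _{\varepsilon \rightarrow 0^+}{\mathscr {H}}(\varepsilon )=0\), gives \({\mathscr {H}}(t)\leqslant K\cdot t\).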

Recall from Proposition 6.13 that, for each \(t\in (0,{\overline{t}})\), the function \(v(t,\cdot ,\cdot )\) is a nonnegative sub-solution of the heat equation on the cylinder \(B_q(R/2)\times (0,1)\); hence so is the function \(\frac{v(t,\cdot ,\cdot )}{t}.\) By using Lemma 6.14 and \(R\leqslant 1\), we obtain

$$\begin{aligned} \sup _{B_q(R/8)\times (\frac{3}{8},\frac{5}{8})}\frac{v(t,x,\lambda )}{t}&\leqslant \frac{C_4}{R^2\cdot {\mathrm{vol}}\big (B_q(R/4)\big )}\int _{B_q(R/4)\times (\frac{1}{4},\frac{3}{4}) }\frac{v(t,x,\lambda )}{t}d\underline{\nu }(x,\lambda ) \\&= \frac{C_4}{R^2}\cdot \frac{{\mathscr {H}}(t)}{t} \overset{(6.60)}{\leqslant } \frac{C_4}{R^2}\cdot 2 C_3\cdot {\mathscr {A}}^2_{u,R}=: C_5\cdot {\mathscr {A}}^2_{u,R}. \end{aligned}$$
(6.61)

Given any \(x,y\in B_q(R/8)\), from the definition of \(v(t,x,\lambda )\), we can apply (6.61) to \(v(t,x,\frac{1}{2})\) and deduce

$$\begin{aligned} \frac{d_Y\big (u(x),u(y)\big )}{t}-e^{-nk}\frac{|xy|^2}{2t^2} \leqslant \frac{v\left( t,x,\frac{1}{2}\right) }{t}\leqslant C_5\cdot {\mathscr {A}}^2_{u,R} \end{aligned}$$
(6.62)

for all \(t\in (0,{\overline{t}}).\) Now, if \(|xy|< e^{nk/2}\cdot {\mathscr {A}}_{u,R}\cdot {\overline{t}}\), by choosing \(t=\frac{|xy|}{{\mathscr {A}}_{u,R}\cdot e^{nk/2}}\) in (6.62), we have

$$\begin{aligned} \frac{d_Y\big (u(x),u(y)\big )}{|xy|} \leqslant \Big (C_5+\frac{1}{2}\Big )\cdot e^{-nk/2} {\mathscr {A}}_{u,R}=:C_6\cdot {\mathscr {A}}_{u,R}. \end{aligned}$$
(6.63)
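
The substitution leading to (6.63) can be spelled out as follows; this is only a sketch, and it assumes \(x\not =y\) and \({\mathscr {A}}_{u,R}>0\) so that the chosen \(t\) is positive. With \(t=\frac{|xy|}{{\mathscr {A}}_{u,R}\cdot e^{nk/2}}\) we have \(\frac{|xy|^2}{t^2}=e^{nk}\cdot {\mathscr {A}}^2_{u,R}\), so (6.62) gives

$$\begin{aligned} \frac{d_Y\big (u(x),u(y)\big )}{t}\leqslant & {} C_5\cdot {\mathscr {A}}^2_{u,R}+e^{-nk}\cdot \frac{e^{nk}\cdot {\mathscr {A}}^2_{u,R}}{2}=\Big (C_5+\frac{1}{2}\Big )\cdot {\mathscr {A}}^2_{u,R},\\ d_Y\big (u(x),u(y)\big )\leqslant & {} \Big (C_5+\frac{1}{2}\Big )\cdot {\mathscr {A}}^2_{u,R}\cdot \frac{|xy|}{{\mathscr {A}}_{u,R}\cdot e^{nk/2}}=\Big (C_5+\frac{1}{2}\Big )\cdot e^{-nk/2}\cdot {\mathscr {A}}_{u,R}\cdot |xy|. \end{aligned}$$

The condition \(|xy|< e^{nk/2}\cdot {\mathscr {A}}_{u,R}\cdot {\overline{t}}\) is exactly what guarantees \(t<{\overline{t}}\), so that (6.62) is applicable.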

Finally, let \(x,y\in B_q(R/16).\) If \(|xy|< e^{nk/2}\cdot {\mathscr {A}}_{u,R}\cdot {\overline{t}}\), then (6.63) holds. If \(|xy|\geqslant e^{nk/2}\cdot {\mathscr {A}}_{u,R}\cdot {\overline{t}}\), we can take some minimal geodesic \(\gamma \) between x and y. The triangle inequality implies that \(\gamma \subset B_q(R/8)\). By choosing consecutive points \(x_1,x_2,\ldots , x_{N+1}\) on \(\gamma \) with \(x_1=x,\ x_{N+1}=y\) and \(|x_ix_{i+1}|< e^{nk/2}\cdot {\mathscr {A}}_{u,R}\cdot {\overline{t}}\) for each \(i=1,2,\ldots ,N\) and by using the triangle inequality and (6.63), we have

$$\begin{aligned} d_Y\big (u(x),u(y)\big )\leqslant & {} \sum _{i=1}^{N}d_Y\big (u(x_i),u(x_{i+1})\big ) \leqslant C_6\cdot {\mathscr {A}}_{u,R}\cdot \sum _{i=1}^{N}|x_i x_{i+1}|\\= & {} C_6\cdot {\mathscr {A}}_{u,R}\cdot |xy|. \end{aligned}$$

That is, (6.63) still holds. Therefore the proof of Theorem 1.4 is complete. \(\square \)
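
For completeness, here is a sketch of the inclusion \(\gamma \subset B_q(R/8)\) used in the last step; the point \(z\) is auxiliary notation for an arbitrary point of \(\gamma \). Since \(x,y\in B_q(R/16)\), we have \(|xy|\leqslant |xq|+|qy|<R/8\), and since \(\gamma \) is a minimal geodesic, \(|xz|+|zy|=|xy|\). Hence \(\min \{|xz|,|zy|\}\leqslant |xy|/2<R/16\). If, say, \(|xz|\leqslant |zy|\), then

$$\begin{aligned} |qz|\leqslant |qx|+|xz|<\frac{R}{16}+\frac{R}{16}=\frac{R}{8}, \end{aligned}$$

and the case \(|zy|<|xz|\) is symmetric. The same identity \(|xz|+|zy|=|xy|\), applied along consecutive points of \(\gamma \), gives \(\sum _{i=1}^{N}|x_ix_{i+1}|=|xy|\).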