1 Introduction

A family \(\{M_t\}_{t\ge 0}\) of hypersurfaces in \({\mathbb R}^n\) is called a mean curvature flow (MCF) if the velocity vector v of \(M_t\) is equal to its mean curvature vector h at each point and time, that is,

$$\begin{aligned} v=h\ \quad \text{ on } \;M_t. \end{aligned}$$
(1.1)

As one of the fundamental geometric evolution problems, the MCF has been studied by numerous researchers in the past few decades. One of many facets of investigations is the time-global existence question of such a family when given an initial hypersurface \(M_0\). In general dimensions, there exists a unique smooth family of MCF for finite time until singularities such as vanishing and pinch-off occur. Though the classical MCF ceases to exist at this point, it is well-known that a unique time-global solution \(\{M_t\}_{t\ge 0}\) exists in a weak viscosity sense [11, 16] despite the occurrence of singularities.
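As a simple illustration of (1.1) and of the vanishing singularity mentioned above, consider the round sphere \(M_t=\partial B_{R(t)}\) in \({\mathbb R}^n\). The following worked example assumes the convention, consistent with (2.4) below, that h is the sum (not the average) of the principal curvatures times the unit normal, so that h points inward with magnitude \((n-1)/R\). Then (1.1) reduces to an ordinary differential equation for the radius:

$$\begin{aligned} \frac{dR}{dt}=-\frac{n-1}{R},\qquad R(0)=R_0, \qquad \text{ hence } \quad R(t)=\sqrt{R_0^2-2(n-1)t}. \end{aligned}$$

The flow is smooth for \(0\le t<R_0^2/(2(n-1))\) and the sphere shrinks to a point at \(t=R_0^2/(2(n-1))\), after which the classical solution ceases to exist.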

In this paper, we are interested in an aspect of time-global existence theory for a related problem, and the question we ask is the following. Given an initial hypersurface \(M_0\) and a vector field u, is there a family \(\{M_t\}_{t\ge 0}\) of hypersurfaces whose velocity vector v is equal to its mean curvature h plus u? What is the minimum regularity assumption on u for the existence and regularity of such a family? To be more precise, since we would be interested in the normal velocity to see the motion, the requirement is

$$\begin{aligned} v=h+(u\cdot \nu )\nu \quad \text{ on } \;M_t \end{aligned}$$
(1.2)

where \(\nu \) is the unit normal vector field of \(M_t\) and \(\ \cdot \ \) is the inner product in \({\mathbb R}^n\). Motivation to investigate (1.2) is more than just to see what happens when an extra lower order term is added. While the MCF is of premier importance, one wonders what is the limit of applicability of various analytic techniques developed for the MCF if one puts a wild perturbation. In a reverse context, if one understands the limit of generality of the MCF, then some of the analytic techniques developed for more general settings may be useful for the MCF. In fact, our investigation on (1.2) has already led us to the development of a local regularity theory [30, 46] which gives new insight to the MCF. Physically, one may regard (1.2) as a surface tension driven phase boundary motion with a given background transport effect such as fluid flow or external force field. One can also find such motion law in a coupled system with the Navier-Stokes equation modeling a flow of dry foam (see, for example, [31] for the numerical simulation and references therein).

Though far from complete, in this paper we obtain satisfactory time-global existence and regularity theorems if we assume that \(M_0\) is \(C^1\) and u satisfies

$$\begin{aligned} \left( \int _0^T\left( \int _{{\mathbb R}^n} |u(x,t)|^p+|\nabla u(x,t)|^p\, dx\right) ^{\frac{q}{p}}\, dt\right) ^{\frac{1}{q}}<\infty \end{aligned}$$
(1.3)

for all \(T<\infty \), with \(2<q<\infty \) and \(\frac{nq}{2(q-1)}<p<\infty \) (\(\frac{4}{3}\le p\) in addition if \(n=2\)). Here \(\nabla u=(\partial _{x_1}u,\ldots ,\partial _{x_n}u)\) denotes the weak partial derivatives of u, and \(u,\nabla u\) are measurable with the stated integrability. We prove that the hypersurfaces remain \(C^1\) at least for a short time, and that they are \(C^1\) almost everywhere away from the region where \(M_t\) develops higher multiplicities. Under a stronger regularity assumption on u such as Hölder continuity, we obtain \(C^2\) instead of \(C^1\), and (1.2) is satisfied classically. For the precise statement of the regularity, see Theorem 2.3.

Here we briefly discuss our approach. If u is regular enough with respect to x, for example Lipschitz continuous, the level set method approach works well with a good order preserving property (see, for example, [22] and [20, Sect. 4.8]). Also for regular enough u, there are a number of short time existence results which are often stated for the MCF but which can be extended to include regular u: (1) solving an evolution equation for the height function from the reference initial manifold [10], (2) solving equations for the signed distance function [17] (and elaborated further in [21]), and (3) constructing an approximate solution by time-discrete minimal movement [3], just to name a few examples. On the other hand, with irregular u, one cannot expect the order preserving property in general, and even the short time existence of a solution can be a serious issue. Hence to characterize (1.2), we take an approach pioneered by Brakke [6] using the notion of varifold from geometric measure theory. To construct a sequence of approximate solutions, we use the Allen-Cahn equation [2] with an extra transport term coming from u, namely (3.5). Much of the analysis of the present paper concerns various \(\varepsilon \)-independent estimates of quantities associated with the solution \(\varphi _{\varepsilon }\) of (3.5). We obtain a desired solution by taking a limit \(\varepsilon \rightarrow 0\). Thus the analysis of (3.5) itself may also be of interest. Once we verify that the limit satisfies (1.2) in a weak sense of varifold as in Brakke’s formulation, we apply a local regularity theory developed in [30, 46] which is tailor-made for the present problem. To our knowledge, under the assumption (1.3) on u, even the short time existence of a \(C^1\) solution seems new.

As for the MCF in general, there are a number of books and papers some of which include up-to-date research results on the subject and we mention [4, 5, 12, 14, 20, 35, 48]. Concerning a time-global existence for the MCF and the related problems, we mention [3, 6, 11, 16, 29, 34] and references therein. While there are numerous works with varying generalities establishing the connection between the Allen-Cahn equation and the MCF (for example [7, 9, 13, 15, 19, 39]), analysis of the Allen-Cahn equation using geometric measure theory was pioneered by Ilmanen [28] in which he proved that the limit surface measures are rectifiable and satisfy (1.1) in the sense of Brakke’s formulation. The second author proved that the limit surface measures are integral [45]. There are a number of closely related works even if we restrict the scope within some measure theoretic approach to the Allen-Cahn equation, and we further mention [37, 40, 42, 43] and references therein. The existence result of the present paper has been proved by Liu et al. [33] for \(n=2,3\) and with more restrictive assumptions on p and q. The limitation of the dimensions was due to the use of results by Röger and Schätzle [38], which gives a characterization of limit measures under an assumption of uniform \(L^2\) bound of mean curvature-like quantity. In the present paper, we avoid using [38], and we follow the line of proofs of [28, 45] combined with various estimates from [33]. This frees us from any dimensional restriction. As a special case, the first author investigated the graph-like problem of (1.2) with a better regularity assumption on u and showed a unique short time existence [44].

The paper is organized as follows. In Sect. 2 we set our notations and explain the main results. In Sect. 3 we briefly discuss some heuristic aspects of the Allen-Cahn equation. Section 4 deals with the uniform upper density ratio bound and monotonicity formula, and this is the key to control the transport term subsequently. In Sect. 5, we show that there exists a limit surface measure for all \(t\ge 0\). Section 6 proves that the limit measure is rectifiable and this part owes much to Ilmanen’s work [28]. In Sect. 7, we prove that the limit measure has integer density modulo surface energy constant. There, the idea of proof goes back to [27] and the parabolic version [45]. In Sect. 8 we prove the main results by combining all the results from the previous four sections. We record our final remarks in the last section, Sect. 9. We intended the paper to be as self-contained as possible, the only exception being the proof for regularity. There we cite the main local regularity theorem, which has a set of assumptions we need to check.

2 Preliminaries and main results

2.1 Basic notation

Let \({\mathbb N}\) be the set of natural numbers and \({\mathbb R}^+:=\{x\ge 0\}\). For \(0<r<\infty \) and \(a\in {\mathbb R}^k\) define

$$\begin{aligned} B_r^k(a):=\{x\in {\mathbb R}^k\,:\, |x-a|<r\}. \end{aligned}$$

We write \(B_r^k:=B_r^k(0)\). When \(k=n\), we omit writing n. We often identify \({\mathbb R}^{n-1}\) with \({\mathbb R}^{n-1}\times \{0\}\subset {\mathbb R}^n\). On \({\mathbb R}^n\) we denote the Lebesgue measure by \({\mathcal L}^n\) and for \(0\le k\le n\), the k-dimensional Hausdorff measure by \({\mathcal H}^{k}\). Define \(\omega _n:={\mathcal L}^n(B_1)\). Given a set \(A\subset {\mathbb R}^n\) and a measure \(\mu \), the restriction of \(\mu \) to A is denoted by \(\mu \lfloor _A\). The characteristic function of A is denoted by \(\chi _A\). Symbol \(\nabla \) always refers to a differentiation with respect to the space variables. For a set of finite perimeter (see [24] for the definition) A, we denote the total variation measure of the distributional derivative \(\nabla \chi _A\) by \(\Vert \nabla \chi _A\Vert \).

Throughout the paper, we set \(\Omega \) to be either \({\mathbb T}^n\), the n-dimensional unit torus, or \({\mathbb R}^n\). For \(\Omega ={\mathbb T}^n\) we often regard \(\Omega \) as the unit cube \([0,1)\times \cdots \times [0,1)\subset {\mathbb R}^n\) where all the relevant quantities are extended periodically to the entire \({\mathbb R}^n\). Objects such as functions and sets in \(\Omega \) are understood implicitly in this manner. For any Radon measure \(\mu \) on \({\mathbb R}^n\) and \(\phi \in C_c({\mathbb R}^n)\) we often write \(\mu (\phi )\) for \(\int \phi \, d\mu \). We write \(\mathrm{spt}\,\mu \) for the support of \(\mu \). Thus \(x\in \mathrm{spt}\,\mu \) if \(\forall r>0\), \(\mu (B_r(x))>0\). For \(1\le p\le \infty \), we write \(f\in L^p(\mu )\) if f is \(\mu \) measurable and \((\int |f|^p\, d\mu )^{1/p}<\infty \). We use the standard notation for Sobolev spaces such as \(W^{1,p}(\Omega )\) and \(W^{1,p}_{loc}(\Omega )\) from [23].

For \(A,B\in \mathrm{Hom}({\mathbb R}^n;{\mathbb R}^n)\) which we identify with \(n\times n\) matrices, we define

$$\begin{aligned} A\cdot B:=\sum _{i,j}A_{ij}B_{ij} \quad \text {and} \quad |A|:=\sqrt{A\cdot A}. \end{aligned}$$

\(\Vert A\Vert \) denotes the operator norm. The identity of \(\mathrm{Hom}({\mathbb R}^n;{\mathbb R}^n)\) is denoted by I. For \(k\in {\mathbb N}\) with \(k<n\), let \(\mathbf{G}(n,k)\) be the space of k-dimensional subspaces of \({\mathbb R}^n\). The orthogonal complement of \(S\in \mathbf{G}(n,k)\) is denoted by \(S^{\perp }\in \mathbf{G}(n,n-k)\). For \(a\in \mathbb {R}^n\), \(a\otimes a \in \mathrm{Hom}({\mathbb R}^n;{\mathbb R}^n)\) is the matrix with the entries \(a_ia_j\) (\(1\le i,j\le n\)). For \(S\in \mathbf{G}(n,k)\), we identify S with the corresponding orthogonal projection of \({\mathbb R}^n\) onto S. In the case of \(k=n-1\), we also identify \(S\in \mathbf{G}(n,n-1)\) with the unit vector \(\pm \nu \in {\mathbb S}^{n-1}\) which is perpendicular to S. Note that we may express the relation by \(S=I-\nu \otimes \nu \). The correspondence is a homeomorphism with respect to the naturally endowed topologies on \(\mathbf{G}(n,n-1)\) and \({\mathbb S}^{n-1}/\{\pm 1\}\). For \(x,y\in {\mathbb R}^n\) and \(t<s\) define

$$\begin{aligned} \rho _{(y,s)} (x,t) := \frac{1}{(4\pi (s-t))^{\frac{n-1}{2}}} e^{-\frac{|x-y|^2}{4(s-t)}}, \end{aligned}$$
(2.1)

which is the backward heat kernel with pole at \((y,s)\).
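The normalization in (2.1) uses the exponent \(\frac{n-1}{2}\) rather than \(\frac{n}{2}\), so the kernel has unit mass on \((n-1)\)-dimensional planes through the pole rather than on \({\mathbb R}^n\). The following is a minimal numerical sketch of this fact for \(n=2\); the function names `rho` and `mass_on_line`, the grid size and the truncation width are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def rho(x, t, y, s, n):
    """Backward heat kernel (2.1) with pole at (y, s), evaluated at (x, t), t < s.

    x may be a single point of shape (n,) or an array of points of shape (N, n).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    tau = s - t
    if tau <= 0:
        raise ValueError("the kernel is only defined for t < s")
    sq_dist = np.sum((x - y) ** 2, axis=-1)
    return np.exp(-sq_dist / (4.0 * tau)) / (4.0 * np.pi * tau) ** ((n - 1) / 2.0)


def mass_on_line(t, s, half_width=40.0, samples=200001):
    """For n = 2, integrate rho over the line R x {0} through the pole y = 0.

    The line is truncated to [-half_width, half_width] (an assumption that is
    harmless when s - t is of order one, since the Gaussian tail is negligible).
    """
    xs = np.linspace(-half_width, half_width, samples)
    points = np.stack([xs, np.zeros_like(xs)], axis=-1)
    values = rho(points, t, np.zeros(2), s, n=2)
    dx = xs[1] - xs[0]
    # Composite trapezoidal rule written out explicitly.
    return float(dx * (values.sum() - 0.5 * (values[0] + values[-1])))
```

Evaluating `mass_on_line(0.0, 1.0)` should return a value numerically indistinguishable from 1, which is the scaling that makes \(\rho \) a natural weight for \((n-1)\)-dimensional surface measures.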

2.2 Varifolds

We recall some definitions from geometric measure theory and refer to [1, 6, 41] for more details. For any open set \(U\subset {\mathbb R}^n\) let \(G_{k}(U):=U\times \mathbf{G}(n,k)\). A general k-varifold in U is a Radon measure on \(G_k(U)\). We denote the set of all general k-varifolds in U by \(\mathbf {V}_k(U)\). For \(V\in \mathbf {V}_k(U)\), let \(\Vert V\Vert \) be the weight measure of V, namely,

$$\begin{aligned} \Vert V\Vert (\phi ):=\int _{G_k(U)} \phi (x)\, dV(x,S),\quad \forall \, \phi \in C_c(U). \end{aligned}$$

We say \(V\in \mathbf {V}_k(U)\) is rectifiable if there exist an \({\mathcal H}^k\) measurable countably k-rectifiable set \(M\subset U\) and a locally \({\mathcal H}^k\) integrable function \(\theta \) defined on M such that

$$\begin{aligned} V(\phi )=\int _{M} \phi (x,\mathrm{Tan}_x M)\theta (x)\, d{\mathcal H}^k \end{aligned}$$
(2.2)

for \(\phi \in C_c (G_k(U))\). Here \(\mathrm{Tan}_x M\) is the approximate tangent space of M at x which exists \({\mathcal H}^k\) a.e. on M. A rectifiable k-varifold is uniquely determined by its weight measure \(\Vert V\Vert =\theta \,{\mathcal H}^{k}\lfloor _{M}\) through the formula (2.2). For this reason, we naturally say a Radon measure \(\mu \) on U is rectifiable when one can associate a rectifiable varifold V such that \(\Vert V\Vert =\mu \). If \(\theta \in {\mathbb N}\), \({\mathcal H}^k\) a.e. on M, we say V is integral. The set of all integral k-varifolds in U is denoted by \(\mathbf{IV}_k(U)\). If \(\theta =1\), \({\mathcal H}^k\) a.e. on M, we say V is a unit density k-varifold.

For \(V\in \mathbf {V}_k(U)\) let \(\delta V\) be the first variation of V, namely,

$$\begin{aligned} \delta V(g):= \int _{G_k(U)} \nabla g(x) \cdot S \, dV(x,S) \end{aligned}$$
(2.3)

for \(g\in C_c^1(U\,;\,{\mathbb R}^n)\). If the total variation \(\Vert \delta V\Vert \) of \(\delta V\) is locally bounded and absolutely continuous with respect to \(\Vert V\Vert \), by the Radon-Nikodym theorem, we have a \(\Vert V\Vert \) measurable vector field \(h(V,\cdot )\) with

$$\begin{aligned} \delta V(g)=-\int _U g(x)\cdot h(V,x)\, d\Vert V\Vert (x). \end{aligned}$$
(2.4)

The vector field \(h(V,\cdot )\) is called the generalized mean curvature vector of V. For any \(V\in \mathbf{IV}_k(U)\) with an integrable \(h(V,\cdot )\), Brakke’s perpendicularity theorem [6, Chapter 5] says that we have

$$\begin{aligned} \int _U (\mathrm{Tan}_x M)^{\perp }(g(x))\cdot h(V,x)\, d\Vert V\Vert (x)= \int _U g(x)\cdot h(V,x)\, d\Vert V\Vert (x) \end{aligned}$$
(2.5)

for all \(g\in C_c(U;{\mathbb R}^n)\). Here, M is related to V as in (2.2). In the case of \(k=n-1\), note that \((\mathrm{Tan}_x M)^{\perp }=\nu (x)\otimes \nu (x)\) for \(\Vert V\Vert \) a.e. in U, where \(\nu (x)\) is the unit normal vector to \(\mathrm{Tan}_x M\). With this notation (2.5) may be written as

$$\begin{aligned} \int _U (g(x)\cdot \nu (x))(h(V,x)\cdot \nu (x))\, d\Vert V\Vert (x)=\int _U g(x)\cdot h(V,x)\, d\Vert V\Vert (x) \end{aligned}$$
(2.6)

for \(g\in C_c(U;{\mathbb R}^n)\). If \(h(V,\cdot )\in L^2(\Vert V\Vert )\), by approximation, (2.6) holds even for \(g\in L^2(\Vert V\Vert )\).
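As a sanity check of the sign conventions in (2.3) to (2.6), the following worked example (not taken from the paper) treats the unit density varifold V associated with a round sphere \(M=\partial B_R\subset U\), with \(k=n-1\) and outward unit normal \(\nu (x)=x/R\). Since \(\mathrm{Tan}_x M=I-\nu \otimes \nu \), the integrand \(\nabla g\cdot S\) in (2.3) is the tangential divergence of g, and the divergence theorem on the closed surface M gives

$$\begin{aligned} \delta V(g)=\int _{M} \nabla g\cdot (I-\nu \otimes \nu )\, d{\mathcal H}^{n-1} =\frac{n-1}{R}\int _M g\cdot \nu \, d{\mathcal H}^{n-1}, \qquad \text{ so } \quad h(V,x)=-\frac{n-1}{R}\,\nu (x)=-\frac{(n-1)\,x}{R^2}. \end{aligned}$$

In particular h is the sum of the principal curvatures times the inward normal. Because h is parallel to \(\nu \), one has \((g\cdot \nu )(h\cdot \nu )=g\cdot h\) pointwise, so (2.6) holds trivially in this smooth case.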

2.3 Weak formulation of velocity

Let \(\{M_t\}_{t\ge 0}\) be a family of smooth hypersurfaces in \(\Omega \) whose normal velocity is denoted by v. To formulate the velocity in a weak sense, observe the following characterization of v: a smooth normal vector field \(\tilde{v}\) on \(M_t\) is equal to v if and only if

$$\begin{aligned} \frac{d}{dt}\int _{M_t} \phi \, d{\mathcal H}^{n-1}\le \int _{M_t} (\nabla \phi -h\phi )\cdot \tilde{v}+\partial _t \phi \, d{\mathcal H}^{n-1} \end{aligned}$$
(2.7)

holds for all \(\phi \in C_c^1(\Omega \times [0,\infty );{\mathbb R}^+)\) and for all \(t\ge 0\). Here h is the classical mean curvature vector of \(M_t\). To check this claim, after some calculation, one first sees that v satisfies (2.7) with equality. Conversely, if \(\tilde{v}\) satisfies (2.7), and already knowing that v satisfies (2.7) with equality, we obtain

$$\begin{aligned} 0\le \int _{M_t}(\nabla \phi -h\phi )\cdot (\tilde{v} -v)\, d{\mathcal H}^{n-1} \end{aligned}$$

for \(\phi \in C_c^1(\Omega ;{\mathbb R}^+)\). For any \(\hat{x}\in M_t\) and \(\lambda >0\), let \(\phi _{\lambda } (y):=\lambda ^{2-n}\phi (\lambda ^{-1}(y-\hat{x}))\). Substitute \(\phi _{\lambda }\) and let \(\lambda \downarrow 0\). Since \(\lambda ^{-1}(M_t-\hat{x})\rightarrow \mathrm{Tan}_{\hat{x}} M_t\), we obtain

$$\begin{aligned} 0\le \int _{\mathrm{Tan}_{\hat{x}} M_t} \nabla \phi \, d{\mathcal H}^{n-1} \cdot (\tilde{v} (\hat{x})-v (\hat{x})). \end{aligned}$$

The integration by parts shows \(\int _{\mathrm{Tan}_{\hat{x}} M_t} \nabla \phi \, d{\mathcal H}^{n-1}\perp \mathrm{Tan}_{\hat{x}} M_t\). On the other hand, one may choose this vector to be \(-(\tilde{v} (\hat{x})-v(\hat{x}))\), for example. Thus we have \(\tilde{v} (\hat{x}) =v(\hat{x})\) and we complete the proof of the claim. The characterization (2.7) motivates the following definition.
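The blow-up step in the argument above can be spelled out as follows. This is a sketch of the scaling computation, under the assumption that \(\phi \in C_c^1({\mathbb R}^n;{\mathbb R}^+)\) is fixed and \(\lambda \) is small enough that \(\phi _{\lambda }\) is supported in \(\Omega \); the shorthand \(w:=\tilde{v}-v\) is introduced here for convenience. With the change of variables \(y=\hat{x}+\lambda z\) we have \(\nabla \phi _{\lambda }(y)=\lambda ^{1-n}(\nabla \phi )(z)\) and \(d{\mathcal H}^{n-1}(y)=\lambda ^{n-1}d{\mathcal H}^{n-1}(z)\), so that

$$\begin{aligned} \int _{M_t}\nabla \phi _{\lambda }\cdot w\, d{\mathcal H}^{n-1}&=\int _{\lambda ^{-1}(M_t-\hat{x})}\nabla \phi (z)\cdot w(\hat{x}+\lambda z)\, d{\mathcal H}^{n-1}(z),\\ \int _{M_t}\phi _{\lambda }\, h\cdot w\, d{\mathcal H}^{n-1}&=\lambda \int _{\lambda ^{-1}(M_t-\hat{x})}\phi (z)\, (h\cdot w)(\hat{x}+\lambda z)\, d{\mathcal H}^{n-1}(z). \end{aligned}$$

As \(\lambda \downarrow 0\), the first integral converges to \(\int _{\mathrm{Tan}_{\hat{x}} M_t}\nabla \phi \, d{\mathcal H}^{n-1}\cdot w(\hat{x})\), while the second is of order \(\lambda \) and vanishes. The factor \(\lambda ^{2-n}\) in \(\phi _{\lambda }\) is thus exactly the one that keeps the gradient term of order one and makes the curvature term negligible.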

Definition 2.1

A family of varifolds \(\{V_t\}_{t\ge 0}\subset \mathbf{V}_{n-1}(\Omega )\) is a generalized solution of (1.2) if the following four conditions are satisfied.

  1. (a)

    \(V_t\in \mathbf{IV}_{n-1}(\Omega )\) for a.e. \(t\ge 0\).

  2. (b)

    For all \(T>0\),

    $$\begin{aligned} \sup _{t\in [0,T]} \Vert V_t\Vert (\Omega )<\infty \ \ \text{ and } \sup _{t\in [0,T],\, B_r(x)\subset \Omega }\frac{\Vert V_t\Vert (B_r(x))}{\omega _{n-1} r^{n-1}}<\infty . \end{aligned}$$
    (2.8)
  3. (c)

    For all \(T>0\),

    $$\begin{aligned} \int _0^T dt \int _{\Omega } |h|^2+|u|^2 \, d\Vert V_t\Vert <\infty . \end{aligned}$$
    (2.9)
  4. (d)

    For all \(\phi \in C^1_c(\Omega \times [0,\infty ) ; {\mathbb R}^+)\) and \(0\le t_1<t_2<\infty \),

    $$\begin{aligned} \Vert V_{t}\Vert (\phi (\cdot ,t))\Big |_{t=t_1}^{t_2}\le \int _{t_1}^{t_2}dt\int _{\Omega } (\nabla \phi -h\phi )\cdot \{h+ (u\cdot \nu )\nu \}+\partial _t\phi \, d\Vert V_t\Vert \end{aligned}$$
    (2.10)

    holds, where we abbreviated \(h(V_t,x)\) by h.

The condition (b) may appear out of place in the definition of velocity. In fact, if u is 0 or a bounded function and if \(\Vert V_0\Vert \) satisfies (2.8), one can derive (2.8) as a consequence of (2.10) via Huisken’s monotonicity formula. However, if u is not bounded, it is not clear how to obtain (2.8) from (2.10). The other important point is that, unless one has (2.8), it is unclear how to make sense of (2.9) and (2.10). The difficulty is that \(u(\cdot ,t)\) needs to be defined as a \(\Vert V_t\Vert \) measurable function for a.e. \(t\ge 0\). In general, \(u(\cdot ,t)\) is assumed to be in some Sobolev space on \(\Omega \), and we need to define \(\Vert V_t\Vert \) measurable \(u(\cdot ,t)\) as a trace function. If we have (2.8), we may define the trace using the following inequality.

Theorem 2.1

For a Radon measure \(\mu \) on \(\mathbb {R}^n\) with \(D:= \sup _{B_r(x)\subset {\mathbb R}^n}\frac{\mu (B_r(x))}{\omega _{n-1} r^{n-1}}\) and \(1\le p <n\),

$$\begin{aligned} \int _{{\mathbb R}^n}|\phi |^{\frac{p(n-1)}{n-p}}\, d\mu \le c(n,p) D\left( \int _{\mathbb {R}^n} |\nabla \phi |^p \, dx \right) ^{\frac{n-1}{n-p}} \end{aligned}$$
(2.11)

holds for \(\phi \in C^1 _c (\mathbb {R}^n)\).

See [36] and [49] for the proof in the case of \(p=1\). The above inequality for \(1< p<n\) may be derived by the Hölder and Sobolev inequalities.
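One possible route from the case \(p=1\) to the case \(1<p<n\) is sketched below; it is offered as a plausible reading of the remark above and not as the authors' own argument. Write \(s:=\frac{p(n-1)}{n-p}>1\), \(p^{\prime }:=\frac{p}{p-1}\) and \(p^*:=\frac{np}{n-p}\), and note that \((s-1)p^{\prime }=p^*\). Applying (2.11) with \(p=1\) to the \(C^1_c\) function \(|\phi |^s\) and then using the Hölder and Sobolev inequalities gives

$$\begin{aligned} \int _{{\mathbb R}^n}|\phi |^{s}\, d\mu&\le c(n,1)\, D\, s\int _{{\mathbb R}^n}|\phi |^{s-1}|\nabla \phi |\, dx \le c(n,1)\, D\, s \left( \int _{{\mathbb R}^n}|\phi |^{p^*}\, dx\right) ^{\frac{1}{p^{\prime }}} \left( \int _{{\mathbb R}^n}|\nabla \phi |^{p}\, dx\right) ^{\frac{1}{p}}\\&\le c(n,p)\, D \left( \int _{{\mathbb R}^n}|\nabla \phi |^{p}\, dx\right) ^{\frac{1}{p}\left( \frac{p^*}{p^{\prime }}+1\right) } = c(n,p)\, D \left( \int _{{\mathbb R}^n}|\nabla \phi |^{p}\, dx\right) ^{\frac{n-1}{n-p}}, \end{aligned}$$

where the last equality uses \(\frac{p^*}{p^{\prime }}+1=\frac{n(p-1)}{n-p}+1=s\) and \(\frac{s}{p}=\frac{n-1}{n-p}\).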

Suppose that we have (2.8). We only need to define u as a function in \(L_{loc}^2(\Vert V_t\Vert \times dt)\) to make sense of (2.9) and (2.10). Since \(W^{1,p^{\prime }}_{loc}\subset W^{1,p}_{loc}\) if \(p^{\prime }>p\), we need to consider only \(1\le p<n\). Using the Hölder inequality and (2.11), we obtain (with \(D:= \sup _{B_r(x)\subset \Omega }\frac{\Vert V_t\Vert (B_r(x))}{\omega _{n-1} r^{n-1}}\))

$$\begin{aligned} \int _{\Omega } |\phi | ^2 \, d\Vert V_t\Vert&\le \left( \int _{\Omega } |\phi |^{\frac{p(n-1)}{n-p}}\, d\Vert V_t\Vert \right) ^{\frac{2(n-p)}{p(n-1)}} (\Vert V_t\Vert (\mathrm{spt}\,\phi ))^{\frac{pn+p-2n}{p(n-1)}} \\&\le (c(n,p) D)^{\frac{2(n-p)}{p(n-1)}} \left( \int _{\Omega } |\nabla \phi |^p\, dx\right) ^{\frac{2}{p}}(\Vert V_t\Vert (\mathrm{spt}\,\phi ))^{\frac{pn+p-2n}{p(n-1)}} \end{aligned}$$
(2.12)

for \(\phi \in C^1_c(\Omega )\). Here, we also need to assume that

$$\begin{aligned} p\ge \frac{2n}{n+1} \end{aligned}$$
(2.13)

so that \(\frac{p(n-1)}{n-p}\ge 2\). Since we will assume (2.14) in the next subsection, which implies \(p>\frac{n}{2}\) in particular, (2.13) will be relevant only for \(n=2\) and we will assume \(p\ge \frac{4}{3}\) when \(n=2\). With this restriction, we may define u as an \(L^2_{loc}(\Vert V_t\Vert \times dt)\) function on \(\Omega \times [0,T]\) uniquely as long as \(u\in L^2_{loc}([0,\infty );( W_{loc}^{1,p} (\Omega ))^n)\) by the standard density argument. The function u in (2.9) and (2.10) is defined in this sense.
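The exponent bookkeeping in (2.11) to (2.13) is easy to get wrong by hand, so a small exact-arithmetic sketch may be useful. The helper name `trace_exponents` and the dictionary keys are hypothetical and chosen only for this illustration; the formulas are those appearing in the text.

```python
from fractions import Fraction


def trace_exponents(n, p):
    """Exponents from (2.11)-(2.13); pass ints or Fractions with 1 <= p < n."""
    n, p = Fraction(n), Fraction(p)
    if not (1 <= p < n):
        raise ValueError("requires 1 <= p < n")
    s = p * (n - 1) / (n - p)  # integrability exponent in (2.11)
    holder_main = 2 / s  # power on the L^s(||V_t||) integral in (2.12)
    holder_rest = (p * n + p - 2 * n) / (p * (n - 1))  # power on ||V_t||(spt phi)
    threshold = 2 * n / (n + 1)  # right-hand side of (2.13)
    return {
        "s": s,
        "holder_main": holder_main,
        "holder_rest": holder_rest,
        "threshold": threshold,
        # (2.13) holds exactly when s >= 2, i.e. when holder_rest >= 0.
        "admissible": p >= threshold,
    }
```

For instance, `trace_exponents(2, Fraction(4, 3))` gives \(s=2\) with a vanishing power on \(\Vert V_t\Vert (\mathrm{spt}\,\phi )\), which is the borderline case \(p=\frac{4}{3}\) for \(n=2\), and the two Hölder powers always sum to 1.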

2.4 Main results

First we present some existence result for (1.2) when given a vector field u and an initial hypersurface \(M_0\).

Theorem 2.2

Suppose \(n\ge 2\),

$$\begin{aligned} 2<q<\infty ,\ \ \frac{nq}{2(q-1)}<p<\infty \ \ \left( \ \frac{4}{3}\le p\; \text{ in } \text{ addition } \text{ if } \;n=2\right) \end{aligned}$$
(2.14)

and \(\Omega ={\mathbb R}^n \text{ or } \mathbb {T}^n\). Given any

$$\begin{aligned} u\in L^q_{loc}([0,\infty ) ;(W^{1,p} (\Omega ))^n ) \end{aligned}$$
(2.15)

and a non-empty bounded domain \(\Omega _0 \subset \Omega \) with \(C^1\) boundary \(M_0=\partial \Omega _0\), there exist

  1. (1)

    a family of varifolds \(\{V_t\}_{t\ge 0}\subset \mathbf{V}_{n-1}(\Omega )\) which is a generalized solution of (1.2) as in Definition 2.1 with \(\Vert V_0\Vert ={\mathcal H}^{n-1}\lfloor _{M_0}\) and

  2. (2)

    a function \(\varphi \in BV _{loc} (\Omega \times [0,\infty )) \cap C^{\frac{1}{2}} _{loc} ([0,\infty );L^1 (\Omega ))\) with the following properties.

    1. (2a)

      \(\varphi (\cdot ,t)\) is a characteristic function for all \(t\in [0,\infty )\),

    2. (2b)

      \(\Vert \nabla \varphi (\cdot ,t)\Vert (\phi )\le \Vert V_t\Vert (\phi )\) for all \(t\in [0,\infty )\) and \(\phi \in C_c(\Omega ;{\mathbb R}^+)\),

    3. (2c)

      \(\varphi (\cdot ,0) = \chi _{\Omega _0}\) a.e. on \(\Omega \),

    4. (2d)

      writing \(\Vert V_t\Vert =\theta {\mathcal H}^{n-1}\lfloor _{M_t}\) and \(\Vert \nabla \varphi (\cdot ,t)\Vert ={\mathcal H}^{n-1}\lfloor _{\tilde{M}_t}\) for a.e. \(t>0\), we have

      $$\begin{aligned} {\mathcal H}^{n-1}(\tilde{M}_t{\setminus } M_t)=0 \end{aligned}$$
      (2.16)

      and

      $$\begin{aligned} \theta (x,t)=\left\{ \begin{array}{ll} \text{ even } \text{ integer } \ge 2 &{}\quad \text{ if } \, x\in M_t{\setminus } \tilde{M}_t, \\ \text{ odd } \text{ integer } \ge 1 &{}\quad \text{ if } \, x\in \tilde{M}_t \end{array} \right. \end{aligned}$$
      (2.17)

      for \({\mathcal H}^{n-1}\) a.e. \(x\in M_t\).

  3. (3)

    If \(p<n\), then for any \(T>0\), setting \(s:=\frac{p(n-1)}{n-p}\), we have

    $$\begin{aligned} \left( \int _0^T \left( \int _{\Omega } |u|^{s}\, d\Vert V_t\Vert \right) ^{\frac{q}{s}}\, dt\right) ^{\frac{1}{q}}<\infty . \end{aligned}$$
    (2.18)

    If \(p=n\), then we have (2.18) locally for \(U\subset \subset \Omega \) for any \(2\le s<\infty \) and if \(p>n\), then we have (2.18) with \(L^s\) norm replaced by \(C^{1-\frac{n}{p}}\) norm on \(\Omega \).

  4. (4)

    There exists \(T_1>0\) such that \(V_t\) has unit density for a.e. \(t\in [0,T_1)\). In addition \(\Vert \nabla \varphi (\cdot ,t) \Vert = \Vert V_t\Vert \) for a.e. \(t\in [0,T_1)\).

The condition (2.14) on u is a dimensionally sharp condition in the following sense. Consider a natural parabolic change of variables \(\tilde{x}:=\frac{x}{\lambda }\) and \(\tilde{t}:=\frac{t}{\lambda ^2}\) with \(\lambda >0\). Since u is a velocity field, it should behave just like x/t, thus it is natural to consider \(\tilde{u}:=\lambda u\). Then we have

$$\begin{aligned} \left( \int _{0}^{\infty }\left( \int _{{\mathbb R}^n}|\nabla u|^p\, dx\right) ^{\frac{q}{p}}\, dt\right) ^{\frac{1}{q}}=\lambda ^{ \frac{n}{p}+\frac{2}{q}-2} \left( \int _{0}^{\infty } \left( \int _{{\mathbb R}^n}|\nabla \tilde{u}|^p\, d\tilde{x}\right) ^{\frac{q}{p}}\, d\tilde{t}\right) ^{\frac{1}{q}} \end{aligned}$$

and \(\frac{n}{p}+\frac{2}{q}-2<0\) is equivalent to the second inequality in (2.14). This guarantees that u locally behaves more like a perturbative term. In (3), if \(p>n\), then the result follows from the standard Sobolev inequality on \({\mathbb R}^n\).
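For completeness, the power of \(\lambda \) in the identity above can be traced term by term. The sketch below assumes the reading \(\tilde{u}(\tilde{x},\tilde{t})=\lambda \, u(\lambda \tilde{x},\lambda ^2\tilde{t})\), so that \(\nabla _x u=\lambda ^{-2}\nabla _{\tilde{x}}\tilde{u}\), \(dx=\lambda ^n d\tilde{x}\) and \(dt=\lambda ^2 d\tilde{t}\):

$$\begin{aligned} \left( \int _{{\mathbb R}^n}|\nabla u|^p\, dx\right) ^{\frac{1}{p}}&=\lambda ^{\frac{n}{p}-2}\left( \int _{{\mathbb R}^n}|\nabla \tilde{u}|^p\, d\tilde{x}\right) ^{\frac{1}{p}},\\ \left( \int _0^{\infty }\left( \int _{{\mathbb R}^n}|\nabla u|^p\, dx\right) ^{\frac{q}{p}}\, dt\right) ^{\frac{1}{q}}&=\lambda ^{\frac{2}{q}}\cdot \lambda ^{\frac{n}{p}-2}\left( \int _0^{\infty }\left( \int _{{\mathbb R}^n}|\nabla \tilde{u}|^p\, d\tilde{x}\right) ^{\frac{q}{p}}\, d\tilde{t}\right) ^{\frac{1}{q}}. \end{aligned}$$

The equivalence with (2.14) then follows from \(\frac{n}{p}<2-\frac{2}{q}=\frac{2(q-1)}{q}\), that is, \(p>\frac{nq}{2(q-1)}\).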

To understand what \(V_t\) and \(\varphi \) are, assume for a moment that no singular behaviors occur and we have a smooth family \(\{M_t\}_{t\ge 0}\) with the velocity given by (1.2). Then we should have \(\mathrm{spt}\, \Vert V_t\Vert =\partial \{\varphi (\cdot ,t)=1\}=M_t\). Since (1.2) is stated in terms of \(V_t\), it may first appear that \(\varphi \) is redundant. However, besides the fact that \(\varphi \) is obtained naturally from the approach of the present paper, it has a few important roles. First, \(\varphi \) helps to guarantee that \(V_t\) is non-trivial. Since \(\varphi (\cdot ,t)\) is continuous in \(L^1(\Omega )\) by (2), \(\Vert \varphi (\cdot ,t)\Vert _{L^1(\Omega )}\) cannot vanish instantaneously at some arbitrary time. As long as \(\varphi (\cdot ,t)\) is not identically zero or identically 1, \(\Vert V_t\Vert \) is a non-zero measure. Note that, given arbitrary \(t_0>0\), by re-defining \(V_t:= 0\) for all \(t>t_0\), we obtain another generalized solution of (1.2) due to the inequality in (2.10). Obviously, this is not a solution we would like to obtain in the end. The second role of \(\varphi \) is that it gives some restriction on the possible singularities of \(\mathrm{spt}\,\Vert V_t\Vert \). For example, consider the case \(n=2\). One can see that a unit density \(V_t\) cannot form a triple junction since \(\partial \{\varphi (\cdot ,t)=1\}\) cannot be a triple junction. Thus, having \(\varphi \) as an auxiliary object may be a useful tool to obtain some better regularity results. As for the actual occurrence of the higher multiplicities, Bronsard and Stoth [8] showed that one can have a solution with \(\theta \ge 2\) for a limit of the Allen-Cahn equation, thus we may indeed have such a solution in general.

We next state the regularity property of \(\mathrm{spt}\,\Vert V_t\Vert \), which is obtained as an application of [30, 46]. To state the result, we recall some definitions from there.

Definition 2.2

A point \(x\in \mathrm{spt}\,\Vert V_t\Vert \) is said to be a \(C^{1,\zeta }\) regular point if there exists some open neighborhood O in \({\mathbb R}^{n+1}\) containing \((x,t)\) such that \(O\cap \cup _{s>0} (\mathrm{spt}\,\Vert V_s\Vert \times \{s\})\) is an embedded n-dimensional manifold with \(C^{1,\zeta }\) regularity in space and \(C^{(1+\zeta )/2}\) regularity in time. Similarly, we define a \(C^{2,\alpha }\) regular point by replacing the respective regularities by \(C^{2,\alpha }\) in space and \(C^{1,\alpha /2}\) in time.

Theorem 2.3

Let \(\{V_t\}_{t\ge 0}\) be as in Theorem 2.2.

  1. (1)

    Suppose that there exist an open set \(U\subset \Omega \) and an interval \((t_1,t_2)\) such that \(V_t\) is unit density in U for a.e. \(t\in (t_1,t_2)\). Then for a.e. \(t\in (t_1,t_2)\), there exists a closed set \(G_t\subset U\) with \({\mathcal H}^{n-1}(G_t)=0\) such that \((U\cap \mathrm{spt}\, \Vert V_t\Vert ){\setminus } G_t\) is a set of \(C^{1,\zeta }\) regular points where \(\zeta := 2-\frac{n}{p}-\frac{2}{q}\) if \(p<n\). If \(p\ge n\), one may take any \(\zeta \) with \(0<\zeta <1-\frac{2}{q}\).

  2. (2)

    There exists \(T_2>0\) such that every point of \(\mathrm{spt}\, \Vert V_t\Vert \) is a \(C^{1,\zeta }\) regular point for all \(t\in (0,T_2)\) (that is, \(G_t=\emptyset \)), where \(\zeta \) is as in (1).

  3. (3)

    If u is Hölder continuous with exponent \(\alpha \) in the parabolic sense, i.e.,

    $$\begin{aligned} \sup _{\Omega \times [0,T]}|u|+\sup _{x,y\in \Omega , 0\le t_1<t_2\le T}\frac{|u(x,t_1)-u(y,t_2)|}{\max \{ |x-y|^{\alpha }, |t_1-t_2|^{\alpha /2}\}}<\infty \end{aligned}$$

    for all \(0<T<\infty \), then the same results for (1) and (2) hold true with \(C^{1,\zeta }\) there replaced by \(C^{2,\alpha }\) and (1.2) is satisfied pointwise.

  4. (4)

    We have \(\lim _{t\downarrow 0} t^{-\frac{1}{2}} \mathrm{dist}\, (M_0, \mathrm{spt}\, \Vert V_t\Vert )=0\) and \(\mathrm{spt}\, \Vert V_t\Vert \) converges to \(M_0\) in \(C^1\) topology as \(t\downarrow 0\). Namely, given \(\varepsilon >0\) there exists a finite number of sets \(\{U_i=x_i+O_i(B_r ^{n-1}\times (-r,r))\}_{i=1}^N\), where \(O_i\) is an orthogonal rotation and \(x_i\in M_0\), such that \(M_0\subset \cup _{i=1}^N U_i\), and \(C^1\) norms of difference of graphs representing \(M_0\) and \(\mathrm{spt}\, \Vert V_t\Vert \) over \(x_i+O_i(B_r^{n-1})\) in \(U_i\) are less than \(\varepsilon \) for all sufficiently small \(t>0\).

The claim (1) says that wherever \(V_t\) is unit density in some space-time neighborhood, \(\mathrm{spt}\,\Vert V_t\Vert \) is locally a hypersurface with regularity of \(C^{1,\zeta }\) in space and \(C^{(1+\zeta )/2}\) in time, almost everywhere in space and time. We can guarantee by (2) that there is some time interval \((0,T_2)\) on which \(\mathrm{spt}\,\Vert V_t\Vert \) is a \(C^{1,\zeta }\) hypersurface. We obtain a lower bound on \(T_2\) in terms of \(M_0\) and the norm of u. On the other hand, \(T_2\) may be much larger than the lower bound and it is the time when a non-\(C^{1,\zeta }\) regular point occurs for the first time. In general, \(T_2\le T_1\) and it is plausible that some non-\(C^{1,\zeta }\) regular point first appears at \(T_2\) but \(V_t\) may remain unit density for some more time. The claim (4) shows that \(\mathrm{spt}\,\Vert V_t\Vert \) has \(C^1\) uniform regularity and convergence as \(t\downarrow 0\). As for (3), we first note that we can show the same existence results for Hölder continuous u (and not in \(L^q_{loc}([0,\infty );(W^{1,p}(\Omega ))^n)\)) as in Theorem 2.2. In fact the proof is simpler if u is bounded. \(C^{2,\alpha }\) regularity allows one to have a pointwise mean curvature vector and velocity vector of \(\mathrm{spt}\, \Vert V_t\Vert \), and (1.2) is satisfied pointwise. At this point, we reach a well-defined PDE setting, and \(\mathrm{spt}\,\Vert V_t\Vert \) is as regular as what the standard parabolic regularity theory shows depending on any additional regularity assumption imposed on u.
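The Hölder exponent \(\zeta \) in Theorem 2.3 (1) is exactly the negative of the scaling exponent \(\frac{n}{p}+\frac{2}{q}-2\) discussed after Theorem 2.2, so for \(p<n\) the condition (2.14) is what makes \(\zeta \) positive. The following small sketch makes this relation checkable in exact arithmetic; the function names are hypothetical, and the second return value of `holder_exponent` is an illustrative flag indicating whether the returned number is the exponent itself (case \(p<n\)) or only an open upper bound (case \(p\ge n\)).

```python
from fractions import Fraction


def scaling_exponent(n, p, q):
    """Power of lambda in the parabolic rescaling: n/p + 2/q - 2."""
    n, p, q = Fraction(n), Fraction(p), Fraction(q)
    return n / p + 2 / q - 2


def is_admissible(n, p, q):
    """Check condition (2.14) for finite p and q (pass ints or Fractions)."""
    n, p, q = Fraction(n), Fraction(p), Fraction(q)
    if n < 2 or q <= 2:
        return False
    if p <= n * q / (2 * (q - 1)):
        return False
    # Extra requirement in dimension two.
    if n == 2 and p < Fraction(4, 3):
        return False
    return True


def holder_exponent(n, p, q):
    """Return (value, attained) following Theorem 2.3 (1).

    For p < n the value is zeta itself; for p >= n any zeta strictly
    below the returned value is allowed, so attained is False.
    """
    if not is_admissible(n, p, q):
        raise ValueError("(p, q) does not satisfy (2.14)")
    n, p, q = Fraction(n), Fraction(p), Fraction(q)
    if p < n:
        return 2 - n / p - 2 / q, True
    return 1 - 2 / q, False
```

For example, with \(n=3\), \(p=\frac{5}{2}\) and \(q=4\) one gets \(\zeta =\frac{3}{10}\), while the borderline pair \(p=2\), \(q=4\) has scaling exponent 0 and is excluded by the strict inequality in (2.14).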

3 Allen-Cahn equation with transport term

As stated in the introduction, the method of proof for the existence is to approximate (1.2) by the Allen-Cahn equation with an extra transport term coming from u. Throughout the paper, we assume that a function W satisfies the following:

$$\begin{aligned} W:\mathbb {R}\rightarrow [0,\infty ) \text{ is } C^3\quad \text{ and } \quad W(\pm 1)=W^{\prime }(\pm 1)=0. \end{aligned}$$
(3.1)
$$\begin{aligned} \text{ For } \text{ some } \gamma \in (-1,1), W^{\prime }<0\; \text{ on } \;(\gamma ,1)\quad \text{ and } \quad W^{\prime }>0\; \text{ on } \;(-1,\gamma ). \end{aligned}$$
(3.2)
$$\begin{aligned} \text{ For } \text{ some } \alpha \in (0,1)\quad \text{ and } \quad \kappa >0, W^{\prime \prime }(x)\ge \kappa \; \text{ for } \text{ all } \,1\ge |x|\ge \alpha . \end{aligned}$$
(3.3)

We also define a constant

$$\begin{aligned} \sigma :=\int _{-1}^1 \sqrt{2W(s)}\, ds. \end{aligned}$$
(3.4)

Roughly speaking, the above assumptions require W to be W-shaped with two non-degenerate minima at \(\pm 1\). Requiring (3.2) may appear non-essential, but it is used in an essential way in deriving an upper bound for \(\xi _{\varepsilon }\) in Lemma 4.2. Any W satisfying the above can be used. The reader can take a concrete example such as \(W(s)=(1-s^2)^2\) in the following.
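For the concrete example \(W(s)=(1-s^2)^2\), the conditions (3.1) to (3.3) and the constant (3.4) can be checked directly. The sketch below does this numerically; the particular choices \(\gamma =0\), \(\alpha =0.8\) and \(\kappa =W^{\prime \prime }(0.8)\) are assumptions made for this illustration (many other choices work), and the closed form \(\sigma =\frac{4\sqrt{2}}{3}\) follows from \(\sqrt{2W(s)}=\sqrt{2}(1-s^2)\) on \([-1,1]\).

```python
import numpy as np

GAMMA = 0.0   # illustrative choice of gamma in (3.2)
ALPHA = 0.8   # illustrative choice of alpha in (3.3)


def W(s):
    """Double-well potential W(s) = (1 - s^2)^2."""
    return (1.0 - s**2) ** 2


def dW(s):
    """First derivative W'(s)."""
    return -4.0 * s * (1.0 - s**2)


def d2W(s):
    """Second derivative W''(s)."""
    return 12.0 * s**2 - 4.0


KAPPA = d2W(ALPHA)  # W'' is increasing in |s|, so this bounds W'' on ALPHA <= |s| <= 1


def sigma(samples=200001):
    """Approximate (3.4) by the composite trapezoidal rule on [-1, 1]."""
    s = np.linspace(-1.0, 1.0, samples)
    f = np.sqrt(2.0 * W(s))
    ds = s[1] - s[0]
    return float(ds * (f.sum() - 0.5 * (f[0] + f[-1])))
```

The value returned by `sigma()` should agree with \(\frac{4\sqrt{2}}{3}\) up to quadrature error.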

Given u and \(M_0\) as in Theorem 2.2, the whole scheme of the present paper is to approximate the motion law (1.2) by

$$\begin{aligned} \partial _t \varphi _{\varepsilon } + u_{\varepsilon }\cdot \nabla \varphi _{\varepsilon }=\Delta \varphi _{\varepsilon }-\frac{W^{\prime }(\varphi _{\varepsilon })}{\varepsilon ^2}, \end{aligned}$$
(3.5)

where \(\varepsilon >0\) is a small parameter tending to 0 and \(u_{\varepsilon }\) is a smooth approximation of u. For readers who are not familiar with the Allen-Cahn equation, we give a quick heuristic argument. Assume that u is smooth and that we have a family of domains \(\Omega _t\) with smooth boundaries \(M_t=\partial \Omega _t\). Let \(d(\cdot ,t)\) be the signed distance function to \(M_t\) so that \(d(\cdot ,t)>0\) inside of \(\Omega _t\). We let \(\Psi \,:\, {\mathbb R}\rightarrow (-1,1)\) be an ODE solution of \(\Psi ^{\prime \prime }=W^{\prime }(\Psi )\) with \(\lim _{x\rightarrow \pm \infty }\Psi (x)=\pm 1\). Such a solution exists and we may assume \(\Psi (0)=0\). If we postulate that \(\varphi _{\varepsilon }(x,t)\approx \Psi (d(x,t)/\varepsilon )\) and \(\varphi _{\varepsilon }\) satisfies (3.5), then we expect that

$$\begin{aligned} \Psi ^{\prime } \partial _t d+u_{\varepsilon }\cdot \Psi ^{\prime } \nabla d\approx \Psi ^{\prime } \Delta d+\varepsilon ^{-1}(\Psi ^{\prime \prime }|\nabla d|^2-W^{\prime }(\Psi )). \end{aligned}$$
(3.6)
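
The relation (3.6) can be traced back to the chain rule. The following short computation is a sketch under the simplifying assumption that \(\varphi _{\varepsilon }=\Psi (d/\varepsilon )\) holds exactly, with \(\Psi \) and its derivatives evaluated at \(d/\varepsilon \):

$$\begin{aligned} \partial _t\varphi _{\varepsilon }=\frac{\Psi ^{\prime }}{\varepsilon }\,\partial _t d,\quad \nabla \varphi _{\varepsilon }=\frac{\Psi ^{\prime }}{\varepsilon }\,\nabla d,\quad \Delta \varphi _{\varepsilon }=\frac{\Psi ^{\prime }}{\varepsilon }\,\Delta d+\frac{\Psi ^{\prime \prime }}{\varepsilon ^2}\,|\nabla d|^2. \end{aligned}$$

Inserting these expressions into (3.5) and multiplying by \(\varepsilon \) gives (3.6).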

Since d is a distance function, \(|\nabla d|=1\), and the last two terms cancel each other. This leaves

$$\begin{aligned} \partial _t d+u_{\varepsilon }\cdot \nabla d\approx \Delta d. \end{aligned}$$
(3.7)

Due to the nature of the distance function, evaluated on \(M_t\), \(\partial _t d\) is the outward velocity of \(M_t\), \(u_{\varepsilon }\cdot \nabla d\) is the inward normal component of \(u_{\varepsilon }\) and \(\Delta d\) is the mean curvature of \(M_t\). As \(\varepsilon \rightarrow 0\), this approximation may be expected to improve, and the relation (3.7) suggests that \(\{\varphi _{\varepsilon }(\cdot ,t)=0\}\) should converge to \(M_t\) moving by (1.2). This heuristic argument may be justified if we know in advance that there exists a smooth \(M_t\) moving by (1.2). Here, however, u is not smooth and we aim to obtain a time-global existence result, which necessitates a framework inclusive of singularities. This is the reason for using the language of varifolds in this paper, as was first done by Ilmanen [28]. The basic approach is to prove that \(\varphi _{\varepsilon }\) satisfying (3.5) has the property that

$$\begin{aligned} \mu ^{\varepsilon }:= \left( \frac{\varepsilon |\nabla \varphi _{\varepsilon }|^2}{2}+\frac{W(\varphi _{\varepsilon })}{\varepsilon }\right) \, dx\approx \sigma N(x,t) {\mathcal H}^{n-1}\lfloor _{M_t} \end{aligned}$$
(3.8)

when \(\varepsilon \) is small, where \(N(x,t)\) is some integer. At the same time we prove that the limiting measure of \(\mu ^{\varepsilon }\) satisfies (2.10). The first key estimate to be established is the analogue of (2.8) for \(\varphi _{\varepsilon }\), which will be discussed in the next section.
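
The appearance of the constant \(\sigma \) in (3.8) can be motivated by a one-dimensional computation. The following is a heuristic sketch, assuming \(W(\pm 1)=0\) (as in the example \(W(s)=(1-s^2)^2\)), that \(\Psi \) is increasing, and that \(\Psi ^{\prime }\rightarrow 0\) at \(\pm \infty \). Multiplying \(\Psi ^{\prime \prime }=W^{\prime }(\Psi )\) by \(\Psi ^{\prime }\) shows that \(\frac{1}{2}(\Psi ^{\prime })^2-W(\Psi )\) is constant, and the constant is 0 by the behavior at infinity. Hence \(\Psi ^{\prime }=\sqrt{2W(\Psi )}\) and

$$\begin{aligned} \int _{{\mathbb R}}\left( \frac{(\Psi ^{\prime })^2}{2}+W(\Psi )\right) \, dz=\int _{{\mathbb R}}\sqrt{2W(\Psi )}\,\Psi ^{\prime }\, dz=\int _{-1}^1\sqrt{2W(s)}\, ds=\sigma . \end{aligned}$$

With \(\varphi _{\varepsilon }\approx \Psi (d/\varepsilon )\) and \(z=d/\varepsilon \), the left-hand side is the integral of the energy density in (3.8) along the normal direction to \(M_t\), so each sheet of the interface is expected to carry energy \(\sigma \) per unit area. For \(W(s)=(1-s^2)^2\), one has explicitly \(\Psi (x)=\tanh (\sqrt{2}\,x)\).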

4 Density ratio upper bound and energy monotonicity formula

In this section, we prove the upper density ratio bound for the diffuse interface energy and the energy monotonicity formula, which are crucial in the limiting process. Estimates in this section are similar to [33, Sect. 3] with some modifications.

4.1 The upper density ratio bound

We state the main theorem concerning the density ratio upper bound, uniform in \(\varepsilon \), for the Allen-Cahn equation with an extra transport term. The proof takes up the entire Sect. 4. Along the way, we establish a monotonicity formula which is a perturbed version of Ilmanen’s monotonicity formula for the Allen-Cahn equation (and of Huisken’s monotonicity formula for the MCF [26]).

Theorem 4.1

Suppose \(n\ge 2\), \(\Omega ={\mathbb T}^n\) or \({\mathbb R}^n\), p, q satisfy (2.14),

$$\begin{aligned} 0<\beta <\frac{1}{2}, \end{aligned}$$
(4.1)

\(0<\varepsilon <1\) and \(\varphi \) satisfies

$$\begin{aligned} \partial _t\varphi +u\cdot \nabla \varphi =\Delta \varphi -\frac{W^{\prime }(\varphi )}{\varepsilon ^2}\quad \text{ on } \,\Omega \times [0,T], \end{aligned}$$
(4.2)
$$\begin{aligned} \varphi (x,0)=\varphi _0(x)\quad \text{ on } \,\Omega . \end{aligned}$$
(4.3)

Assume \(u\in C^{\infty }_c(\Omega \times [0,T])\), \(\nabla ^j\varphi ,\, \partial _t\nabla ^k\varphi \in C(\Omega \times [0,T])\) for \(k\in \{0,1\}\) and \(j\in \{0,1,2,3\}\). Let \(\mu _t^{\varepsilon }\) be a Radon measure on \(\Omega \) defined by

$$\begin{aligned} \int _{\Omega }\phi (x)\, d\mu ^{\varepsilon }_t(x):=\int _{\Omega }\phi (x)\left( \frac{\varepsilon |\nabla \varphi (x,t)|^2}{2} +\frac{W(\varphi (x,t))}{\varepsilon }\right) \, dx \end{aligned}$$
(4.4)

for \(\phi \in C_c(\Omega )\) and define

$$\begin{aligned} D(t) :=\max \left\{ 1, \mu _t^{\varepsilon }(\Omega ), \sup _{B_r(x)\subset \Omega } \frac{\mu ^{\varepsilon }_t (B_r (x))}{\omega _{n-1} r^{n-1}}\right\} , \quad t\in [0,T]. \end{aligned}$$
(4.5)

Assume

$$\begin{aligned} \sup _{\Omega \times [0,T]}|\varphi |\le 1, \end{aligned}$$
(4.6)
(4.7)
$$\begin{aligned} \lim _{R\rightarrow \infty } R^k \Vert \varphi +1\Vert _{C^2(({\mathbb R}^n{\setminus } B_R)\times [0,T])}=0 \,\,\textit{for any}\, k\in {\mathbb N}\,\textit{in case}\,\Omega ={\mathbb R}^n, \end{aligned}$$
(4.8)
$$\begin{aligned} \sup _{\Omega } \left( \frac{\varepsilon |\nabla \varphi _0 |^2}{2} -\frac{W(\varphi _0 )}{\varepsilon }\right) \le \varepsilon ^{-\beta }, \end{aligned}$$
(4.9)
$$\begin{aligned} \sup _{\Omega \times [0,T]}|u|\le \varepsilon ^{-\beta }, \sup _{\Omega \times [0,T]}|\nabla u|\le \varepsilon ^{-(\beta +1)}, \end{aligned}$$
(4.10)
(4.11)

and

$$\begin{aligned} D(0)\le D_0. \end{aligned}$$
(4.12)

Then there exist \(D_1\) and \(\epsilon _1>0\) such that

$$\begin{aligned} \sup _{t\in [0,T]} D(t)\le D_1 \end{aligned}$$
(4.13)

as long as \(\varepsilon <\epsilon _1\).

Remark 4.1

If \(u=0\), \(\mu ^{\varepsilon }_t(\Omega )\) is monotone decreasing, thus it is straightforward to conclude that \(\mu ^{\varepsilon }_t(\Omega )\) is bounded uniformly in \(\varepsilon \) if \(\mu ^{\varepsilon }_0(\Omega )\) is. The uniform density ratio bound may also be obtained from Ilmanen’s monotonicity formula. When \(u\ne 0\), however, it is non-trivial even to conclude that the total energy \(\mu ^{\varepsilon }_t(\Omega )\) up to time T has a bound uniform in \(\varepsilon \). We will see that we need the density ratio bound to estimate \(\mu ^{\varepsilon }_t(\Omega )\).

4.2 Monotonicity formula

In this subsection as a first step we obtain a modified monotonicity formula analogous to that of Ilmanen [28]. It is still not a very useful formula due to the possible negative contribution coming from \(\xi _\varepsilon \) defined below. We will show that the negative contribution is small when \(\varepsilon \) is small.

To localize the computations, fix a radially symmetric cut-off function

$$\begin{aligned} \eta (x) \in C_c ^\infty \left( B_{\frac{1}{2}}\right) \quad \text{ with } \quad \eta =1 \; \text{ on } \; B_{\frac{1}{4}}, \; 0\le \eta \le 1. \end{aligned}$$
(4.14)

Define

$$\begin{aligned} \tilde{\rho }_{(y,s)}(x,t):=\rho _{(y,s)}(x,t) \eta (x-y)=\frac{1}{(4\pi (s-t))^{\frac{n-1}{2}}} e^{-\frac{|x-y|^2}{4(s-t)}}\eta (x-y)\quad \end{aligned}$$
(4.15)

for \(t<s\) and \(x,y\in \Omega \) and define

$$\begin{aligned} e_\varepsilon :=\frac{\varepsilon |\nabla \varphi |^2}{2}+ \frac{W(\varphi )}{\varepsilon } ,\quad \xi _\varepsilon : =\frac{\varepsilon |\nabla \varphi |^2}{2}- \frac{W(\varphi )}{\varepsilon }. \end{aligned}$$
(4.16)

Proposition 4.1

Suppose that \(\varphi \) satisfies (4.2). With the notation of (4.4), (4.15), (4.16) and writing \({\tilde{\rho }}={\tilde{\rho }}_{(y,s)}(x,t)\), we have depending only on n such that

(4.17)

for \(y\in \Omega \), \(0<t<s<\infty \) and \(t<T\).

Proof

Define L by the first equality below; the second equality follows from (4.2):

$$\begin{aligned} L:=\partial _t\varphi + u\cdot \nabla \varphi = \Delta \varphi -\frac{W^{\prime }(\varphi )}{\varepsilon ^2} .\end{aligned}$$

By integration by parts we have

$$\begin{aligned} \frac{d}{dt} \int _{\Omega } e_{\varepsilon } \tilde{\rho }\, dx&= \int _{\Omega }\{ e_\varepsilon \partial _t \tilde{\rho }-\varepsilon (L-u\cdot \nabla \varphi )(\nabla \tilde{\rho }\cdot \nabla \varphi +\tilde{\rho }L) \} \, dx \nonumber \\&= \int _{\Omega } \left\{ e_{\varepsilon } \partial _t\tilde{\rho }\!-\!\varepsilon \tilde{\rho }\left( L\!+\!\frac{\nabla \tilde{\rho }\cdot \nabla \varphi }{\tilde{\rho }} \right) ^2 \!+\!\varepsilon \left( L\nabla \tilde{\rho }\cdot \nabla \varphi \!+\!\frac{(\nabla \tilde{\rho }\cdot \nabla \varphi )^2}{\tilde{\rho }} \right) \right. \nonumber \\&\quad \left. +\,\varepsilon \tilde{\rho }u \cdot \nabla \varphi \left( L+\frac{\nabla \tilde{\rho }\cdot \nabla \varphi }{\tilde{\rho }} \right) \right\} \, dx \nonumber \\&\le \int _{\Omega } \left\{ e_\varepsilon \partial _t\tilde{\rho }+\varepsilon \left( L\nabla \tilde{\rho }\cdot \nabla \varphi +\frac{(\nabla \tilde{\rho }\cdot \nabla \varphi )^2}{\tilde{\rho }} \right) +\frac{1}{4} \varepsilon \tilde{\rho }(u\cdot \nabla \varphi )^2 \right\} \, dx .\nonumber \\ \end{aligned}$$
(4.18)
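
The algebra behind the second and third lines of (4.18) is a completion of the square. As a sketch, write \(A:=\nabla \tilde{\rho }\cdot \nabla \varphi \) (a shorthand used only here) and work on \(\{\tilde{\rho }>0\}\). Expanding both sides shows

$$\begin{aligned} -(L-u\cdot \nabla \varphi )(A+\tilde{\rho }L)=-\tilde{\rho }\left( L+\frac{A}{\tilde{\rho }}\right) ^2+\left( LA+\frac{A^2}{\tilde{\rho }}\right) +\tilde{\rho }\, u\cdot \nabla \varphi \left( L+\frac{A}{\tilde{\rho }}\right) , \end{aligned}$$

and the inequality follows from \(-a^2+ab\le \frac{1}{4}b^2\) applied with \(a=L+A/\tilde{\rho }\) and \(b=u\cdot \nabla \varphi \):

$$\begin{aligned} -\tilde{\rho }\left( L+\frac{A}{\tilde{\rho }}\right) ^2+\tilde{\rho }\, u\cdot \nabla \varphi \left( L+\frac{A}{\tilde{\rho }}\right) \le \frac{1}{4}\tilde{\rho }\,(u\cdot \nabla \varphi )^2. \end{aligned}$$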

Moreover by integration by parts we obtain

$$\begin{aligned} \int _{\Omega } \varepsilon L \nabla \tilde{\rho }\cdot \nabla \varphi \, dx = \int _{\Omega } -\varepsilon (\nabla \varphi \otimes \nabla \varphi ) \cdot \nabla ^2 \tilde{\rho }+ e_{\varepsilon } \Delta \tilde{\rho }\, dx. \end{aligned}$$
(4.19)

Substitution of (4.19) into (4.18) gives

$$\begin{aligned} \frac{d}{dt}\int _{\Omega }e_{\varepsilon }\tilde{\rho }\, dx\le & {} \int _{\Omega } (-\xi _{\varepsilon })(\partial _t\tilde{\rho }+\Delta \tilde{\rho }) +\varepsilon |\nabla \varphi |^2\left( \partial _t \tilde{\rho }+\Delta \tilde{\rho }-\frac{\nabla \varphi \otimes \nabla \varphi }{|\nabla \varphi |^2} \cdot \nabla ^2 \tilde{\rho }\right. \nonumber \\&\left. + \frac{(\nabla \tilde{\rho }\cdot \nabla \varphi )^2}{\tilde{\rho }|\nabla \varphi |^2} \right) +\frac{1}{4} \varepsilon \tilde{\rho }(u\cdot \nabla \varphi )^2\, dx.\qquad \quad \end{aligned}$$
(4.20)

We remark that \(\rho \) (without multiplication by \(\eta \)) satisfies the following:

$$\begin{aligned} \partial _t \rho +\Delta \rho = -\frac{\rho }{2(s-t)} ,\quad \partial _t \rho +\Delta \rho -\frac{\nabla \varphi \otimes \nabla \varphi }{|\nabla \varphi |^2} \cdot \nabla ^2 \rho + \frac{(\nabla \rho \cdot \nabla \varphi )^2}{\rho |\nabla \varphi |^2} =0.\nonumber \\ \end{aligned}$$
(4.21)
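
Both identities in (4.21) can be verified by direct differentiation. In the following sketch, \(\tau :=s-t\), \(z:=x-y\) and \(\theta :=\nabla \varphi /|\nabla \varphi |\) are shorthands introduced only for this computation, and I denotes the identity matrix:

$$\begin{aligned} \partial _t\rho&=\rho \left( \frac{n-1}{2\tau }-\frac{|z|^2}{4\tau ^2}\right) ,\quad \nabla \rho =-\frac{\rho \, z}{2\tau },\\ \nabla ^2\rho&=\rho \left( \frac{z\otimes z}{4\tau ^2}-\frac{I}{2\tau }\right) ,\quad \Delta \rho =\rho \left( \frac{|z|^2}{4\tau ^2}-\frac{n}{2\tau }\right) . \end{aligned}$$

Adding the first and the last expressions gives \(\partial _t\rho +\Delta \rho =-\frac{\rho }{2\tau }\). Moreover,

$$\begin{aligned} (\theta \otimes \theta )\cdot \nabla ^2\rho =\rho \left( \frac{(z\cdot \theta )^2}{4\tau ^2}-\frac{1}{2\tau }\right) ,\quad \frac{(\nabla \rho \cdot \theta )^2}{\rho }=\rho \,\frac{(z\cdot \theta )^2}{4\tau ^2}, \end{aligned}$$

so the four terms in the second identity cancel exactly.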

When one computes (4.21) with \(\tilde{\rho }\) instead of \(\rho \), we have additional terms coming from differentiation of \(\eta \). The integration of these terms can be bounded by \(c \mu ^{\varepsilon } _t (B_{1/2}(y)) e^{-\frac{1}{128(s-t)}}\) for \(c=c(n)\) since \(|\nabla ^j\rho | \le c(j,n) e^{-\frac{1}{128(s-t)}}\) for any \(x,y\in \Omega \) with \(|x-y|>\frac{1}{4}\) and \(j=0,1\). Thus, with an appropriate choice of depending only on n, we obtain (4.17). \(\square \)

4.3 Some estimates on \(\Omega \times [0,T]\)

Lemma 4.1

Suppose that \(\varphi \) satisfies (4.2), (4.3), (4.6), (4.7) and (4.10). Then there exists depending only on such that

(4.22)

Proof

Take any domain \(B_{3\varepsilon }(x_0)\times [t_0,t_0+2\varepsilon ^2]\subset \Omega \times [0,T]\). Define \(\tilde{\varphi }(x,t):=\varphi (\varepsilon x+x_0,\varepsilon ^2 t+t_0)\) and \(\tilde{u} (x,t):=u (\varepsilon x+x_0,\varepsilon ^2 t+t_0)\) for \((x,t)\in B_3\times [0,2]\). By (4.2) we have

$$\begin{aligned} \partial _t \tilde{\varphi }+ \varepsilon \tilde{u} \cdot \nabla \tilde{\varphi }= \Delta \tilde{\varphi }- W^{\prime }(\tilde{\varphi }). \end{aligned}$$
(4.23)
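
For the reader's convenience, the scaling leading to (4.23) can be sketched as follows. With all derivatives of \(\varphi \) and u evaluated at \((\varepsilon x+x_0,\varepsilon ^2 t+t_0)\),

$$\begin{aligned} \partial _t\tilde{\varphi }=\varepsilon ^2\,\partial _t\varphi ,\quad \nabla \tilde{\varphi }=\varepsilon \,\nabla \varphi ,\quad \Delta \tilde{\varphi }=\varepsilon ^2\,\Delta \varphi , \end{aligned}$$

so that multiplying (4.2) by \(\varepsilon ^2\) gives (4.23). In addition, (4.10) and \(0<\varepsilon <1\) imply that the drift coefficient in (4.23) is bounded independently of \(\varepsilon \):

$$\begin{aligned} |\varepsilon \tilde{u}|\le \varepsilon ^{1-\beta }\le 1,\quad |\nabla (\varepsilon \tilde{u})|=\varepsilon ^2|\nabla u|\le \varepsilon ^{1-\beta }\le 1. \end{aligned}$$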

Using the estimate of [32, p. 342, Theorem 9.1], if \(\partial _t v-\Delta v = f\) on \(B_2\times [0,2]\) then we have

$$\begin{aligned} \Vert \partial _t v, \nabla ^2 v \Vert _{L^r(B_1\times [j,2])} \le c(n,r) \left( \Vert f,\nabla v,v\Vert _{L^r (B_2 \times [0,2])} +(1-j) \Vert v(\cdot ,0) \Vert _{W^{2,r}(B_2)}\right) \end{aligned}$$
(4.24)

for \(j=0\) (up to \(t=0\)) or \(j=1\) (interior estimate) and for \(r\in (1,\infty )\). Let \(\phi \in C_c ^1 (B_3)\) be a cut-off function and multiply (4.23) by \(\phi ^2 \tilde{\varphi }\); then by integration by parts, (4.6), (4.7) and (4.10), we have

$$\begin{aligned} \int _0 ^2 \int _{B_2} |\nabla \tilde{\varphi }|^2 \,dxdt \le c(W). \end{aligned}$$
(4.25)

Hence by (4.6), (4.7), (4.10), (4.24) (\(r=2\)) and (4.25) we obtain

By applying (4.24) to the equation

$$\begin{aligned} \partial _t (\tilde{\varphi }_{x_i}) -\Delta \tilde{\varphi }_{x_i} =-\varepsilon \tilde{u} _{x_i} \cdot \nabla \tilde{\varphi }-\varepsilon \tilde{u} \cdot \nabla \tilde{\varphi }_{x_i} - W^{\prime \prime }(\tilde{\varphi })\tilde{\varphi }_{x_i}, \end{aligned}$$

and using (4.6), (4.7) and (4.10) again, we obtain

Therefore we obtain the \(W^{1,2}\) estimates of \(\nabla \tilde{\varphi }\) on \(B_1 \times [0,2]\), and by the Sobolev inequality we have

We can use this estimate in (4.23) and apply (4.24) with \(r=\frac{2(n+1)}{n-1}\). We repeat this argument, with appropriate modifications of the domain, until r is large enough that \( W^{1,r}\subset C^{\frac{1}{2}}\). Then we obtain the desired estimate

Since the domain was arbitrary, after returning to the original coordinate system, we obtain (4.22). \(\square \)

Lemma 4.2

There exists such that, if and under the assumptions of (4.1)–(4.3), (4.6), (4.7), (4.9) and (4.10), we have

$$\begin{aligned} \frac{\varepsilon |\nabla \varphi |^2 }{2}-\frac{W(\varphi )}{\varepsilon } \le 10\varepsilon ^{-\beta } \quad \text {on}\; \Omega \times [0,T]. \end{aligned}$$
(4.26)

Proof

Rescale the domain by \(x\mapsto \frac{x}{\varepsilon }\) and \(t\mapsto \frac{t}{\varepsilon ^2}\). Under the change of variables, we continue to use the same notations for \(\varphi \) and u. Define

$$\begin{aligned} \xi :=\frac{|\nabla \varphi |^2}{2} -W(\varphi )-G(\varphi ), \end{aligned}$$
(4.27)

where G will be chosen later. We compute \(\partial _t \xi +\varepsilon u\cdot \nabla \xi -\Delta \xi \) and obtain

$$\begin{aligned}&\partial _t \xi +\varepsilon u\cdot \nabla \xi -\Delta \xi \nonumber \\&\quad = \nabla \varphi \cdot \nabla \partial _t \varphi -(W^{\prime }+G^{\prime }) \partial _t\varphi +\varepsilon (u\otimes \nabla \varphi )\cdot \nabla ^2 \varphi -\varepsilon (W^{\prime }+G^{\prime })u\cdot \nabla \varphi \nonumber \\&\quad \quad -|\nabla ^2 \varphi | ^2 -\nabla \varphi \cdot \nabla (\Delta \varphi ) +(W^{\prime } + G^{\prime }) \Delta \varphi +(W^{\prime \prime } +G^{\prime \prime }) |\nabla \varphi |^2 . \end{aligned}$$
(4.28)

Here and in what follows, we write \(W^{\prime }\) for \(W^{\prime }(\varphi )\), G for \(G(\varphi )\) and so forth for simplicity. Differentiate (4.23) with respect to \(x_j\), multiply by \(\varphi _{x_j}\) and sum over j to obtain

$$\begin{aligned}&\nabla \varphi \cdot \nabla \partial _t\varphi +\varepsilon \nabla u \cdot (\nabla \varphi \otimes \nabla \varphi )+\varepsilon (u\otimes \nabla \varphi ) \cdot \nabla ^2 \varphi \nonumber \\&\quad =\nabla \varphi \cdot \nabla (\Delta \varphi ) -W^{\prime \prime } |\nabla \varphi | ^2. \end{aligned}$$
(4.29)

By (4.23), (4.28) and (4.29) we have

$$\begin{aligned}&\partial _t \xi +\varepsilon u\cdot \nabla \xi -\Delta \xi = W^{\prime } (W^{\prime }+G^{\prime }) -|\nabla ^2 \varphi |^2\nonumber \\&\quad -\varepsilon \nabla u \cdot ( \nabla \varphi \otimes \nabla \varphi ) +G^{\prime \prime } |\nabla \varphi |^2. \end{aligned}$$
(4.30)

Differentiating (4.27) with respect to \(x_j\) and by using the Cauchy-Schwarz inequality we have

$$\begin{aligned} \sum _{j=1} ^n \left( \sum _{i=1} ^n \varphi _{x_i} \varphi _{x_i x_j} \right) ^2&= \sum _{j=1} ^n \left( \xi _{x_j} +(W^{\prime }+G^{\prime })\varphi _{x_j} \right) ^2 \nonumber \\&=|\nabla \xi | ^2 +2(W^{\prime }+G^{\prime }) \nabla \xi \cdot \nabla \varphi + (W^{\prime } +G^{\prime })^2 |\nabla \varphi | ^2\nonumber \\&\le |\nabla \varphi |^2 |\nabla ^2 \varphi |^2. \end{aligned}$$
(4.31)

On \(\{|\nabla \varphi |>0\}\), divide (4.31) by \(|\nabla \varphi |^2\) and substitute into (4.30) to obtain

$$\begin{aligned}&\partial _t \xi +\varepsilon u \cdot \nabla \xi -\Delta \xi \nonumber \\&\quad \le W^{\prime } (W^{\prime }+G^{\prime } ) -\frac{1}{|\nabla \varphi |^2} ( |\nabla \xi |^2 +2(W^{\prime }+G^{\prime }) \nabla \xi \cdot \nabla \varphi + (W^{\prime }+G^{\prime })^2 |\nabla \varphi |^2 ) \nonumber \\&\quad \quad -\,\varepsilon \nabla u \cdot ( \nabla \varphi \otimes \nabla \varphi )+ G^{\prime \prime } |\nabla \varphi |^2 \nonumber \\&\quad \le -(G^{\prime }) ^2 -W^{\prime }G^{\prime } -\frac{2(W^{\prime }+G^{\prime })}{|\nabla \varphi |^2} \nabla \xi \cdot \nabla \varphi -\varepsilon \nabla u \cdot ( \nabla \varphi \otimes \nabla \varphi )\nonumber \\&\quad \quad +\,G^{\prime \prime }|\nabla \varphi |^2 . \end{aligned}$$
(4.32)

By \(|\nabla \varphi |^2 = 2(\xi +W+G)\) and (4.32) we have on \(\{|\nabla \varphi |>0\}\)

$$\begin{aligned} \partial _t \xi +\varepsilon u\cdot \nabla \xi -\Delta \xi\le & {} -(G^{\prime })^2 -W^{\prime }G^{\prime } +2G^{\prime \prime } (\xi +W+G) \nonumber \\&-\frac{2(W^{\prime }+G^{\prime })}{|\nabla \varphi |^2} \nabla \xi \cdot \nabla \varphi -\varepsilon \nabla u \cdot ( \nabla \varphi \otimes \nabla \varphi ). \end{aligned}$$
(4.33)

Let \(\phi (x,t)=\phi (x) \in C^\infty (B_{3\varepsilon ^{-1}})\) be such that

$$\begin{aligned} \phi =\left\{ \begin{array}{l} M:=\sup _{\mathbb {R}^n \times [0,\varepsilon ^{-2} T]} \left( \frac{|\nabla \varphi |^2}{2} -W(\varphi ) \right) \quad \text {on}\, \ B_{3\varepsilon ^{-1}}{\setminus } B_{2\varepsilon ^{-1}}, \\ 0 \quad \text {on} \ B_{\varepsilon ^{-1}},\end{array}\right. \end{aligned}$$

and

$$\begin{aligned} 0\le \phi \le M, \ |\nabla \phi |\le 2\varepsilon M , \ |\Delta \phi | \le 2n\varepsilon ^2 M. \end{aligned}$$

Note that M may be bounded depending only on by Lemma 4.1. Note also that we may assume \(M>0\) since \(M\le 0\) implies our conclusion (4.26) immediately. Let

$$\begin{aligned} \tilde{\xi }:=\xi -\phi \quad \text {and} \quad \ G(\varphi ):=\varepsilon ^{\frac{1}{2}} \left( 1-\frac{1}{8} (\varphi -\gamma ) ^2 \right) , \end{aligned}$$

where \(\gamma \) is as in (3.2). To derive a contradiction, suppose that

$$\begin{aligned} \sup _{B_{\varepsilon ^{-1}}\times [0, \varepsilon ^{-2} T]} \xi \ge \varepsilon ^{\frac{1}{2}} .\end{aligned}$$

Since \(\tilde{\xi }\le 0\) on \( (B_{3\varepsilon ^{-1}}{\setminus } B_{2\varepsilon ^{-1}})\times [0, \varepsilon ^{-2} T]\), \(\tilde{\xi }\le \varepsilon ^{1-\beta } \) on \(B_{3\varepsilon ^{-1}} \times \{ 0 \}\) by (4.9) and \(\sup _{B_{\varepsilon ^{-1}} \times [0, \varepsilon ^{-2} T]} \tilde{\xi }\ge \varepsilon ^{\frac{1}{2}}\), there exists some interior maximum point \((x_0,t_0)\) of \(\tilde{\xi }\) where

$$\begin{aligned} \partial _t \tilde{\xi }\ge 0, \ \nabla \tilde{\xi }=0, \ \Delta \tilde{\xi }\le 0 \quad \text {and}\quad \tilde{\xi }\ge \varepsilon ^{\frac{1}{2}}\end{aligned}$$

hold. By the definition of \(\phi \) we have at the point \((x_0,t_0)\)

$$\begin{aligned} \partial _t \xi \ge 0, \ |\nabla \xi |\le 2\varepsilon M, \ \Delta \xi \le 2n\varepsilon ^2 M \quad \text {and} \quad |\nabla \varphi |^2 \ge 2\varepsilon ^{\frac{1}{2}}. \end{aligned}$$
(4.34)

Substitute (4.34) into (4.33). Using \(\varepsilon \nabla u \cdot ( \nabla \varphi \otimes \nabla \varphi )\le 2\varepsilon |\nabla u|(\xi +W+G)\) and (4.10), we have

$$\begin{aligned} 0\le & {} 2n\varepsilon ^2 M -(G^{\prime })^2 -W^{\prime } G^{\prime } + 2G^{\prime \prime }(\xi +W+G) +\frac{4(|W^{\prime }| + |G^{\prime }| )\varepsilon M}{\left( 2\varepsilon ^{\frac{1}{2}}\right) ^{\frac{1}{2}}} \nonumber \\&+ 2\varepsilon ^{1- \beta } (\xi +W+G)+2 \varepsilon ^{2-\beta }M. \end{aligned}$$
(4.35)

Since \(\beta <\frac{1}{2}\) and \(G^{\prime \prime }=-\varepsilon ^{\frac{1}{2}}/4\), for sufficiently small \(\varepsilon \) depending only on \(\beta \) and W,

$$\begin{aligned} 2G^{\prime \prime } (\xi +W+G) + 2 \varepsilon ^{1-\beta } (\xi +W+G) \le G^{\prime \prime } (W+G). \end{aligned}$$
(4.36)
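
The elementary properties of G used in this step can be checked directly. The following computation is a sketch; the shorthand \(X:=\xi +W+G=\frac{1}{2}|\nabla \varphi |^2\ge 0\) and the explicit smallness condition \(\varepsilon ^{\frac{1}{2}-\beta }\le \frac{1}{8}\) are introduced here only for illustration. Since \(|\varphi |\le 1\) and \(|\gamma |<1\), we have \((\varphi -\gamma )^2<4\) and

$$\begin{aligned} G^{\prime }=-\frac{\varepsilon ^{\frac{1}{2}}}{4}(\varphi -\gamma ),\quad G^{\prime \prime }=-\frac{\varepsilon ^{\frac{1}{2}}}{4},\quad \frac{\varepsilon ^{\frac{1}{2}}}{2}<G\le \varepsilon ^{\frac{1}{2}}. \end{aligned}$$

By (3.2), \(G^{\prime }\) and \(W^{\prime }(\varphi )\) never have opposite signs, hence \(W^{\prime }G^{\prime }\ge 0\). Finally, if \(\varepsilon ^{\frac{1}{2}-\beta }\le \frac{1}{8}\), then \(G^{\prime \prime }+2\varepsilon ^{1-\beta }\le 0\), and since \(\xi \ge 0\) at \((x_0,t_0)\),

$$\begin{aligned} 2G^{\prime \prime }X+2\varepsilon ^{1-\beta }X=G^{\prime \prime }X+(G^{\prime \prime }+2\varepsilon ^{1-\beta })X\le G^{\prime \prime }X\le G^{\prime \prime }(W+G), \end{aligned}$$

which is (4.36).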

If \(|\varphi (x_0,t_0)|\le \alpha \), then

$$\begin{aligned} G^{\prime \prime }(\varphi (x_0,t_0)) W(\varphi (x_0,t_0))\le -\frac{\varepsilon ^{\frac{1}{2}}}{4} \min _{|z| \le \alpha } W(z), \end{aligned}$$

which is a ‘big’ negative number compared to the rest, and one can check that this and (4.36) (as well as \(W^{\prime }G^{\prime }\ge 0\) and \(G>0\)) lead to a contradiction in (4.35). If \(|\varphi (x_0,t_0)|\ge \alpha \), then we would have ‘big’ negative contributions coming from (all evaluated at \((x_0,t_0)\))

$$\begin{aligned} (G^{\prime }) ^2 \ge \frac{\varepsilon (\alpha -|\gamma |)^2}{64} \quad \text {and} \quad -W^{\prime }G^{\prime }\le -\frac{\varepsilon ^{\frac{1}{2}}(\alpha -|\gamma |)}{4} |W^{\prime }|, \end{aligned}$$

which again lead to a contradiction in (4.35) for sufficiently small \(\varepsilon \). This shows that

$$\begin{aligned} \sup _{B_{\varepsilon ^{-1}}\times [0, \varepsilon ^{-2} T]} \left( \frac{|\nabla \varphi |^2}{2} -W(\varphi ) \right) \le 2\varepsilon ^{\frac{1}{2}} ,\end{aligned}$$

where \(G\le \varepsilon ^{\frac{1}{2}}\) is used. Now repeat the same argument, this time with M replaced by \(2\varepsilon ^{\frac{1}{2}}\) and G replaced by \(8\varepsilon ^{1-\beta } (1-\frac{1}{8} (\varphi -\gamma ) ^2)\). If we assume

$$\begin{aligned} \sup _{B_{\varepsilon ^{-1}}\times [0,\varepsilon ^{-2} T]} \xi \ge 2\varepsilon ^{1-\beta },\end{aligned}$$

\(\tilde{\xi }= \xi -\phi \) would attain some interior maximum in \(B_{3\varepsilon ^{-1}}\times [0,\varepsilon ^{-2} T]\) by (4.9) and by the subtraction of \(\phi \). This time we would have \(\partial _t \xi \ge 0, \ |\nabla \xi |\le 4\varepsilon ^{\frac{3}{2}}, \ \Delta \xi \le 4n\varepsilon ^{\frac{5}{2}}\) and \(|\nabla \varphi | ^2 \ge 4\varepsilon ^{1-\beta }\). With this (4.35) is

$$\begin{aligned} 0\le & {} 4n\varepsilon ^{\frac{5}{2}} -(G^{\prime })^2 -W^{\prime } G^{\prime } + 2G^{\prime \prime }(\xi +W+G) +\frac{8(|W^{\prime }| + |G^{\prime }| )\varepsilon ^{\frac{3}{2}} }{(4\varepsilon ^{1-\beta })^{\frac{1}{2}}} \\&+ 2\varepsilon ^{1- \beta } (\xi +W+G)+4 \varepsilon ^{\frac{5}{2}-\beta }. \end{aligned}$$

Exactly the same type of argument as before shows that we have a contradiction, and since \(G\le 8 \varepsilon ^{1- \beta } \) and \(\xi -G\le 2\varepsilon ^{1-\beta }\), we have (4.26). \(\square \)

Lemma 4.3

Let \(\mu _s^{\varepsilon }\), D(t) and \({\tilde{\rho }}_{(y,s)}\) be defined as in (4.4), (4.5) and (4.15). Let s, R, r be positive with \(0\le s-(\frac{R}{r})^2\le T\) and \(R\in (0,\frac{1}{2})\). Set \(\tilde{s}= s-(\frac{R}{r})^2\). Then there exists such that, for any \(y\in \Omega \), we have

Proof

First, on \(B_R (y)\) we compute

$$\begin{aligned} \int _{B_R(y)} \tilde{\rho }_{(y,s)}(x,\tilde{s}) \, d\mu ^{\varepsilon } _{\tilde{s}}&\le \left( \frac{r}{\sqrt{4\pi }R} \right) ^{n-1}\int _{B_R(y)} e^{-\frac{r^2 |x-y|^2}{4R^2}} \, d\mu ^{\varepsilon } _{\tilde{s}} \\&\le \left( \frac{r}{\sqrt{4\pi }R} \right) ^{n-1}\mu ^{\varepsilon }_{\tilde{s}}(B_R (y)). \end{aligned}$$

On \(\Omega {\setminus } B_R(y)\) we have

$$\begin{aligned}&\left( \frac{\sqrt{4\pi }R}{r} \right) ^{n-1} \int _{\Omega {\setminus } B_R(y)} \tilde{\rho }_{(y,s)}(x,\tilde{s}) \, d\mu ^{\varepsilon } _{\tilde{s}} \le \int _{B_{\frac{1}{2}} (y){\setminus } B_R(y)} e^{-\frac{r^2 |x-y|^2}{4R^2}} \, d\mu ^{\varepsilon } _{\tilde{s}} \nonumber \\&\quad \le \int _0 ^1 \mu ^{\varepsilon } _{\tilde{s}} \left( \left( B_{\frac{1}{2}} (y){\setminus } B_R(y)\right) \cap \left\{ x \ | \ e^{-\frac{r^2 |x-y|^2}{4R^2}}\ge \lambda \right\} \right) \, d\lambda \nonumber \\&\quad \le \int _0 ^{\exp ( -\frac{r^2}{16R^2}) } \mu ^{\varepsilon } _{\tilde{s}} \left( B_{\frac{1}{2}} (y){\setminus } B_R(y) \right) \, d\lambda + \int _{\exp ( -\frac{r^2}{16R^2}) } ^{\exp ( -\frac{r^2}{4}) } \mu ^{\varepsilon } _{\tilde{s}} ( B_{\frac{2R}{r} \sqrt{\log \lambda ^{-1}}} (y) ) \, d\lambda \nonumber \\&\quad \le \mu ^{\varepsilon } _{\tilde{s}}\left( B_{\frac{1}{2}}(y)\right) e^{-\frac{r^2}{16R^2} } + D(\tilde{s}) \omega _{n-1} \left( \frac{2R}{r} \right) ^{n-1} \int _{\frac{r^2}{4}} ^{\frac{r^2}{16R^2}}l^{\frac{n-1}{2}} e^{-l} \, dl \nonumber \\&\quad \le \mu ^{\varepsilon } _{\tilde{s}}\left( B_{\frac{1}{2}}(y)\right) e^{-\frac{r^2}{16R^2} }+ c(n) D(\tilde{s})\left( \frac{2R}{r} \right) ^{n-1} e^{-\frac{r^2}{8}}. \end{aligned}$$
(4.37)

Here we used the fact that there exists \(c=c(n)>0\) such that \(l^{\frac{n-1}{2}} e^{-l}\le c e^{-\frac{l}{2}}\) for any \(l>0\). \(\square \)
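
The constant in the last step of (4.37) can be made explicit. As a sketch, with the shorthand \(a:=\frac{n-1}{2}\) (used only here), the function \(l\mapsto l^{a}e^{-\frac{l}{2}}\) attains its maximum over \(l>0\) at \(l=2a=n-1\), so one admissible choice is

$$\begin{aligned} c=\sup _{l>0}\, l^{a}e^{-\frac{l}{2}}=\left( \frac{n-1}{e}\right) ^{\frac{n-1}{2}}, \end{aligned}$$

and consequently

$$\begin{aligned} \int _{\frac{r^2}{4}}^{\frac{r^2}{16R^2}} l^{\frac{n-1}{2}}e^{-l}\, dl\le c\int _{\frac{r^2}{4}}^{\infty }e^{-\frac{l}{2}}\, dl=2c\, e^{-\frac{r^2}{8}}, \end{aligned}$$

which accounts for the factor \(e^{-\frac{r^2}{8}}\) in the last line of (4.37).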

4.4 Proof of Theorem 4.1

In this subsection, we always work under the assumptions of Theorem 4.1. In particular, results from the two preceding subsections are available. Furthermore, from now on until Proposition 4.2, we assume

$$\begin{aligned} D(t)\le D_1 \end{aligned}$$
(4.38)

holds for \(t\in [0,T_1]\) and \(T_1\le T\). Here, \(D_1\ge 2D_0\) is a constant depending only on , and not on \(\varepsilon \), and which will be determined after Proposition 4.2. We need to be careful about the dependence of constants so that we do not end up with a circular argument. Any constant depending on \(D_1\) will again be a constant depending on . Note that such \(T_1>0\) exists because \(D_1>D_0\) and by the continuity of D(t) in time. Such continuity follows from that of \(\varphi \) in the case of \(\Omega ={\mathbb T}^n\), and additionally from (4.8) in the case of \(\Omega ={\mathbb R}^n\). \(T_1\) may depend on \(\varepsilon \) in general, but in the end, we prove that \(T_1=T\) as long as \(\varepsilon \) is sufficiently small. First, under this assumption we have the following a-priori estimate:

Lemma 4.4

There exists depending only on n, p, q such that for any \(0\le t_0<t_1\) we have

(4.39)

In particular, there exists \(E_0\) depending only on such that

$$\begin{aligned} \sup _{t\in [0,T_1]} \mu _t^{\varepsilon }(\Omega )+\frac{1}{2}\int _0^{T_1}\int _{\Omega }\varepsilon \left( \Delta \varphi -\frac{W^{\prime }(\varphi )}{\varepsilon ^2} \right) ^2 \,dxdt\le E_0. \end{aligned}$$
(4.40)

Proof

By (4.2) we can compute

$$\begin{aligned} \frac{d}{dt} \mu ^{\varepsilon } _t (\Omega ) \le - \frac{1}{2} \int _\Omega \varepsilon \left( \Delta \varphi -\frac{W^{\prime }(\varphi )}{\varepsilon ^2} \right) ^2 \, dx +\varepsilon \int _\Omega (u\cdot \nabla \varphi )^2 \,dx. \end{aligned}$$
(4.41)
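
The inequality (4.41) can be obtained as follows. This is a sketch in which \(L:=\Delta \varphi -\frac{W^{\prime }(\varphi )}{\varepsilon ^2}\), as in the proof of Proposition 4.1, and the integration by parts is assumed to be justified by periodicity or by the decay (4.8). Since \(\partial _t\varphi =L-u\cdot \nabla \varphi \) by (4.2),

$$\begin{aligned} \frac{d}{dt}\mu ^{\varepsilon }_t(\Omega )&=\int _{\Omega }\left( \varepsilon \nabla \varphi \cdot \nabla \partial _t\varphi +\frac{W^{\prime }(\varphi )}{\varepsilon }\partial _t\varphi \right) dx=-\int _{\Omega }\varepsilon L\,\partial _t\varphi \, dx\\&=-\int _{\Omega }\varepsilon L^2\, dx+\int _{\Omega }\varepsilon L\, u\cdot \nabla \varphi \, dx\le -\frac{1}{2}\int _{\Omega }\varepsilon L^2\, dx+\frac{1}{2}\int _{\Omega }\varepsilon (u\cdot \nabla \varphi )^2\, dx, \end{aligned}$$

where the last step uses \(ab\le \frac{1}{2}a^2+\frac{1}{2}b^2\). This implies (4.41).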

To estimate the last term of (4.41), we consider the two cases \(p<2\) and \(p\ge 2\) separately. We also treat \(\Omega ={\mathbb T}^n\) and \(\Omega ={\mathbb R}^n\) separately, beginning with \({\mathbb T}^n\). Let \(\{\psi _{\alpha }\}_{\alpha }\) be a partition of unity on \(\Omega \) such that \(\psi _{\alpha }\in C^{\infty }_c(\Omega )\), \(\mathrm{diam}\,(\mathrm{spt}\,\psi _{\alpha })\le 1/2\) and \(\Vert \psi _{\alpha }\Vert _{C^2}\le c(n)\). Consider the case \(p<2\) first. Just as in (2.12), by setting \(s:= \frac{p(n-1)}{n-p}\ge 2\), we have

$$\begin{aligned} \varepsilon \int _{\Omega }(u\cdot \nabla \varphi )^2\, dx\le & {} \left( \int _{\Omega } |u|^s\varepsilon |\nabla \varphi |^2\, dx\right) ^{\frac{2}{s}} (2\mu _t^{\varepsilon }(\Omega ))^{1-\frac{2}{s}} \nonumber \\\le & {} \left( \sum _{\alpha }c(n,p)\int _{\Omega } |\psi _{\alpha } u|^s\varepsilon |\nabla \varphi |^2\, dx\right) ^{\frac{2}{s}} (2D(t))^{1-\frac{2}{s}} \nonumber \\\le & {} \left( \sum _{\alpha } c(n,p)D(t)\left( \int _{\mathrm{spt}\, \psi _{\alpha }}|u|^p+|\nabla u|^p\, dx\right) ^{\frac{s}{p}}\right) ^{\frac{2}{s}} (2D(t))^{1-\frac{2}{s}}\nonumber \\\le & {} c(n,p) D(t) \Vert u(\cdot ,t)\Vert _{W^{1,p}(\Omega )}^2 \end{aligned}$$
(4.42)

where the constant \(c(n,p)\) may differ from line to line. We used the local finiteness of \(\{\psi _{\alpha }\}_{\alpha }\) and \(\sum _{\alpha } A_{\alpha }^{\frac{s}{p}}\le (\sum _{\alpha } A_{\alpha })^{\frac{s}{p}}\) since \(\frac{s}{p}\ge 1\). For \(p\ge 2\), we have

$$\begin{aligned} \varepsilon \int _{\Omega }(u\cdot \nabla \varphi )^2\, dx\le & {} \left( \int _{\Omega } |u|^p\varepsilon |\nabla \varphi |^2\, dx\right) ^{\frac{2}{p}} (2\mu _t^{\varepsilon }(\Omega ))^{1-\frac{2}{p}}\nonumber \\\le & {} \left( \sum _{\alpha } c(n,p)\int _{\Omega } |\psi _{\alpha } u|^p\, \varepsilon |\nabla \varphi |^2\, dx\right) ^{\frac{2}{p}} (2D(t))^{1-\frac{2}{p}} \nonumber \\\le & {} \left( \sum _{\alpha } c(n,p) D(t) \int _{\mathrm{spt}\,\psi _{\alpha }} |u|^p+|u|^{p-1}|\nabla u|\, dx\right) ^{\frac{2}{p}} (2D(t))^{1-\frac{2}{p}} \nonumber \\\le & {} c(n,p)D(t)\Vert u(\cdot ,t)\Vert _{W^{1,p}(\Omega )}^2. \end{aligned}$$
(4.43)

Here we used (2.11) with \(p=1\) there and \(\phi =|\psi _{\alpha } u|^p\). Integration of (4.41) over \([t_0,t_1]\) using (4.42) or (4.43) gives (4.39). We define \(E_0\) to be . In case of \(\Omega ={\mathbb R}^n\), we do not need to take the partition of unity and the proof proceeds similarly. \(\square \)

In the following we define \(\beta ^{\prime }\) by

$$\begin{aligned} \beta ^{\prime }:=\frac{1+\beta }{2}. \end{aligned}$$

In fact, any number \(\beta ^{\prime }\in (\beta ,1)\) can be used. To fix the idea, we specify such \(\beta ^{\prime }\), and suppose that \(\beta ^{\prime }\) depends on \(\beta \) for simplicity.

Lemma 4.5

There exist , and with depending only on and \(D_0\) with the following property. Assume and \(|\varphi (y,s)|\le \alpha <1\) with \(s\in (0,T_1]\). Here \(\alpha \) is from (3.3). Then for any \(t\in [0,T_1 ]\) with \(\max \{ 0,s-2\varepsilon ^{2\beta \prime } \} \le t \le s \) we have

(4.44)

where .

Proof

We will choose and assume for the moment that . Set \(\tilde{\rho }= \tilde{\rho }_{(y,s+\varepsilon ^2)} (x,t)\) in this proof. Assume \(|\varphi (y,s)| \le \alpha <1 \). We have

$$\begin{aligned} \int _{\Omega } \tilde{\rho }\, d\mu ^{\varepsilon } _s (x) = \int _{\varepsilon ^{-1}\Omega } \frac{e^{-\frac{|\tilde{x}|^2}{4}}}{(\sqrt{4\pi } )^{n-1}} \eta (\varepsilon \tilde{x}) \left( \frac{|\nabla \tilde{\varphi }|^2 }{2} +W(\tilde{\varphi }) \right) \, d\tilde{x}, \end{aligned}$$

where \(\tilde{\varphi }(\tilde{x},s ) = \varphi (\varepsilon \tilde{x} +y ,s)\). By \(|\tilde{\varphi }(0,s)| \le \alpha <1\) and Lemma 4.1 there exists such that

(4.45)

From (4.10), (4.17), (4.26), (4.40) and we have for \(\lambda \in [t,s )\)

(4.46)

Here \(\int _{\Omega } \tilde{\rho }\, dx\le \sqrt{4\pi (s-t)}\) is used. Multiply (4.46) by \(e^{ \varepsilon ^{-2 \beta }(s-\lambda )}\) and integrate over [t, s]. By \(t\ge \max \{0,s-2\varepsilon ^{2\beta ^{\prime }}\}\) we have

(4.47)

By (4.45) and (4.47) for sufficiently small \(\varepsilon \) depending only on \(D_1,\beta ,n\) and we have

(4.48)

Next we use Lemma with , where we may assume that and . We chose this r so that

(4.49)

In Lemma , we replace s and \(s-(\frac{R}{r})^2\) by \(s+\varepsilon ^2\) and t respectively. Remark that \(R:= r (s+\varepsilon ^2 -t )^{\frac{1}{2}}\le r(\varepsilon ^2 +2 \varepsilon ^{2\beta \prime })^{\frac{1}{2}}\) since \(s-t\le 2\varepsilon ^{2\beta \prime }\). Hence we have \(R< \frac{1}{2}\) by restricting \(\varepsilon \) depending only on and . From (4.38), (4.40) and Lemma we have

(4.50)

Note that \(r/R\ge \varepsilon ^{-\beta ^{\prime }}/\sqrt{3}\). By (4.50), (4.48) and (4.49) for sufficiently small \(\varepsilon \) we obtain

Set and and we have the desired estimate (4.44). Note that the restriction on \(\varepsilon \) depends on , , \(D_1\), . Examining the dependence, we may conclude the proof. \(\square \)

Lemma 4.6

There exists and depending only on n, , , p, q, T, W, \(\beta \) and \(D_0\) with the following property. For any \(r\in (\varepsilon ^{\beta \prime } , \frac{1}{2})\) and \(t\in [2\varepsilon ^{2\beta \prime } ,T]\cap [0,T_1]\), we have

(4.51)

provided .

Proof

We only need to prove the claim when \(T_1\ge 2\varepsilon ^{2\beta \prime }\) since the claim is vacuously true otherwise. Let \(y\in \Omega \), \(r\in (\varepsilon ^{\beta \prime },\frac{1}{2})\) and \(t_*\in [2\varepsilon ^{2\beta \prime },T]\cap [0,T_1 ]\) be arbitrary and fixed. We define

$$\begin{aligned} \tilde{A}:= \left\{ x\in B_{2r} (y) \, : \, \text {for some} \ \tilde{t} \ \text {with} \ t_*-\varepsilon ^{2\beta \prime } \le \tilde{t} \le t_*, \ |\varphi (x,\tilde{t}) |\le \alpha \right\} , \end{aligned}$$

By Vitali’s covering theorem applied to , there exists a set of pairwise disjoint balls such that

(4.52)

By the definition of \(\tilde{A}\), for each \(x_i\) there exists \( \tilde{t}_i\) such that

$$\begin{aligned} t_*-\varepsilon ^{2\beta \prime } \le \tilde{t}_i \le t_*, \ \ | \varphi (x_i,\tilde{t}_i)| \le \alpha . \end{aligned}$$
(4.53)

Define \(\hat{t}:=t_*-2\varepsilon ^{2\beta \prime }\). Since \(t_*\ge 2\varepsilon ^{2\beta \prime }\), we have \(\hat{t}\ge 0\). By (4.53),

$$\begin{aligned} \varepsilon ^{2 \beta \prime } \le \tilde{t}_i -\hat{t}\le 2\varepsilon ^{2\beta \prime } \end{aligned}$$
(4.54)

and the assumption of Lemma 4.5 is satisfied for \(s=\tilde{t}_i , \ y= x_i , \ t=\hat{t}\) and if . Hence we may conclude that

(4.55)

By (4.54), we have , which shows

(4.56)

from (4.55) with . Since are pairwise disjoint and , (4.56) gives

(4.57)

Hence the n-dimensional volume of A is estimated by (4.52) and (4.57)

By (4.38) and \(r\ge \varepsilon ^{\beta \prime }\),

(4.58)

where . Hence by (4.26) and (4.58)

(4.59)

Next we estimate the surface energy on the complement of A, which decays very quickly. Define \(\phi \in \text {Lip} (B_{2r} (y)) \) such that

$$\begin{aligned}&\phi (x):=\left\{ \begin{array}{l} 1 \quad \text {if}~\, x\in B_r (y){\setminus } A, \\ 0 \quad \text {if}~\, \mathrm{dist}(x,B_r (y){\setminus } A) \ge \varepsilon ^{\beta ^{\prime }},\end{array} \right. \\ \end{aligned}$$
$$\begin{aligned} |\nabla \phi |\le 2\varepsilon ^{ -\beta ^{\prime }} \quad \text {and} \quad 0\le \phi \le 1.\end{aligned}$$

By \(r\ge \varepsilon ^{\beta \prime }\), and the definitions of \(\tilde{A}\) and \(\phi \), we have \({\mathrm{spt}} \phi \cap {\tilde{A}} =\emptyset \), hence

$$\begin{aligned} |\varphi (x,s)|\ge \alpha , \quad \text {for} \ x\in \mathrm{spt}\,\phi , \ s\in [t_*-\varepsilon ^{2\beta ^{\prime }},t_*]. \end{aligned}$$
(4.60)

For each j, differentiate Eq. (4.2) with respect to \(x_j\), multiply by \(\phi ^2 \frac{\partial \varphi }{\partial x_j}\), sum over j and integrate to obtain

$$\begin{aligned}&\frac{d}{dt} \int _{\Omega } \frac{1}{2} |\nabla \varphi |^2 \phi ^2 \, dx + \int _{\Omega } (u\otimes \nabla \varphi \cdot \nabla ^2 \varphi + \nabla \varphi \otimes \nabla \varphi \cdot \nabla u) \phi ^2 \, dx \nonumber \\&\quad =\int _\Omega \left( \nabla \varphi \cdot \Delta \nabla \varphi -\frac{W^{\prime \prime }(\varphi )}{\varepsilon ^2} |\nabla \varphi |^2 \right) \phi ^2 \, dx. \end{aligned}$$
(4.61)

By integration by parts and the Cauchy-Schwarz inequality, (4.61) gives

$$\begin{aligned} \frac{d}{dt} \int _{\Omega } \frac{1}{2} |\nabla \varphi |^2 \phi ^2 \, dx&\le \frac{1}{2} \int _{\Omega } |u| ^2 |\nabla \varphi |^2 \phi ^2 \, dx + \int _\Omega |\nabla \varphi |^2 |\nabla u| \phi ^2 \, dx \nonumber \\&\quad +\, 4\int _\Omega |\nabla \phi | ^2 |\nabla \varphi |^2 \, dx -\int _\Omega \frac{W^{\prime \prime } (\varphi )}{\varepsilon ^2} |\nabla \varphi |^2 \phi ^2 \, dx.\qquad \quad \end{aligned}$$
(4.62)

By (4.60), \(W^{\prime \prime }(\varphi )\ge \kappa \) on \(\mathrm{spt}\phi \) for \(t\in [t_*-\varepsilon ^{2\beta \prime } ,t_*]\). By (4.10) and the definition of \(\phi \), (4.62) gives

$$\begin{aligned} \frac{d}{dt} \int _{\Omega } \frac{1}{2} |\nabla \varphi |^2 \phi ^2 \, dx\le & {} \int _{\Omega } \left( \frac{\varepsilon ^{-2\beta }}{2} + \varepsilon ^{-1-\beta }\right) |\nabla \varphi |^2 \phi ^2 \, dx + 16 \varepsilon ^{-2\beta ^{\prime }} \int _{\mathrm{spt}\phi }|\nabla \varphi |^2 \, dx \nonumber \\&-\frac{\kappa }{\varepsilon ^2}\int _\Omega |\nabla \varphi |^2 \phi ^2 \, dx \nonumber \\\le & {} -\frac{\kappa }{2\varepsilon ^2} \int _\Omega |\nabla \varphi |^2 \phi ^2 \, dx +16 \varepsilon ^{-2\beta ^{\prime } } \int _{\mathrm{spt}\phi } |\nabla \varphi | ^2 \,dx \end{aligned}$$
(4.63)

for small \(\varepsilon \). By integrating (4.63) over \([t_*-\varepsilon ^{2\beta \prime },t_*]\), we obtain

$$\begin{aligned} \int _{\Omega } \frac{1}{2} |\nabla \varphi |^2 \phi ^2 (x,t_*)\, dx\le & {} e^{-\kappa \varepsilon ^{2(\beta ^{\prime }-1)} } \int _{\Omega } \frac{1}{2} |\nabla \varphi |^2 \phi ^2 (x,t_*- \varepsilon ^{2\beta \prime }) \, dx \nonumber \\&+ \int _{t_*-\varepsilon ^{2\beta \prime }} ^{t_*} e^{-\frac{\kappa }{\varepsilon ^{2}}(t_*-\lambda )} 16 \varepsilon ^{-2 \beta ^{\prime }} \left( \int _{\mathrm{spt}\phi } |\nabla \varphi |^2(x,\lambda ) \, dx \right) \, d\lambda .\nonumber \\ \end{aligned}$$
(4.64)

Define

$$\begin{aligned} M: = \sup _{\lambda \in [t_*-\varepsilon ^{2\beta \prime },t_*]} \int _{\mathrm{spt}\phi } \frac{1}{2} |\nabla \varphi |^2 (x,\lambda ) \, dx. \end{aligned}$$

By (4.64) we have

$$\begin{aligned} \int _\Omega \frac{1}{2} |\nabla \varphi |^2 \phi ^2 (x,t_*) \, dx \le \left( e^{-\kappa \varepsilon ^{2(\beta ^{\prime }-1)}} +32\kappa ^{-1} \varepsilon ^{2-2\beta ^{\prime }}\right) M. \end{aligned}$$
(4.65)
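
The passage from (4.63) to (4.65) is a standard integrating-factor argument, which may be sketched as follows. The abbreviations f and g are introduced only for this computation and are not used elsewhere. Set

$$\begin{aligned} f(\lambda ):=\int _\Omega \frac{1}{2}|\nabla \varphi |^2\phi ^2(x,\lambda )\, dx,\qquad g(\lambda ):=16\varepsilon ^{-2\beta '}\int _{\mathrm{spt}\,\phi }|\nabla \varphi |^2(x,\lambda )\, dx, \end{aligned}$$

so that (4.63) reads \(f'\le -\frac{\kappa }{\varepsilon ^2}f+g\) on \([t_*-\varepsilon ^{2\beta '},t_*]\). Multiplying by \(e^{\kappa \lambda /\varepsilon ^2}\) and integrating over this interval gives (4.64). Since \(0\le \phi \le 1\), we have \(f\le M\) and \(g\le 32\varepsilon ^{-2\beta '}M\) there, hence

$$\begin{aligned} f(t_*)&\le e^{-\kappa \varepsilon ^{2(\beta '-1)}}M+32\varepsilon ^{-2\beta '}M\int _{t_*-\varepsilon ^{2\beta '}}^{t_*}e^{-\frac{\kappa }{\varepsilon ^2}(t_*-\lambda )}\, d\lambda \nonumber \\&\le \left( e^{-\kappa \varepsilon ^{2(\beta '-1)}}+32\kappa ^{-1}\varepsilon ^{2-2\beta '}\right) M, \end{aligned}$$

where the last step uses \(\int _{-\infty }^{t_*}e^{-\frac{\kappa }{\varepsilon ^2}(t_*-\lambda )}\, d\lambda =\varepsilon ^2/\kappa \).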

By \(\mathrm{spt}\,\phi \subset B_{2r}(y)\) and (4.38)

$$\begin{aligned} \varepsilon M\le \omega _{n-1} D_1 (2r)^{n-1} . \end{aligned}$$
(4.66)

Since \(B_r (y){\setminus } A \subset \{ \phi =1 \} \), we have

$$\begin{aligned} \int _{B_r(y) {\setminus } A} \frac{\varepsilon }{2} |\nabla \varphi |^2 (x,t_*) \, dx \le \int _{\Omega } \frac{\varepsilon }{2} |\nabla \varphi |^2 (x,t_*) \phi ^2 \, dx. \end{aligned}$$
(4.67)

Recall that \(\beta ^{\prime }<1\). By (4.65)–(4.67), we obtain for sufficiently small \(\varepsilon \) (depending only on \(\kappa \))

$$\begin{aligned} \int _{B_r(y) {\setminus } A} \frac{\varepsilon }{2} |\nabla \varphi |^2 (x,t_*) \, dx \le 33\kappa ^{-1} \varepsilon ^{2-2\beta ^{\prime }} D_1 \omega _{n-1} (2r)^{n-1}. \end{aligned}$$
(4.68)

By (4.59) and (4.68), and since \(\beta ^{\prime }-\beta =\frac{1-\beta }{2} < 2-2\beta ^{\prime }=1-\beta \), we obtain (4.51) with an appropriate choice of . \(\square \)

Later in Sect. 7, we use the following estimate which follows from Lemma 4.6.

Corollary 4.1

For any \(0<r<\frac{1}{2}\), and \(t\in [2\varepsilon ^{2\beta \prime },T]\cap [0,T_1]\), we have

(4.69)

Proof

For the integration over the range \(\tau \in (0,\varepsilon ^{\beta \prime })\), we simply use the estimate (4.26). For the range \(\tau \in (\varepsilon ^{\beta \prime },r)\), we use (4.51). \(\square \)

Lemma 4.7

There exists a constant depending only on n, , , p, q, T, \(D_0\), W, \(\beta \) such that for , \(t\in [0,T_1 ]\) and \(t<s\), we have

(4.70)

Proof

If \(t\le 2\varepsilon ^{2\beta '}\) then by using (4.26) and \(\int \rho \, dx =\sqrt{4\pi (s-\lambda )}\) we have

$$\begin{aligned}&\int _0 ^t \Big \{ \frac{1}{2(s-\lambda )} \int _\Omega \left( \frac{\varepsilon |\nabla \varphi | ^2}{2}-\frac{W(\varphi )}{\varepsilon } \right) _{+} \tilde{\rho }_{(y,s)} (x,\lambda ) \, dx \Big \} \, d\lambda \nonumber \\&\quad \le \int _0 ^t \frac{10 \varepsilon ^{-\beta } \sqrt{\pi }}{\sqrt{s-\lambda } } \, d\lambda \le 20\sqrt{2\pi } \varepsilon ^{\beta '- \beta }. \end{aligned}$$
(4.71)
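
The last inequality in (4.71) rests on an elementary time integral, which we record here as a sketch. Since \(\sqrt{s}-\sqrt{s-t}\le \sqrt{t}\) for \(0\le t<s\) and \(t\le 2\varepsilon ^{2\beta '}\),

$$\begin{aligned} \int _0^t\frac{d\lambda }{\sqrt{s-\lambda }}=2\left( \sqrt{s}-\sqrt{s-t}\right) \le 2\sqrt{t}\le 2\sqrt{2}\, \varepsilon ^{\beta '}, \end{aligned}$$

and multiplying by \(10\varepsilon ^{-\beta }\sqrt{\pi }\) gives the bound \(20\sqrt{2\pi }\, \varepsilon ^{\beta '-\beta }\).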

By a similar argument, if \(s>t\ge s-2\varepsilon ^{2\beta '}\) then we have

$$\begin{aligned}&\int _{s-2\varepsilon ^{2\beta '}} ^t \left\{ \frac{1}{2(s-\lambda )} \int _\Omega \left( \frac{\varepsilon |\nabla \varphi | ^2}{2}-\frac{W(\varphi )}{\varepsilon } \right) _{+} \tilde{\rho }_{(y,s)} (x,\lambda ) \, dx \right\} \, d\lambda \nonumber \\&\quad \le 20 \sqrt{2\pi } \varepsilon ^{\beta '-\beta }. \end{aligned}$$
(4.72)

Hence we only need to estimate the integral over \([2\varepsilon ^{2\beta '},t]\) with \(t\le s-2\varepsilon ^{2\beta '}\). First we estimate on \(B_{\varepsilon ^{\beta '}} (y)\). We compute using (4.26) and \(s-t\ge 2\varepsilon ^{2\beta '}\) that

$$\begin{aligned}&\int _{2\varepsilon ^{2\beta '}} ^t \frac{1}{2(s-\lambda )} \int _{B_{\varepsilon ^{\beta '}}} \Big ( \frac{\varepsilon |\nabla \varphi | ^2}{2}-\frac{W(\varphi )}{\varepsilon } \Big )_{+} \tilde{\rho }\, dxd\lambda \nonumber \\&\quad \le \int _{2\varepsilon ^{2\beta '}} ^t \frac{10 \varepsilon ^{-\beta }\varepsilon ^{n\beta '} \omega _n }{2(s-\lambda )^{\frac{n+1}{2}} (\sqrt{4\pi })^{n-1}} \,d\lambda \le \frac{10 \varepsilon ^{\beta ' -\beta } \omega _n}{(\sqrt{8\pi })^{n-1}(n-1)}. \end{aligned}$$
(4.73)
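
The time integration in (4.73) can be checked directly; the following is a sketch for \(n\ge 2\). Using \(s-t\ge 2\varepsilon ^{2\beta '}\),

$$\begin{aligned} \int _{2\varepsilon ^{2\beta '}}^t\frac{d\lambda }{(s-\lambda )^{\frac{n+1}{2}}}\le \frac{2}{n-1}(s-t)^{-\frac{n-1}{2}}\le \frac{2}{n-1}\left( 2\varepsilon ^{2\beta '}\right) ^{-\frac{n-1}{2}}=\frac{2\, \varepsilon ^{-(n-1)\beta '}}{(n-1)(\sqrt{2})^{n-1}}, \end{aligned}$$

so that the middle expression of (4.73) is bounded by

$$\begin{aligned} \frac{10\varepsilon ^{-\beta }\varepsilon ^{n\beta '}\omega _n}{2(\sqrt{4\pi })^{n-1}}\cdot \frac{2\, \varepsilon ^{-(n-1)\beta '}}{(n-1)(\sqrt{2})^{n-1}}=\frac{10\varepsilon ^{\beta '-\beta }\omega _n}{(\sqrt{8\pi })^{n-1}(n-1)}. \end{aligned}$$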

On \(\Omega \setminus B_{\varepsilon ^{\beta '} }(y) \), by (4.51), \(s-t \ge 2\varepsilon ^{2\beta '}\) and computations similar to (4.37), we have

(4.74)

By (4.71)–(4.74) we obtain the desired estimate. \(\square \)

To utilize the formula (4.17), we next obtain the estimate for u.

Lemma 4.8

There exists depending only on n, p and q such that for any \(t_0, t_1\) with \(s>t_1>t_0 \ge 0\) we have

(4.75)

where (1) \(0<\hat{p}=\frac{2pq -2p-nq}{pq} \) when \(p<n\), (2) \(\hat{p}<\frac{q-2}{q}\) may be taken arbitrarily close to \(\frac{q-2}{q}\) when \(p=n\) (and depends on \(\hat{p})\), and (3) \(\hat{p}=\frac{q-2}{q}\) when \(p>n\).

Proof

First, consider the case \(p<n\). By the Hölder inequality, for \(l:=\frac{p(n-1)}{2(n-p)}\) (which is \(\ge 1\) due to (2.13)) we have

$$\begin{aligned} \int _{\Omega } \tilde{\rho }|u|^2 \, d\mu ^{\varepsilon } _t&\le \left( \int _{\Omega } |\eta ^{\frac{1}{2}}u|^{2l} \rho \, d\mu ^{\varepsilon } _t \right) ^{\frac{1}{l}} \left( \int _{B_{\frac{1}{2}}(y)} \rho \, d\mu ^{\varepsilon } _t \right) ^{\frac{l-1}{l}} \nonumber \\&\le (D(t))^{\frac{l-1}{l}} \left( \int _{\Omega } |u\eta ^{\frac{1}{2}}|^{2l} \rho \, d\mu ^{\varepsilon } _t \right) ^{\frac{1}{l}} \nonumber \\&\le (D(t))^{\frac{l-1}{l}} \left( \frac{1}{(4\pi (s-t))^{\frac{n-1}{2}}} \int _{\Omega } |u\eta ^{\frac{1}{2}}|^{2l} \, d\mu ^{\varepsilon } _t \right) ^{\frac{1}{l}}. \end{aligned}$$
(4.76)

By (4.76) and (2.11) we have

(4.77)

where . Hence by the Hölder inequality and (4.77) we obtain (with \(\Vert u\Vert := \Vert u\Vert _{L^{q}([t_0,t_1];(W^{1,p}(B_{\frac{1}{2}}(y)))^n)}\))

We remark that \((s-t_0)^\iota - (s-t_1)^\iota \le (t_1-t_0)^\iota \) for \(\iota \in (0,1)\) and \( \frac{-(n-p)q}{p(q-2)} +1 \in (0,1)\). By setting , we obtain the desired estimate when \(p<n\). For \(p=n\), since \(W^{1,n}_{loc} \subset W^{1,p^{\prime }}_{loc}\) for \(p^{\prime }<n\), we repeat the same argument as above for p close to n. Note that \(\frac{2pq-2p-nq}{pq} \uparrow \frac{q-2}{q}\) as \(p\uparrow n\). This gives the estimate for the case \(p=n\). For \(p>n\), the Sobolev embedding gives \(\sup _{B_{\frac{1}{2}}(y)}|\eta ^{\frac{1}{2}}u|\le c(n,p)\Vert u\Vert _{W^{1,p}(B_{\frac{1}{2}}(y))}\). Thus \(\int \tilde{\rho }|u|^2\, d\mu _t^{\varepsilon } \le c(n,p)D(t) \Vert u\Vert _{W^{1,p}(B_{\frac{1}{2}}(y))}^2\). This gives the desired estimate for \(p>n\). \(\square \)
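
Two elementary facts used in the proof above may be verified as follows; this is only a sketch, and the symbols a, b are introduced solely for it. For \(\iota \in (0,1)\) the function \(\tau \mapsto \tau ^{\iota }\) is subadditive on \([0,\infty )\), so \(a^{\iota }\le (a-b)^{\iota }+b^{\iota }\) whenever \(a\ge b\ge 0\); the remark in the proof is the case \(a=s-t_0\), \(b=s-t_1\). For the exponent in case (1),

$$\begin{aligned} \frac{2pq-2p-nq}{pq}=2-\frac{2}{q}-\frac{n}{p}, \end{aligned}$$

which is increasing in p and tends to \(2-\frac{2}{q}-1=\frac{q-2}{q}\) as \(p\uparrow n\).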

Proposition 4.2

There exist depending only on n, depending only on \(n,\,p,\,q\) and depending only on with the following property. For \(t_0, t_1\) with \(T_1 \ge t_1 >t_0 \ge 0\) and \(t_1-t_0\le 1\), suppose and . Then, if , we have

(4.78)

where \(\hat{p}\) is as in Lemma 4.8.

Proof

First, for any \(s>t_0\), by direct computation and by the definition of \(D(t_0)\), we have

$$\begin{aligned} \int _{\Omega }\tilde{\rho }_{(y,s)}\, d\mu _{t_0}^{\varepsilon }\le D(t_0). \end{aligned}$$
(4.79)

Let be a constant defined by

(4.80)

By definition, depends only on n. Suppose that \(t_1\) satisfies the assumptions. Recalling the definition of \(D(t_1)\), we have the following three possibilities, (a) \(D(t_1)=\mu _{t_1}^{\varepsilon }(\Omega )\), (b) there exists \(B_r(y)\subset \Omega \) such that \(D(t_1)=\frac{1}{\omega _{n-1} r^{n-1}} \mu ^{\varepsilon } _{t_1} (B_r (y))\) and \(r\ge \frac{1}{4}\), and (c) the same as (b) except that \(r<\frac{1}{4}\). For (b), we have the following

$$\begin{aligned} \frac{\omega _{n-1}}{4^{n-1}} D(t_1)\le \omega _{n-1} r^{n-1}D(t_1)=\mu _{t_1}^{\varepsilon }(B_r(y))\le \mu _{t_1}^{\varepsilon }(\Omega ). \end{aligned}$$

Since \(\omega _{n-1}/4^{n-1}\le 1\), in either case (a) or (b) we have

$$\begin{aligned} \frac{\omega _{n-1}}{4^{n-1}} D(t_1)\le \mu _{t_1}^{\varepsilon }(\Omega ). \end{aligned}$$
(4.81)

Then, by (4.39), we obtain with (4.81) that

(4.82)

where \(\Vert u\Vert :=\Vert u\Vert _{L^q([t_0,t_1];(W^{1,p}(\Omega ))^n)}\). By (4.80), , thus (4.82) shows

(4.83)

This is the conclusion deduced from (a) and (b). Next consider the case (c). Let \(s=t_1 +r^2\). By (4.17), (4.70), (4.75) and (4.39), we have

(4.84)

We compute using \(\eta =1\) on \(B_{\frac{1}{4}}(y)\) and \(r\le \frac{1}{4}\) that

(4.85)

where \(s=t_1+r^2\), the properties of \(t_1\) and are used. By (4.79), the estimates (4.84) and (4.85) give (using also \(t_1-t_0\le 1\))

(4.86)

Since \(D(t_0)\ge 1\) by definition, we may restrict \(\varepsilon \) depending on (see Lemma 4.7) so that , for example. Now, examining the dependence of constants, we obtain (4.78) from (4.83) and (4.86) by choosing an appropriate . Here we also use \(\hat{p}< 2-\frac{2}{q}\) and \(t_1-t_0\le 1\). \(\square \)

Proof of Theorem 4.1

We first choose \(0<T_b\le 1\) so that

(4.87)

holds. Due to the dependence of , \(T_b\) depends only on . Then set

(4.88)

so that \(D_1\) depends only on . Finally restrict as in Proposition 4.2. Now we claim that

(4.89)

holds for all \(t\in [0,T]\), thus proving \(D(t)\le D_1\) for all \(t\in [0,T]\) and \(T_1=T\). Suppose there exists \(0<t\le T\) such that (4.89) fails. Then there must exist some \(0<T_1<T\) such that for all \(t\in [0,T_1]\) and . Note that \(D(t)\le D_1\) for \(t\in [0,T_1]\), satisfying (4.38). If \(T_1<T_b\), we apply Proposition 4.2 with \(t_0=0\) and \(t_1=T_1\). We have and . Thus (4.78) shows

but this contradicts \(T_1<T_b\) and (4.87). Thus, we have \(T_1\ge T_b\). If \(T_1\in [T_b,2T_b)\), then . Thus there must exist \(t_0\in [T_b,T_1)\) such that and \(T_1-t_0<T_b\) (note that for all \(t\in [0,T_b)\)). By Proposition 4.2 with \(t_1=T_1\), we have , again contradicting \(T_1-t_0<T_b\) and (4.87). Continuing in this manner, we conclude that \(T_1=T\), which is a contradiction. Thus we have proved that (4.89) holds for all \(t\in [0,T]\). This also concludes the proof of Theorem 4.1. \(\square \)

Since we proved \(T=T_1\), i.e., the assumption (4.38) holds on all of [0, T], all the estimates in this section hold with \(T_1\) replaced by T. In particular, we have the following monotonicity formula, which follows from (4.17), (4.75) and (4.70).

Theorem 4.2

Under the same assumptions as in Theorem 4.1, if \(\varepsilon <\epsilon _1\), then for \(s>t_1>t_0\), \(t_0, t_1\in [0,T]\), and \(y\in \Omega \) we have

(4.90)

where \(\tilde{\rho }=\tilde{\rho }_{(y,s)}(x,t)\) and \(\xi _{\varepsilon }\) are defined as in (4.15) and (4.16), and \(\hat{p}\) is as in Lemma 4.8.

The point of the right-hand side is that it is bounded independently of \(\varepsilon \), and it can be made arbitrarily small when \(\varepsilon \rightarrow 0\) and \(t_0 \rightarrow t_1\).

5 Existence of limit measures

In this section we construct a sequence of approximate diffused interface solutions for (1.2), given any bounded hypersurface \(M_0=\partial \Omega _0\) which is \(C^1\), and any vector field u satisfying (2.15). We then prove that we may extract a subsequence which converges to a family of Radon measures \(\{\mu _t\}_{t\ge 0}\).

We first construct a sequence of domains \(\Omega _0^i\) with \(C^{\infty }\) boundaries \(M_0^i\) converging to \(M_0\) in the \(C^1\) topology. This can be carried out by locally representing \(M_0\) by a \(C^1\) graph and by some suitable mollification. Let \(d_i\) be the signed distance function to \(M_0^i\) which is positive inside of \(\Omega _0^i\), and which is smooth in some \(r_i\)-neighborhood of \(M_0^i\). Let \(h_i\in C^{\infty }({\mathbb R})\) be a monotone increasing function such that \(h_i(s)=s\) for \(0\le s\le r_i/3\), \(h_i(s)=r_i/2\) for \(s>2r_i/3\), \(h_i^{\prime }(s)\le 1\) for \(s>0\) and \(h_i(s)=-h_i(-s)\) for \(s<0\). Then define \(\tilde{d}_i(x):=h_i(d_i(x))\) for \(x\in \Omega \). We next choose a sequence of \(\varepsilon _i>0\) so that

$$\begin{aligned} \lim _{i\rightarrow \infty } \sqrt{\varepsilon _i}/r_i=0. \end{aligned}$$
(5.1)

We define the initial data \((\varphi _{\varepsilon _i})_0\) differently depending on whether \(\Omega ={\mathbb T}^n\) or \({\mathbb R}^n\), as follows.

For \(\Omega ={\mathbb T}^n\), we define

$$\begin{aligned} (\varphi _{\varepsilon _i})_0:=\Psi \left( \frac{\tilde{d}_i(x)}{\varepsilon _i}\right) . \end{aligned}$$
(5.2)

Here and in the following, \(\Psi \) is the solution for \(\Psi ^{\prime \prime }=W^{\prime }(\Psi )\) (and \(\Psi ^{\prime }=\sqrt{2W(\Psi )}\)) with \(\Psi (0)=0\). For \(\Omega ={\mathbb R}^n\), we will truncate the function to be \(-1\) outside of a compact set as follows. Due to the definition, note that for \(x\in {\mathbb R}^n\) with \(\mathrm{dist}(x,\Omega _0^i)\ge 2r_i/3\), we have \({\tilde{d}}_i(x)=-r_i/2\). Choose a sufficiently large \(R>0\) such that

$$\begin{aligned} \{x : \mathrm{dist}(x,\Omega _0^i)\le 2r_i/3\}\subset B_R \end{aligned}$$
(5.3)

for all i. Then we have \({\tilde{d}}_i(x)=-r_i/2\) on \({\mathbb R}^n{\setminus } B_R\). Let \(g:{\mathbb R}^+\rightarrow [0,1]\) be a smooth decreasing function such that \(g(r)=1\) for \(0\le r\le R\), \(g(r)=0\) for \(R+1\le r<\infty \) and \(|g^{\prime }|\le 2\). Define

$$\begin{aligned} (\varphi _{\varepsilon _i})_0 (x):=g(|x|)\Psi \left( \frac{{\tilde{d}}_i (x)}{\varepsilon _i}\right) + g(|x|)-1. \end{aligned}$$
(5.4)

Then \((\varphi _{\varepsilon _i})_0(x)=\Psi \left( \frac{{\tilde{d}}_i(x)}{\varepsilon _i}\right) \) on \(B_R\), and it smoothly changes from \(\Psi (-r_i/2\varepsilon _i)\) to \(-1\) as |x| increases from R to \(R+1\). We may show from \(\Psi ^{\prime }=\sqrt{2W(\Psi )}\) that \(0<\Psi (-r_i/2\varepsilon _i)+1\le c \exp (-c^{\prime }r_i/\varepsilon _i)\) for some positive constants \(c,c^{\prime }\) depending only on W. Thus the difference between \((\varphi _{\varepsilon _i})_0\) and \(-1\) is exponentially small on \(B_{R+1}{\setminus } B_R\) by (5.1), and \((\varphi _{\varepsilon _i})_0(x)=-1\) on \({\mathbb R}^n {\setminus } B_{R+1}\).
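
The exponential bound can be seen from a short ODE comparison, sketched here under the assumptions (standard for a double-well potential, but stated here only for illustration) that \(W(-1)=W^{\prime }(-1)=0\) and \(W^{\prime \prime }\ge \kappa >0\) on \([-1,-\alpha ]\). The point \(z_0\) below is introduced only for this computation. By Taylor's theorem, \(W(s)\ge \frac{\kappa }{2}(s+1)^2\) for \(s\in [-1,-\alpha ]\), hence \(\Psi ^{\prime }=\sqrt{2W(\Psi )}\ge \sqrt{\kappa }\, (\Psi +1)\) wherever \(\Psi \le -\alpha \). Let \(z_0<0\) be the point with \(\Psi (z_0)=-\alpha \). Integrating \((\log (\Psi +1))^{\prime }\ge \sqrt{\kappa }\) over \([z,z_0]\) gives

$$\begin{aligned} 0<\Psi (z)+1\le (1-\alpha )\, e^{-\sqrt{\kappa }\, (z_0-z)}\quad \text {for} \ z\le z_0. \end{aligned}$$

Taking \(z=-r_i/2\varepsilon _i\), which satisfies \(z\le z_0\) for all large i because (5.1) forces \(r_i/\varepsilon _i\rightarrow \infty \) when \(\varepsilon _i\le 1\), yields a bound of the stated form with \(c^{\prime }=\sqrt{\kappa }/2\) and \(c=(1-\alpha )e^{\sqrt{\kappa }|z_0|}\).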

For both cases, one can check that (4.7) is satisfied for \((\varphi _{\varepsilon _i})_0\) with some i-independent , where we may need to take a smaller \(\varepsilon _i\) depending on the growth of \(C^3\) norm of the graph functions representing \(M_0^i\). We fix \(\beta \)

$$\begin{aligned} \beta =\frac{1}{4}, \end{aligned}$$
(5.5)

though any \(0<\beta <1/2\) can be chosen. Using the fact that \(\Psi \) solves \(\Psi ^{\prime }=\sqrt{2W(\Psi )}\) and \(|\nabla \tilde{d}_i|\le 1\), one can check that (4.9) is satisfied for all i. We may also assume that

$$\begin{aligned}&\lim _{i\rightarrow \infty }\int _{\Omega } \Big | \frac{(\varphi _{\varepsilon _i})_0+1}{2}-\chi _{\Omega _0}\Big |\, dx=0,\nonumber \\&\lim _{i\rightarrow \infty }\big (\frac{\varepsilon _i |\nabla (\varphi _{\varepsilon _i})_0|^2}{2}+ \frac{W((\varphi _{\varepsilon _i})_0)}{\varepsilon _i}\big )\, dx=\sigma \Vert \nabla \chi _{\Omega _0}\Vert =\sigma {\mathcal H}^{n-1} \lfloor _{M_0} \end{aligned}$$
(5.6)

where the second identity is in the sense of measure convergence. We may also assume, due to the assumption that \(M_0\) is \(C^1\), that we have some \(D_0\) depending on \(M_0 \) such that D(0) as in (4.5) corresponding to \((\varphi _{\varepsilon _i})_0\) is uniformly bounded by \(D_0\) independent of i.

We next let \(T_i=i\) so that \(\lim _{i\rightarrow \infty }T_i=\infty \), and let \(\{u_{i}\}_{i=1}^{\infty }\) be a sequence of \(C^{\infty }\) vector fields with compact support such that \( \Vert u_{i}-u\Vert _{L^q([0,T_i];(W^{1,p}(\Omega ))^n)}\rightarrow 0\) as \(i\rightarrow \infty \), which can be constructed by the standard density argument. Then for each i we associate j(i) so that (4.10) is satisfied, i.e.,

$$\begin{aligned} \sup _{\Omega \times [0,T_i]}\{|u_{i}|,\, \varepsilon _{j(i)}|\nabla u_{i} |\} \le \varepsilon _{j(i)}^{-\beta } \end{aligned}$$
(5.7)

for all i, and at the same time, \(\varepsilon _{j(i)}<\epsilon _1\) where \(\epsilon _1\) is determined by Theorem 4.1 corresponding to \(D_0\), \(T=T_i\) and . We relabel \(\varepsilon _{j(i)}\) as \(\varepsilon _i\) and \(u_{i}\) as \(u_{\varepsilon _i}\).

With these choices, for each \(i\in {\mathbb N}\), we solve (4.2) and (4.3) on \(\Omega \times [0,T_i]\) with initial data \((\varphi _{\varepsilon _i})_0\) and u replaced by \(u_{\varepsilon _i}\). For \(\Omega ={\mathbb T}^n\), the standard parabolic PDE theory shows the existence of a classical solution, which we denote by \(\varphi _{\varepsilon _i}\). The maximum principle shows (4.6). Due to the choice of \(\varepsilon _i\), for each fixed \(T>0\), we have all the assumptions of Theorem 4.1 satisfied on [0, T] for all sufficiently large i, thus we have (4.13). The same can be said about Theorem 4.2. For \(\Omega ={\mathbb R}^n\) and for each fixed i, we construct the solution by domain approximation. Namely, for each \(k\in {\mathbb N}\) with \(k>3R\) (where R is defined in (5.3)), solve

$$\begin{aligned} \left\{ \begin{array}{l@{\quad }l} \partial _t\varphi +u_{\varepsilon _i}\cdot \nabla \varphi =\Delta \varphi -\frac{W^{\prime }(\varphi )}{\varepsilon _i^2} &{}\, \text{ on } B_k\times [0,T_i], \\ \varphi =(\varphi _{\varepsilon _i})_0 &{}\, \text{ on } B_k\times \{0\}, \\ \varphi =-1 &{}\, \text{ on } \partial B_k\times [0,T_i]. \end{array}\right. \end{aligned}$$
(5.8)

By the standard parabolic existence theory, there exists a classical solution which we denote by \(\varphi _{\varepsilon _i,k}\). By the maximum principle, we have \(-1\le \varphi _{\varepsilon _i,k}<1\). We claim that

$$\begin{aligned} \varphi _{\varepsilon _i,k}(x,t)<\Psi \left( \frac{3R+t\Vert u_{\varepsilon _i}\Vert _{L^{\infty }}-|x|}{\varepsilon _i}\right) =:\psi _{\varepsilon _i}(x,t) \end{aligned}$$
(5.9)

for all k by the maximum principle. To see this, on \(\partial B_k\times [0,T_i]\), we have \(\varphi _{\varepsilon _i,k}(x,t)=-1<\psi _{\varepsilon _i}(x,t)\) by (5.8) and (5.9). On \(B_k\times \{0\}\) where \(\varphi _{\varepsilon _i,k}=(\varphi _{\varepsilon _i})_0\), we may check \(\psi _{\varepsilon _i}>(\varphi _{\varepsilon _i})_0\) as follows. When \(|x|\ge R+1\), \(\psi _{\varepsilon _i}(x,0)>-1=(\varphi _{\varepsilon _i})_0(x)\), and when \(R\le |x|\le R+1\), \((\varphi _{\varepsilon _i})_0(x)\approx -1<\Psi (0)<\psi _{\varepsilon _i}(x,0)\). When \(|x|< R\),

$$\begin{aligned} (\varphi _{\varepsilon _i})_0(x)\le \Psi \left( \frac{{\tilde{d}}_i(x)}{\varepsilon _i}\right) < \Psi \left( \frac{2R}{\varepsilon _i}\right) \le \Psi \left( \frac{3R-|x|}{\varepsilon _i}\right) = \psi _{\varepsilon _i}(x,0) \end{aligned}$$

since \(|{\tilde{d}}_i(x)|\le |d_i(x)|< 2R\) from \(M_0^i\subset B_R\). \(\psi _{\varepsilon _i}\) is a super-solution since, for \(|x|\ne 0\),

$$\begin{aligned}&\partial _t\psi _{\varepsilon _i}+u_{\varepsilon _i}\cdot \nabla \psi _{\varepsilon _i}-\Delta \psi _{\varepsilon _i}+\frac{W^{\prime }(\psi _{\varepsilon _i})}{\varepsilon _i^2}\\&\quad =\frac{\Psi ^{\prime }(\psi _{\varepsilon _i})}{\varepsilon _i}\left( \Vert u_{\varepsilon _i}\Vert _{L^{\infty }} +\frac{n-1}{|x|}-\frac{x}{|x|}\cdot u_{\varepsilon _i}\right) >0. \end{aligned}$$
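
The identity above follows from a direct computation, sketched here with the abbreviation \(z:=(3R+t\Vert u_{\varepsilon _i}\Vert _{L^{\infty }}-|x|)/\varepsilon _i\) (introduced only for this purpose), so that \(\psi _{\varepsilon _i}=\Psi (z)\). For \(|x|\ne 0\),

$$\begin{aligned} \partial _t\psi _{\varepsilon _i}=\frac{\Psi ^{\prime }(z)}{\varepsilon _i}\Vert u_{\varepsilon _i}\Vert _{L^{\infty }},\qquad \nabla \psi _{\varepsilon _i}=-\frac{\Psi ^{\prime }(z)}{\varepsilon _i}\frac{x}{|x|},\qquad \Delta \psi _{\varepsilon _i}=\frac{\Psi ^{\prime \prime }(z)}{\varepsilon _i^2}-\frac{\Psi ^{\prime }(z)}{\varepsilon _i}\frac{n-1}{|x|}. \end{aligned}$$

Since \(\Psi ^{\prime \prime }=W^{\prime }(\Psi )\), the terms of order \(\varepsilon _i^{-2}\) cancel, leaving the factor \(\Psi ^{\prime }(z)/\varepsilon _i\) (with \(\Psi ^{\prime }\) evaluated at z) times the bracket in the display. The bracket is positive because \(\Vert u_{\varepsilon _i}\Vert _{L^{\infty }}-\frac{x}{|x|}\cdot u_{\varepsilon _i}\ge 0\) and \(\frac{n-1}{|x|}>0\) for \(n\ge 2\), while \(\Psi ^{\prime }=\sqrt{2W(\Psi )}>0\) as long as \(|\Psi |<1\).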

We note that \(\varphi _{\varepsilon _i,k}\) cannot touch \(\psi _{\varepsilon _i}\) from below at \(|x|=0\). Thus we may prove (5.9) by the standard argument of the maximum principle. Now let \(k\rightarrow \infty \) and we may prove that \(\varphi _{\varepsilon _i,k}\) converge to a solution \(\varphi _{\varepsilon _i}\) of (4.2) on \({\mathbb R}^n\times [0,T_i]\) satisfying \(-1\le \varphi _{\varepsilon _i}\le \psi _{\varepsilon _i}\). Hence, we have (4.6). Due to (5.9), for each fixed i, we have the exponential approach of \(\varphi _{\varepsilon _i}\) to \(-1\) as \(|x|\rightarrow \infty \), which is (4.8). Thus, in the case of \(\Omega ={\mathbb R}^n\), we have all the assumptions of Theorem 4.1 satisfied and we may obtain the desired conclusion.

We next prove that there exists a family of Radon measures \(\{\mu _t \}_{t\ge 0}\) such that, after choosing a subsequence, \(\mu _t ^{\varepsilon _{i_j}} \rightarrow \mu _t \) as \(j\rightarrow \infty \) for all \(t\ge 0\).

Proposition 5.1

Corresponding to \(T>0\) and \(\phi \in C_c^2 (\Omega ; \mathbb {R}^+)\), there exists depending only on and \(\Vert \phi \Vert _{C^2(\Omega )}\) such that, for all i with \(i>T\) and \(\mu ^{\varepsilon _i}_t\) constructed as above, the function

(5.10)

of t is monotone decreasing on [0, T].

Proof

By (4.2) and integration by parts we have

$$\begin{aligned}&\frac{d}{dt} \mu _t ^{\varepsilon _i} (\phi ) = \int _\Omega -\varepsilon _i \phi \left( \Delta \varphi _{\varepsilon _i} -\frac{W^{\prime }(\varphi _{\varepsilon _i} )}{\varepsilon _i ^2} \right) ^2 -\varepsilon _i \nabla \phi \cdot \nabla \varphi _{\varepsilon _i} \left( \Delta \varphi _{\varepsilon _i} -\frac{W^{\prime }(\varphi _{\varepsilon _i} )}{\varepsilon _i ^2} \right) \nonumber \\&\quad +\,\varepsilon _i \phi \left( \Delta \varphi _{\varepsilon _i} -\frac{W^{\prime }(\varphi _{\varepsilon _i} )}{\varepsilon _i ^2} \right) u_{\varepsilon _i}\cdot \nabla \varphi _{\varepsilon _i} + \varepsilon _i (\nabla \varphi _{\varepsilon _i} \cdot \nabla \phi ) (u_{\varepsilon _i}\cdot \nabla \varphi _{\varepsilon _i} ) \, dx. \end{aligned}$$
(5.11)

By the Cauchy-Schwarz inequality and estimating as in the proof of Lemma 4.4, we have

$$\begin{aligned} \frac{d}{dt} \mu _t^{\varepsilon _i}(\phi )&\le \int _\Omega \varepsilon _i|\nabla \varphi _{\varepsilon _i}|^2 \frac{|\nabla \phi |^2}{\phi } +\varepsilon _i \phi |u_{\varepsilon _i}|^2|\nabla \varphi _{\varepsilon _i}|^2\, dx \nonumber \\&\le 4(\sup \Vert \nabla ^2 \phi \Vert )D(t) + \sup |\phi |c(n,p)D(t) \Vert u_{\varepsilon _i} (\cdot , t) \Vert ^2 _{W^{1,p} (\Omega )}.\qquad \end{aligned}$$
(5.12)
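
The first inequality in (5.12) follows from (5.11) by three applications of Young's inequality \(ab\le \frac{a^2}{2}+\frac{b^2}{2}\). As a sketch, write \(f:=\Delta \varphi -W^{\prime }(\varphi )/\varepsilon ^2\) (an abbreviation introduced only here) and drop the index i. On \(\{\phi >0\}\),

$$\begin{aligned} |\varepsilon \, \nabla \phi \cdot \nabla \varphi \, f|&\le \frac{\varepsilon \phi }{2}f^2+\frac{\varepsilon }{2}|\nabla \varphi |^2\frac{|\nabla \phi |^2}{\phi },\nonumber \\ |\varepsilon \phi \, f\, u\cdot \nabla \varphi |&\le \frac{\varepsilon \phi }{2}f^2+\frac{\varepsilon \phi }{2}|u|^2|\nabla \varphi |^2,\nonumber \\ |\varepsilon (\nabla \varphi \cdot \nabla \phi )(u\cdot \nabla \varphi )|&\le \frac{\varepsilon }{2}|\nabla \varphi |^2\frac{|\nabla \phi |^2}{\phi }+\frac{\varepsilon \phi }{2}|u|^2|\nabla \varphi |^2. \end{aligned}$$

Adding these, the two terms \(\frac{\varepsilon \phi }{2}f^2\) are absorbed by \(-\varepsilon \phi f^2\) in (5.11), and the remainder is exactly the integrand in the first line of (5.12). The second line then uses the elementary bound \(|\nabla \phi |^2\le 2\phi \sup \Vert \nabla ^2\phi \Vert \), valid for non-negative \(\phi \in C_c^2(\Omega )\).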

Thus with a suitable constant independent of i and Theorem 4.1, we have (5.10). \(\square \)

Proposition 5.2

(See [28, 33]) There exist a family of Radon measures \(\{\mu _t\}_{t\ge 0}\) and a subsequence (denoted by the same index) such that for all \(t\ge 0\),

$$\begin{aligned} \lim _{i\rightarrow \infty }\mu _t ^{\varepsilon _{i}}=\mu _t \quad \text {as Radon measures.} \end{aligned}$$

Proof

Fix \(T>0\) and \(\phi \in C_c^2 (\Omega ; \mathbb {R}^+)\). By the Hölder inequality and \(q>2\),

$$\begin{aligned} \int _{t_1} ^{t_2} \Vert u_{\varepsilon _i}(\cdot ,s)\Vert ^2_{W^{1,p}(\Omega )} \, ds \le (t_2 -t_1 )^{\frac{q-2}{q}} \Vert u_{\varepsilon _i} \Vert _{L^q ([0,T]; (W^{1,p} (\Omega ) ) ^n ) } ^2 \end{aligned}$$

for \(0\le t_1<t_2\le T\). Hence the last term of (5.10) is uniformly bounded in the Hölder norm with exponent \(\frac{q-2}{q}\). Thus by the Ascoli-Arzelà compactness theorem, there exists a subsequence which converges uniformly on [0, T]. By the monotone decreasing property due to Proposition 5.1, we can choose a subsequence such that \(\mu ^{\varepsilon _i} _{t} (\phi )\) converges on a co-countable set \(B(\phi )\subset [0,T]\). Choose a countable set \(\{\phi _k \}_{k=1} ^\infty \subset C_c^2 (\Omega ; \mathbb {R} ^+)\) which is dense in \(C_c(\Omega ;\mathbb {R}^+)\). By a similar argument we can choose a subsequence such that \(\mu _t ^{\varepsilon _i} (\phi _k )\) converges on a co-countable set \(B=\cap _{k=1} ^\infty B(\phi _k)\). For any \(k\ge 1\) we define \(\mu _t (\phi _k)= \lim _{i\rightarrow \infty } \mu _t ^{\varepsilon _i} (\phi _k)\) for \(t\in B\). Then we may define \(\mu _t (\phi )= \lim _{i\rightarrow \infty } \mu _t ^{\varepsilon _i} (\phi )\) for any \(\phi \in C_c(\Omega ;\mathbb {R}^+)\) and for any \(t\in B\) since \(\{\phi _k \}_{k=1} ^\infty \) is dense in \(C_c(\Omega ;\mathbb {R}^+)\) and the measures are uniformly bounded. Since \([0,T]{\setminus } B\) is countable, we can choose a subsequence so that \(\mu _t ^{\varepsilon _i} (\phi _k)\) converges on \([0,T]{\setminus } B\) for any k. Thus we have the limit \(\mu _t (\phi )\) for all \(\phi \in C_c(\Omega ; \mathbb {R}^+) \) and for all \(t\in [0,T]\). Now by letting \(T\rightarrow \infty \) and by a diagonal argument, we may choose a subsequence so that \(\mu _t^{\varepsilon _i}(\phi )\) converges for all \(t\ge 0\) and \(\phi \in C_c(\Omega ; \mathbb {R}^+)\). \(\square \)
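
The bound on the time integral of \(\Vert u_{\varepsilon _i}(\cdot ,s)\Vert ^2_{W^{1,p}(\Omega )}\) used in the proof above is an instance of Hölder's inequality in time with the conjugate exponents \(\frac{q}{2}\) and \(\frac{q}{q-2}\). As a sketch,

$$\begin{aligned} \int _{t_1}^{t_2}\Vert u_{\varepsilon _i}(\cdot ,s)\Vert ^2_{W^{1,p}(\Omega )}\, ds&\le \left( \int _{t_1}^{t_2}1\, ds\right) ^{\frac{q-2}{q}}\left( \int _{t_1}^{t_2}\Vert u_{\varepsilon _i}(\cdot ,s)\Vert ^q_{W^{1,p}(\Omega )}\, ds\right) ^{\frac{2}{q}}\nonumber \\&\le (t_2-t_1)^{\frac{q-2}{q}}\Vert u_{\varepsilon _i}\Vert ^2_{L^q([0,T];(W^{1,p}(\Omega ))^n)}, \end{aligned}$$

which is where the Hölder exponent \(\frac{q-2}{q}\) comes from.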

We also denote, after choosing a further subsequence,

Definition 5.1

Let \(\mu \) be a measure on \(\Omega \times [0,\infty )\) such that \(d\mu =\lim _{j\rightarrow \infty }d\mu _{t}^{\varepsilon _j}dt\) locally as measures.

Since \(\sup _{t\in [0,T]}\mu _t^{\varepsilon _j}(\Omega )\) is bounded uniformly in j for all T, the dominated convergence theorem shows \(d\mu =d\mu _t\, dt\). On the other hand, note that \(\mathrm{spt}\,\mu \) may not be the same as \(\cup _{t\ge 0}\mathrm{spt}\,\mu _t\times \{t\}\). In the following section we also use the following notation.

Definition 5.2

Define \((\mathrm{spt}\,\mu )_t\subset \Omega \) as \((\mathrm{spt}\,\mu )_t:=\{x\in \Omega \,:\, (x,t)\in \mathrm{spt}\,\mu \}\).

We have the following inclusion.

Lemma 5.1

For all \(t>0\),

$$\begin{aligned} \mathrm{spt}\, \mu _t\subset (\mathrm{spt}\, \mu )_t. \end{aligned}$$
(5.13)

Proof

Suppose \(x\in \mathrm{spt}\, \mu _{t_0}\) and assume for a contradiction that \((x,t_0)\notin \mathrm{spt}\, \mu \). Then there exists \(r>0\) such that \(\mu (B_r(x)\times (t_0-r^2,t_0+r^2))=0\). Take \(\phi \in C^2_c(B_r(x);{\mathbb R}^+)\) with \(\phi =1\) on \(B_{r/2}(x)\). Since \(x\in \mathrm{spt}\, \mu _{t_0}\), we have \(\mu _{t_0}(\phi )>0\). By Proposition 5.1 and , is monotone decreasing. Thus one sees that for all sufficiently small \(h>0\), we have \(\mu _{t_0-h}(\phi )\ge \mu _{t_0}(\phi )-o(1)\ge \mu _{t_0}(\phi )/2\) where \(o(1)\rightarrow 0\) as \(h\rightarrow 0\). Since \(d\mu =d\mu _t dt\), this contradicts \((x,t_0)\notin \mathrm{spt}\, \mu \). \(\square \)

6 Rectifiability of limit measures

Throughout this section, let \(\varphi _{\varepsilon _i}\), \(\mu _t^{\varepsilon _i}\), \(u_{\varepsilon _i}\), \(\mu _t\) and \(\mu \) be as in Sect. 5 and let \({\tilde{\rho }}_{(y,s)}\), \(e_{\varepsilon _i}\) and \(\xi _{\varepsilon _i}\) be as in (4.15) and (4.16). We fix arbitrary \(T>0\) and let be as in (4.11) with this T. Note that all the estimates in the previous two sections hold in [0, T] for all sufficiently large i (such that \(T_i>T\)). For simplicity we often drop i from these quantities. In this section we prove that for a.e. \(t\ge 0\), there exists a countably \((n-1)\)-rectifiable set \(M_t\) such that \(\mu _t=\theta (x,t) {\mathcal H}^{n-1}\lfloor _{M_t}\), where \(\theta \) is a non-negative \({\mathcal H}^{n-1}\) measurable function. The important ingredient for the proof is the vanishing of the discrepancy measure defined below. As stated in the introduction, the content of this section is based on [28] with some modifications coming from the transport term. First we note

Lemma 6.1

Let \(\varphi _{\varepsilon _i}\) and \(\mu _t^{\varepsilon _i}\) be the sequences constructed in Sect. 5. Then there exist a subsequence (denoted by the same index) and a Radon measure \(|\xi |\) such that

$$\begin{aligned} \lim _{i\rightarrow \infty }\int _{t_0}^{t_1}\int _{\Omega }|\xi _{\varepsilon _i}|\phi \, dxdt=\int _{t_0}^{t_1} \int _{\Omega }\phi \, d|\xi | \end{aligned}$$
(6.1)

for all \(0\le t_0<t_1<\infty \) and \(\phi \in C_c(\Omega \times [0,\infty ))\).

Since \(\sup _{i\in {\mathbb N}}\sup _{t\in [0,T]} \mu _t^{\varepsilon _i}(\Omega )\) is finite for any fixed T, the existence of such a subsequence follows from the weak compactness of measures. Since \(|\xi |\) measures the difference between the two terms in \(\mu _t^{\varepsilon _i}\) in the limit, we may call \(|\xi |\) a discrepancy measure. Unlike \(\mu _t^{\varepsilon _i}\), which converges to \(\mu _t\) for all \(t\ge 0\), note that we do not claim any convergence of \(|\xi _{\varepsilon _i}(\cdot , t)|\, dx\) in general. Instead, we will prove

Theorem 6.1

\(|\xi |=0\) on \(\Omega \times [0,\infty )\).

6.1 Forward density lower bound

Lemma 6.2

There exist \(1>\gamma _1,\,\eta _1>0\) depending only on n, , , p, q, T, W, \(D_0\) and \(1>\eta _2>0\) depending only on n, , W with the following property. Given \(0\le t<s<T/2\) with \(s-t\le \eta _1\), set \(r:=\sqrt{2(s-t)}\) and \(t^{\prime }:=s+r^2/2\). If \(x\in \Omega \) satisfies

$$\begin{aligned} \int _{\Omega } {\tilde{\rho }} _{(y,s)}(x,t) \, d\mu _s (y)<\eta _2, \end{aligned}$$
(6.2)

then \((B_{\gamma _1 r}(x)\times \{t^{\prime }\})\cap \mathrm{spt}\mu =\emptyset \).

Remark 6.1

Note that \(t<s<t^{\prime }<T\) with \(s=\frac{t^{\prime }+t}{2}\). The lemma says that, unless there is at least a certain amount of measure at time s, there is no measure in the neighborhood at the later time \(t^{\prime }\). The monotonicity formula (4.90) plays a crucial role for such a conclusion.

Proof

Assume for a contradiction that \((x^{\prime },t^{\prime })\in \mathrm{spt}\mu \) for some \(x^{\prime }\in B_{\gamma _1 r}(x)\) under the assumption of (6.2), where \(\gamma _1\) will be chosen later. Then there is a sequence \(\{(x_j,t_j)\}_{j=1}^{\infty }\) and \(\{\varepsilon _{i(j)}\}_{j=1}^{\infty }\) such that \(\lim _{j\rightarrow \infty }(x_j,t_j)=(x^{\prime },t^{\prime })\) and \(|\varphi _{\varepsilon _{i(j)}}(x_j,t_j)|< \alpha \) for all j. We relegate its proof to Lemma 6.3. We re-index i(j) as j. Then just as in the proof of (4.45), there exists such that

$$\begin{aligned} 3\eta _{2}&\le \int _{B_{\varepsilon _j}(x_j)} \frac{W(\varphi _{\varepsilon _j}(y,t_j))}{\varepsilon _{j}} \tilde{\rho }_{(x_j, t_j+\varepsilon _j^2)}(y,t_j)\, dy\nonumber \\&\le \int _{\Omega }\tilde{\rho }_{(x_j,t_j+\varepsilon _j^2)} (y,t_j)\, d\mu _{t_j}^{\varepsilon _j}(y). \end{aligned}$$
(6.3)

We use Theorem 4.2. By restricting \(t^{\prime }-s\le \eta _1\) small so that

in (4.90) for all sufficiently large j, we obtain

(6.4)

Letting \(j\rightarrow \infty \), we obtain by (6.3) and (6.4)

$$\begin{aligned} 2\eta _2\le \int _{\Omega } \tilde{\rho }_{(x^{\prime },t^{\prime })}(y,s)\,d \mu _{s}(y). \end{aligned}$$
(6.5)

We next want to change the center of the kernel from \(x^{\prime }\) to x. Fix \(0<\delta <1/2\) so that \(2\delta D_1<\eta _2\). Corresponding to \(\delta \), a direct computation shows that we may choose \(\gamma _1>0\) so that

$$\begin{aligned} \int _{\Omega } \tilde{\rho }_{(x^{\prime },t^{\prime })}(y,s)\, d\mu _s(y)\le \delta D_1+(1+\delta ) \int _{\Omega }\tilde{\rho }_{(x,t^{\prime })}(y,s)\, d\mu _s(y) \end{aligned}$$
(6.6)

if \(|x-x^{\prime }|\le \gamma _1 r\). By the choice of \(\delta \), (6.5) and (6.6) show

$$\begin{aligned} \eta _2\le \int _{\Omega }\tilde{\rho }_{(x,t^{\prime })}(y,s)\, d\mu _s(y). \end{aligned}$$
(6.7)
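
For the reader's convenience, the arithmetic leading from (6.5) and (6.6) to (6.7) may be sketched as follows, using only \(2\delta D_1<\eta _2\) and \(\delta <1/2\). Here \(J:=\int _{\Omega }\tilde{\rho }_{(x,t^{\prime })}(y,s)\, d\mu _s(y)\) is a shorthand introduced only for this remark.

$$\begin{aligned} 2\eta _2\le \delta D_1+(1+\delta )J<\frac{\eta _2}{2}+(1+\delta )J, \qquad \text{hence}\qquad J>\frac{3\eta _2}{2(1+\delta )}>\eta _2. \end{aligned}$$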

Finally, since \(t^{\prime }-s=s-t\), we have \(\tilde{\rho }_{(x,t^{\prime })}(y,s)=\tilde{\rho }_{(y,s)}(x,t)\). This is a contradiction to (6.2). Thus we proved \((x^{\prime },t^{\prime })\notin \mathrm{spt}\mu \). \(\square \)

Lemma 6.3

Assume \((x^{\prime },t^{\prime })\in \mathrm{spt}\mu \). Then there are sequences \(\{(x_j,t_j)\}_{j=1}^{\infty }\) and \(\{\varepsilon _{i(j)}\}_{j=1}^{\infty }\) such that \(\lim _{j\rightarrow \infty }(x_j,t_j)=(x^{\prime },t^{\prime })\) and \(|\varphi _{\varepsilon _{i(j)}}(x_j,t_j)|< \alpha \) for all j.

Proof

If the claim were not true, there would be \(0<r_0<1/2\) such that

$$\begin{aligned} \inf _{B_{r_0}(x^{\prime })\times [t^{\prime }-r_0^2,t^{\prime }+r_0^2]}|\varphi _{\varepsilon _i}|\ge \alpha \end{aligned}$$
(6.8)

for all sufficiently large i. Let \(\phi \in C^{2}_c(B_{r_0}(x^{\prime }))\) be a function such that \(|\nabla \phi |\le 2/r_0\), \(0\le \phi \le 1\) on \(B_{r_0}(x^{\prime })\) and \(\phi =1\) on \(B_{r_0/3}(x^{\prime })\). Then the same computations following (4.60) using (6.8) show

$$\begin{aligned} \frac{d}{dt}\int _{\Omega }\frac{1}{2} |\nabla \varphi _{\varepsilon _i}|^2\phi ^2\, dx\le -\frac{\kappa }{2\varepsilon _i^2}\int _{\Omega }|\nabla \varphi _{\varepsilon _i}|^2\phi ^2\, dx+16 r_0^{-2} \int _{\mathrm{spt}\phi }|\nabla \varphi _{\varepsilon _i}|^2\, dx \end{aligned}$$

for \(t\in [t^{\prime }-r_0^2, t^{\prime }+r_0^2]\). Writing \(M_i:=\sup _{\lambda \in [t^{\prime }-r_0^2,t^{\prime }+r_0^2]}\int _{\mathrm{spt}\phi } \frac{1}{2} |\nabla \varphi _{\varepsilon _i}(x,\lambda )|^2\, dx\), and proceeding similarly as in (4.65), we obtain

$$\begin{aligned} \int _{\Omega }\frac{1}{2} |\nabla \varphi _{\varepsilon _i}(\cdot ,\lambda )|^2\phi ^2\,dx \le \left( e^{-\frac{\kappa }{\varepsilon _i^2}(\lambda -t^{\prime }+r_0^2)}+ \frac{32\varepsilon _i^2}{r_0^2 \kappa }\right) M_i \end{aligned}$$
(6.9)

for \(\lambda \in [t^{\prime }-r_0^2,t^{\prime }+r_0^2]\). Since \(\varepsilon _i M_i\) is uniformly bounded, we see from (6.9) that

$$\begin{aligned} \lim _{i\rightarrow \infty }\sup _{\lambda \in [t^{\prime }-\frac{r_0^2}{2},t^{\prime }+r_0^2]}\int _{\Omega }\frac{\varepsilon _i}{2} |\nabla \varphi _{\varepsilon _i}(\cdot ,\lambda )|^2 \phi ^2\, dx=0. \end{aligned}$$
(6.10)
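
To see how (6.10) follows, one may multiply (6.9) by \(\varepsilon _i\) and restrict to \(\lambda \ge t^{\prime }-r_0^2/2\), where \(\lambda -t^{\prime }+r_0^2\ge r_0^2/2\). A sketch of the resulting bound is

$$\begin{aligned} \sup _{\lambda \in [t^{\prime }-\frac{r_0^2}{2},t^{\prime }+r_0^2]}\int _{\Omega }\frac{\varepsilon _i}{2} |\nabla \varphi _{\varepsilon _i}(\cdot ,\lambda )|^2\phi ^2\, dx \le \left( e^{-\frac{\kappa r_0^2}{2\varepsilon _i^2}}+\frac{32\varepsilon _i^2}{r_0^2\kappa }\right) \varepsilon _i M_i, \end{aligned}$$

and both terms in the parentheses tend to 0 as \(i\rightarrow \infty \) while \(\varepsilon _i M_i\) stays bounded.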

Next, due to (6.8) and the continuity of \(\varphi _{\varepsilon _i}\), we may assume \(1\ge \varphi _{\varepsilon _i}\ge \alpha \) on \(B_{r_0}(x^{\prime })\times [t^{\prime }-r_0^2,t^{\prime }+r_0^2]\) without loss of generality. Otherwise, we have \(-1\le \varphi _{\varepsilon _i}\le -\alpha \) and we may argue similarly. In the following, we use

$$\begin{aligned} W^{\prime }(s)(s-1)\ge (s-1)^2 \kappa \ge c(W)W(s) \end{aligned}$$
(6.11)

for some \(c(W)>0\) if \(s\in [\alpha ,1]\). Multiply the equation (4.2) by \((\varphi _{\varepsilon _i} -1)\phi ^2\) and integrate over \(Q:=\Omega \times [t^{\prime }-r_0^2,t^{\prime }+r_0^2]\). By integration by parts, the Cauchy-Schwarz inequality, \(|\varphi _{\varepsilon _i}-1|\le 1\) and (6.11), one obtains

$$\begin{aligned} c(W)\int _{Q}\phi ^2\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i^2}\, dxdt \le \frac{1}{2} \int _{\Omega } \phi ^2\, dx+\int _{Q}2|\nabla \phi |^2+\frac{1}{2} |u_{\varepsilon _i}|^2\phi ^2\, dxdt.\qquad \end{aligned}$$
(6.12)

Since the right-hand side of (6.12) is uniformly bounded, multiplying (6.12) by \(\varepsilon _i\) and letting \(i\rightarrow \infty \), we obtain

$$\begin{aligned} \lim _{i\rightarrow \infty } \int _{Q}\phi ^2\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i}\, dxdt=0. \end{aligned}$$
(6.13)

The estimates (6.10) and (6.13) show that

$$\begin{aligned} \lim _{i\rightarrow \infty }\int _{t^{\prime }-r_0^2/2}^{t^{\prime }+r_0^2} \mu _t^{\varepsilon _i}(\phi ^2)\, dt=0. \end{aligned}$$
(6.14)
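
The way (6.10) and (6.13) combine may be sketched as follows. We assume here, consistently with the expression used in (6.43), that \(d\mu _t^{\varepsilon _i}=\big (\frac{\varepsilon _i}{2}|\nabla \varphi _{\varepsilon _i}|^2+\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i}\big )\, dx\), and that \(W\ge 0\). Since the time interval has length \(3r_0^2/2\) and is contained in that of Q,

$$\begin{aligned} \int _{t^{\prime }-r_0^2/2}^{t^{\prime }+r_0^2}\mu _t^{\varepsilon _i}(\phi ^2)\, dt \le \frac{3r_0^2}{2}\sup _{\lambda \in [t^{\prime }-\frac{r_0^2}{2},t^{\prime }+r_0^2]}\int _{\Omega }\frac{\varepsilon _i}{2}|\nabla \varphi _{\varepsilon _i}(\cdot ,\lambda )|^2\phi ^2\, dx +\int _{Q}\phi ^2\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i}\, dxdt, \end{aligned}$$

where the first term vanishes in the limit by (6.10) and the second by (6.13).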

By Fatou’s lemma, Proposition and (6.14), we have

$$\begin{aligned} \int _{t^{\prime }-r_0^2/2}^{t^{\prime }+r_0^2} \mu _t(\phi ^2)\, dt=0. \end{aligned}$$
(6.15)

This proves that \((x^{\prime },t^{\prime })\notin \mathrm{spt}\mu \). \(\square \)

Corollary 6.1

Let \(U \subset \Omega \) be open. For \(0< t\le T\), there exists depending only on with the property that

(6.16)

and

(6.17)

Proof

We only need to prove the result for every compact set \(K\subset U\). Set \(X_t = (\mathrm{spt}\,\mu )_t\cap K\). For any \(x\in X_t\), by the same argument leading to (6.5), we have

$$\begin{aligned} 2\eta _2 \le \int _{\Omega } \tilde{\rho } _{(x,t)} (y,t-r^2) \, d\mu _{t-r^2}(y) \end{aligned}$$
(6.18)

for sufficiently small \(r>0\). For \(0<L<1/(2r)\), using the upper density ratio bound, we have

$$\begin{aligned}&\int _{\Omega {\setminus } B_{rL}(x)} \tilde{\rho } _{(x,t)} (y,t-r^2) \, d\mu _{t-r^2}(y)\nonumber \\&\quad \le D_1\omega _{n-1} (\pi )^{-\frac{n-1}{2}} \int _{L^2/4}^{\infty }s^{\frac{n-1}{2}}e^{-s}\, ds. \end{aligned}$$
(6.19)

Thus by choosing sufficiently large L depending only on \(n, D_1\) and \(\eta _2\), (6.18) and (6.19) show

$$\begin{aligned} \eta _2\le \int _{B_{rL}(x)}\tilde{\rho }_{(x,t)}(y,t-r^2)\, d\mu _{t-r^2}(y). \end{aligned}$$
(6.20)

Since \(\tilde{\rho }_{(x,t)}(\cdot ,t-r^2)\le (4\pi )^{-(n-1)/2}r^{-(n-1)}\), from (6.20) we obtain

$$\begin{aligned} (4\pi )^{\frac{n-1}{2}} r^{n-1} \eta _2\le \mu _{t-r^2}(B_{rL}(x)). \end{aligned}$$
(6.21)
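
Explicitly, (6.21) is obtained from (6.20) by bounding the kernel by its supremum over the ball:

$$\begin{aligned} \eta _2\le \int _{B_{rL}(x)}\tilde{\rho }_{(x,t)}(y,t-r^2)\, d\mu _{t-r^2}(y) \le (4\pi )^{-\frac{n-1}{2}}r^{-(n-1)}\, \mu _{t-r^2}(B_{rL}(x)), \end{aligned}$$

and multiplying both sides by \((4\pi )^{\frac{n-1}{2}}r^{n-1}\).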

Let \(\mathcal {B} =\{ \bar{B}_{rL} (x)\subset U \, | \, x\in X_t\}\), which is a covering of \(X_t \) by closed balls centered at \(x\in X_t\). By the Besicovitch covering theorem, there exist finitely many sub-collections \(\mathcal {B}_1,\ldots , \mathcal {B}_{B(n)}\) of \(\mathcal {B}\) such that each \(\mathcal {B}_i\) is a pairwise disjoint family of closed balls and

$$\begin{aligned} X_t \subset \cup _{i=1} ^{B(n)} \cup _{\bar{B}_{rL} (x_j) \in \mathcal {B}_i} \bar{B}_{rL} (x_j). \end{aligned}$$
(6.22)

Let \({\mathcal H}^{n-1}_{\delta }\) be defined as in [41], so that \({\mathcal H}^{n-1}=\lim _{\delta \downarrow 0}{\mathcal H}^{n-1}_{\delta }\). By the definition, (6.21) and (6.22) we obtain

$$\begin{aligned} \mathcal {H}^{n-1}_{2rL}(X_t )\le & {} \sum _{i=1}^{B(n)} \sum _{\bar{B}_{rL} (x_j) \in \mathcal {B}_i}\omega _{n-1} (rL)^{n-1} \\\le & {} \sum _{i=1}^{B(n)} \frac{\omega _{n-1} L^{n-1}}{(4\pi )^{\frac{n-1}{2}} \eta _2 } \sum _{\bar{B}_{rL} (x_j) \in \mathcal {B}_i} \mu _{t- r^2} (B_{rL} (x_j)) \\\le & {} \sum _{i=1}^{B(n)} \frac{\omega _{n-1}L^{n-1}}{(4\pi )^{\frac{n-1}{2}}\eta _2} \mu _{t-r^2} (U) = \frac{\omega _{n-1}L^{n-1}B(n)}{(4\pi )^{\frac{n-1}{2}}\eta _2} \mu _{t-r^2} (U). \end{aligned}$$

By setting to be the constant above and letting \(r\downarrow 0\), we obtain (6.16). The second inequality (6.17) follows immediately from (6.16) and Lemma 5.1. \(\square \)

Lemma 6.4

For \(1\le T<\infty \), let \(\eta _2\) be as in Lemma 6.2 corresponding to T. Define

$$\begin{aligned} Z_T :=\left\{ (x,t)\in ~\mathrm{spt}~\mu \ : \ 0\le t\le T/2,\, \limsup _{s \downarrow t} {\int }_{\Omega } {\tilde{\rho }}_{(y,s)}(x,t) \, d\mu _s (y)\le \eta _2/2 \right\} . \end{aligned}$$

Then we have \(\mu (Z_T)=0\).

Proof

For \(0<\tau \le \eta _1\), where \(\eta _1\) is as in Lemma 6.2, define

$$\begin{aligned} Z^{\tau } \!:=\!\left\{ (x,t)\in ~\mathrm{spt}~ \mu \,: 0\!\le \! t< T/2,\, \int _{\Omega } \tilde{\rho } _{(y,s)} (x,t) \, d\mu _s (y) \!<\! \eta _2, \,\,\forall s\!\in \! (t,t+\tau ] \right\} . \end{aligned}$$

Note that \(Z_T \subset \cup _{m=1}^{\infty }Z^{\tau _m}\) for some \(\{\tau _m\}_{m=1}^{\infty }\) with \(\lim _{m\rightarrow \infty } \tau _m=0\). Hence we only need to prove \(\mu (Z^{\tau })=0\). In the following we fix \(0<\tau \le \eta _1\). For \(0\le t\le T/2\) and \(x\in \Omega \), set

$$\begin{aligned} P_{\tau }(x,t):=\{(x^{\prime },t^{\prime })\,:\, \tau > |t-t^{\prime }|>\gamma _1^{-2}|x-x^{\prime }|^2\}, \end{aligned}$$
(6.23)

where \(\gamma _1\) is as in Lemma 6.2. For \((x,t)\in Z^{\tau }\), we use Lemma 6.2 to prove

$$\begin{aligned} P_{\tau }(x,t)\cap Z^{\tau }=\emptyset . \end{aligned}$$
(6.24)

Suppose for a contradiction that \((x^{\prime },t^{\prime })\in P_{\tau }(x,t)\cap Z^{\tau }\). Suppose first that \(t^{\prime }>t\). Set \(r:=\sqrt{t^{\prime }-t}\) and \(s:=(t^{\prime }+t)/2\) so that \(t^{\prime }=s+r^2/2\). Note that we have \(|x-x^{\prime }|<\gamma _1 r\) by \((x^{\prime },t^{\prime })\in P_{\tau }(x,t)\). Since \(s-t<\tau \le \eta _1\), we may apply Lemma 6.2 to conclude that \((x,t)\in Z^{\tau }\) implies \((x^{\prime },t^{\prime })\notin \mathrm{spt}\mu \), and in particular, \((x^{\prime },t^{\prime })\notin Z^{\tau }\), which is a contradiction. Next suppose that \(t^{\prime }<t\). We interchange the roles of \((x,t)\) and \((x^{\prime },t^{\prime })\) in the previous case, and conclude that \((x^{\prime },t^{\prime })\in Z^{\tau }\) implies \((x,t)\notin Z^{\tau }\), which is again a contradiction. This proves (6.24). Next, for \((x_0,t_0)\in \Omega \times [\tau /2,T/2]\), define

$$\begin{aligned} Z^{\tau ,x_0,t_0}=Z^{\tau }\cap B_{\frac{1}{2}} (x_0) \times (t_0-\tau /2, t_0+\tau /2). \end{aligned}$$
(6.25)

Then \(Z^{\tau }\) can be covered by an at most countable union of \(Z^{\tau ,x_j,t_j}\) with a suitable choice of \(\{(x_j,t_j)\}\). Thus we only need to prove \(\mu (Z^{\tau ,x_0,t_0})=0\). With arbitrary \(0<r\le \gamma _1\sqrt{\tau }\), consider a family of closed balls \(\{\bar{B}_r(x)\}_{(x,t)\in Z^{\tau ,x_0,t_0}}\) and apply the Besicovitch covering theorem. Then we have a finite subfamily \(\bar{B}_r(x_1),\,\ldots ,\bar{B}_r(x_N)\) with \((x_j,t_j)\in Z^{\tau ,x_0,t_0}\) (\(j=1,\ldots ,N\)) and

$$\begin{aligned} \{x\in B_{\frac{1}{2}}(x_0)\,: \, (x,t)\in Z^{\tau ,x_0,t_0}\} \subset \cup _{j=1}^N \bar{B}_r(x_j),\quad N r^n \le 2 B(n) (1/2)^n.\qquad \end{aligned}$$
(6.26)

Note that for each \(j=1,\ldots , N\), by (6.24) and (6.25), we have

$$\begin{aligned} Z^{\tau ,x_0,t_0}\cap \bar{B}_r(x_j)\times (t_0 \!-\!\tau /2,t_0 \!+\!\tau /2) \!\subset \! \bar{B}_r(x_j)\times (t_0 \!-\!\tau /2,t_0 \!+\!\tau /2){\setminus } P_{\tau }(x_j,t_j).\nonumber \\ \end{aligned}$$
(6.27)

The inclusions (6.26) and (6.27) show

$$\begin{aligned} Z^{\tau ,x_0,t_0}\subset \cup _{j=1}^N\bar{B}_r(x_j)\times (t_0-\tau /2,t_0+\tau /2){\setminus } P_{\tau }(x_j,t_j). \end{aligned}$$
(6.28)

Since \(\bar{B}_r(x_j)\times (t_0-\tau /2,t_0+\tau /2){\setminus } P_{\tau }(x_j,t_j)\subset \bar{B}_r(x_j)\times [t_j-\gamma _1^{-2}r^2,t_j+\gamma _1^{-2}r^2]\), from (6.28) we obtain

$$\begin{aligned} Z^{\tau ,x_0,t_0}\subset \cup _{j=1}^N \bar{B}_r(x_j)\times [t_j-\gamma _1^{-2}r^2,t_j+\gamma _1^{-2}r^2]. \end{aligned}$$
(6.29)

Since \(d\mu =d\mu _t dt\), (6.29), (4.13) and (6.26) show

$$\begin{aligned} \mu (Z^{\tau ,x_0,t_0})&\le \sum _{j=1}^N \int _{t_j-\gamma _1^{-2}r^2}^{t_j+\gamma _1^{-2}r^2}\mu _t(\bar{B}_r(x_j))\, dt \le 2\omega _{n-1} D_1 r^{n+1}\gamma _1^{-2}N\nonumber \\&\le 2^{2-n} \omega _{n-1}B(n) D_1 r\gamma _1^{-2}. \end{aligned}$$
(6.30)
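
The constants in (6.30) can be checked as follows; this is a sketch which assumes that (4.13) is the upper density ratio bound \(\mu _t(\bar{B}_r(x_j))\le D_1\omega _{n-1}r^{n-1}\). Each time interval has length \(2\gamma _1^{-2}r^2\), and (6.26) gives \(Nr^n\le 2^{1-n}B(n)\), so that

$$\begin{aligned} N\cdot 2\gamma _1^{-2}r^2\cdot D_1\omega _{n-1}r^{n-1} =2\omega _{n-1}D_1\gamma _1^{-2}\, r\, (Nr^n) \le 2\omega _{n-1}D_1\gamma _1^{-2}\, r\cdot 2^{1-n}B(n) =2^{2-n}\omega _{n-1}B(n)D_1 r\gamma _1^{-2}. \end{aligned}$$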

Since \(0<r\le \gamma _1\sqrt{\tau }\) is arbitrary, (6.30) shows \(\mu (Z^{\tau ,x_0,t_0})=0\). This concludes the proof. \(\square \)

6.2 Vanishing of \(\xi \)

First we remark the following

Lemma 6.5

For \(1\le T<\infty \) there exists depending only on n, , , p, q, T, W, \(D_0\) with the following property. For any \((y,s)\in \Omega \times (0,T)\), we have

(6.31)

Proof

In (4.90), set \(t_0=0\) and \(t_1=s-\epsilon \) for \(0<\epsilon <s\). We simply let \(\varepsilon _i\rightarrow 0\) and we set the supremum of the right-hand side of (4.90) (with no \(\varepsilon \) term) plus \(D_0\) (coming from the left-hand side) to be . Then letting \(\epsilon \rightarrow 0\), we obtain (6.31). \(\square \)

We are ready to prove Theorem 6.1.

Proof

We integrate (6.31) with respect to \(d\mu _s ds\) over \(\Omega \times (0,T)\) and use Fubini’s theorem to obtain

(6.32)

The finiteness of (6.32) shows

$$\begin{aligned} \int _{\Omega \times (t,T)} \frac{\tilde{\rho } _{(y,s)} (x,t)}{s-t}\, d\mu _s (y) ds <\infty \end{aligned}$$
(6.33)

for \(|\xi |\) a.e. \((x,t)\in \Omega \times (0,T)\). Next, we claim that, whenever (6.33) holds at \((x,t)\), we have

$$\begin{aligned} \lim _{s\downarrow t} \int _{\Omega } \tilde{\rho } _{(y,s)} (x,t) \, d\mu _s (y) =0. \end{aligned}$$
(6.34)

We use the monotonicity formula (4.90) for the proof. Set \(\lambda := \log (s-t)\) and

$$\begin{aligned} h(s):= \int _{\Omega } \tilde{\rho } _{(y,s)} (x,t) \, d\mu _s (y). \end{aligned}$$

After the change of variable, (6.33) is equivalent to

$$\begin{aligned} \int _{-\infty } ^{\log (T-t)} h(t+e^\lambda )\, d\lambda <\infty . \end{aligned}$$
(6.35)
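
The change of variable is \(s=t+e^{\lambda }\), for which \(ds=e^{\lambda }\, d\lambda =(s-t)\, d\lambda \), and \(s\in (t,T)\) corresponds to \(\lambda \in (-\infty ,\log (T-t))\). By the definition of h, the left-hand side of (6.33) is therefore

$$\begin{aligned} \int _t^T\frac{h(s)}{s-t}\, ds=\int _{-\infty }^{\log (T-t)}\frac{h(t+e^{\lambda })}{e^{\lambda }}\, e^{\lambda }\, d\lambda =\int _{-\infty }^{\log (T-t)}h(t+e^{\lambda })\, d\lambda . \end{aligned}$$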

We fix \(\theta \in (0,1]\) in the following. Corresponding to this \(\theta \), by (6.35), there exists a decreasing sequence \(\{ \lambda _i \}_{i=1}^\infty \) such that

$$\begin{aligned} \lambda _i \downarrow -\infty , \quad \lambda _i -\lambda _{i+1}\le \theta , \quad h(t+e^{\lambda _i}) \le \theta . \end{aligned}$$
(6.36)
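
A sequence as in (6.36) can be found by a pigeonhole argument. The following is a sketch, with the notation I, \(\Lambda \) and \(J_k\) introduced only for this purpose. Let \(I<\infty \) be the value of the integral in (6.35), put \(\Lambda :=\log (T-t)\) and \(J_k:=(\Lambda -(k+1)\theta /2,\, \Lambda -k\theta /2]\) for \(k=0,1,2,\ldots \). Since \(h\ge 0\) and the intervals \(J_k\) are pairwise disjoint,

$$\begin{aligned} \frac{\theta ^2}{2}\, \#\Big \{k\,:\, h(t+e^{\lambda })>\theta \ \text{ for all } \lambda \in J_k\Big \} \le \sum _{k=0}^{\infty }\int _{J_k}h(t+e^{\lambda })\, d\lambda \le I. \end{aligned}$$

Hence there is \(k_0\) such that every \(J_k\) with \(k\ge k_0\) contains some \(\lambda \) with \(h(t+e^{\lambda })\le \theta \). Choosing \(\lambda _i\in J_{k_0+i}\) with this property gives a strictly decreasing sequence with \(\lambda _i\downarrow -\infty \), and \(\lambda _i-\lambda _{i+1}<\theta \) because both points lie in \(J_{k_0+i}\cup J_{k_0+i+1}\), a half-open interval of length \(\theta \).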

For arbitrary \(\lambda \in (-\infty , \lambda _1 )\), choose i such that \(\lambda \in [\lambda _i , \lambda _{i-1} )\). Then by (4.90) (with \(\varepsilon \rightarrow 0\)) applied with \(t_0=t+e^{\lambda _i} < t_1=t+e^{\lambda }\), we have

$$\begin{aligned} h(t+e^\lambda )&= \int _{\Omega } \tilde{\rho } _{(y,t+e^\lambda )} (x,t) \, d\mu _{t+e^\lambda } (y) = \int _{\Omega } \tilde{\rho } _{(y,t+2 e^\lambda )} (x,t+e^\lambda ) \, d\mu _{t+e ^\lambda } (y)\nonumber \\&\le \int _{\Omega } \tilde{\rho } _{(y,t+2 e^\lambda )} (x,t+e^{\lambda _i} ) \, d\mu _{t+e ^{\lambda _i} } (y) +o(1) \end{aligned}$$
(6.37)

where \(\lim _{\theta \rightarrow 0}o(1)=0\). On the other hand, by (6.36) we have

$$\begin{aligned} \int _{\Omega } \tilde{\rho } _{(y,t+e^{\lambda _i})} (x,t) \, d\mu _{t+e^{\lambda _i}} (y)= h(t+e^{\lambda _i}) \le \theta . \end{aligned}$$
(6.38)

By direct calculation,

$$\begin{aligned}&\int _{\Omega }\tilde{\rho }_{(y,t+2e^{\lambda })}(x,t+e^{\lambda _i})\, d\mu _{t+e^{\lambda _i}}(y) \nonumber \\&\quad \le o(1)+\int _{B_{M\sqrt{2e^{\lambda }-e^{\lambda _i}}}(y)} \tilde{\rho }_{(y,t+2e^{\lambda })}(x,t+e^{\lambda _i})\, d\mu _{t+e^{\lambda _i}}(y) \end{aligned}$$
(6.39)

where \(\lim _{M\rightarrow \infty }o(1)=0\) and the convergence does not depend on \(\theta \). For any fixed M, we have

$$\begin{aligned}&\sup _{x\in B_{M\sqrt{2e^{\lambda }-e^{\lambda _i}}}(y)} \tilde{\rho }_{(y,t+2e^{\lambda })}(x,t+e^{\lambda _i})/ \tilde{\rho }_{(y,t+e^{\lambda _i})}(x,t) \nonumber \\&\quad \le \exp \big (M^2(e^{\lambda -\lambda _i}-1)/2\big )\le 1+o(1) \end{aligned}$$
(6.40)

where \(\lim _{\theta \rightarrow 0}o(1)=0\). The inequalities (6.37)–(6.40) show that \(h(t+e^{\lambda })\) is made arbitrarily small for all \(\lambda <\lambda _1\) and prove (6.34). Finally define \(a(x,t):=\limsup _{s\downarrow t}\int _{\Omega }\tilde{\rho }_{(y,s)}(x,t)\, d\mu _s (y)\) and note that \(\Omega \times (0,T)\) may be split into two disjoint sets

$$\begin{aligned} A\cup B:=\{(x,t)\,:\, a(x,t)=0\} \cup \{(x,t)\,:\, a(x,t)>0\}. \end{aligned}$$

The claim (6.34) proved \(|\xi |(B)=0\). On the other hand, by Lemma 6.4 we have \(\mu (A)=0\). Since \(|\xi |\le \mu \) by definition, this proves \(|\xi |(\Omega \times (0,T))=0\). Since \(T>0\) is arbitrary, we have \(|\xi |(\Omega \times (0,\infty ))=0\). \(\square \)

6.3 Associated varifolds and rectifiability theorem

We have so far obtained \(\mu _t\) as a limit of Radon measures \(\{\mu _{t}^{\varepsilon _i}\}_{i=1}^{\infty }\). To prove the rectifiability of \(\mu _t\) for a.e. \(t\ge 0\), we now consider a sequence of varifolds which are naturally associated with \(\{\mu _t^{\varepsilon _i}\}_{i=1}^{\infty }\).

Definition 6.1

For \(\varphi _{\varepsilon _i}(\cdot ,t)\), we define \(V_t^{\varepsilon _i}\in \mathbf{V}_{n-1}(\Omega )\) as follows. For \(\phi \in C_c(G_{n-1}(\Omega ))\),

$$\begin{aligned} V_t^{\varepsilon _i}(\phi ):=\int _{\Omega \cap \{|\nabla \varphi _{\varepsilon _i}(x,t)|\ne 0\}} \phi \left( x,I-\frac{\nabla \varphi _{\varepsilon _i}(x,t)}{|\nabla \varphi _{\varepsilon _i}(x,t)|}\otimes \frac{\nabla \varphi _{\varepsilon _i}(x,t)}{|\nabla \varphi _{\varepsilon _i}(x,t)|}\right) \, d\mu _t^{\varepsilon _i}(x).\nonumber \\ \end{aligned}$$
(6.41)

Lemma 6.6

For \(g=(g_1,\ldots ,g_n)\in C_c^1(\Omega ;{\mathbb R}^n)\), we have

$$\begin{aligned} \delta V_t^{\varepsilon _i}(g)&= \int _{\Omega }(g\cdot \nabla \varphi _{\varepsilon _i}) \left( \varepsilon _i\Delta \varphi _{\varepsilon _i}-\frac{W^{\prime }(\varphi _{\varepsilon _i})}{\varepsilon _i}\right) \, dx\nonumber \\&\quad -\int _{\Omega \cap \{|\nabla \varphi _{\varepsilon _i}|=0\}}\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i}\, I\cdot \nabla g\, dx\nonumber \\&\quad +\int _{\Omega \cap \{|\nabla \varphi _{\varepsilon _i}|\ne 0\}} \nabla g\cdot \left( \frac{\nabla \varphi _{\varepsilon _i}}{|\nabla \varphi _{\varepsilon _i}|}\otimes \frac{\nabla \varphi _{\varepsilon _i}}{|\nabla \varphi _{\varepsilon _i}|}\right) \xi _{\varepsilon _i}\,dx. \end{aligned}$$
(6.42)

Proof

We omit i in the following. The first variation of \(V_t^{\varepsilon }\) with respect to g is

$$\begin{aligned} \delta V_t^{\varepsilon }(g)= & {} \int _{G_{n-1}(\Omega )} \nabla g(x)\cdot S\, dV_t^{\varepsilon }(x,S)\nonumber \\= & {} \int _{\Omega \cap \{|\nabla \varphi _{\varepsilon }|\ne 0\}}\nabla g \cdot \left( I-\frac{\nabla \varphi _{\varepsilon }}{|\nabla \varphi _{\varepsilon }|}\otimes \frac{\nabla \varphi _{\varepsilon }}{|\nabla \varphi _{\varepsilon }|}\right) \left( \frac{\varepsilon }{2}|\nabla \varphi _{\varepsilon }|^2+\frac{W}{\varepsilon }\right) \, dx.\nonumber \\ \end{aligned}$$
(6.43)

By repeated integration by parts, we have

$$\begin{aligned} \int _{\Omega \cap \{|\nabla \varphi _{\varepsilon }|\ne 0\}} \nabla g\cdot I\, \frac{\varepsilon }{2}|\nabla \varphi _{\varepsilon }|^2\, dx&=-\varepsilon \int _{\Omega }\sum _{j,l=1}^n g_j(\varphi _{\varepsilon })_{x_j x_l} (\varphi _{\varepsilon })_{x_l}\, dx \nonumber \\&=\varepsilon \int _{\Omega } \nabla g\cdot ( \nabla \varphi _{\varepsilon }\otimes \nabla \varphi _{\varepsilon })+(g\cdot \nabla \varphi _{\varepsilon })\Delta \varphi _{\varepsilon }\, dx.\nonumber \\ \end{aligned}$$
(6.44)
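
In index notation, the two integrations by parts in (6.44) read as follows (a sketch, writing \(\nabla g\cdot I=\mathrm{div}\, g\) and using that g has compact support). The restriction to \(\{|\nabla \varphi _{\varepsilon }|\ne 0\}\) on the left-hand side may be dropped since the integrand vanishes elsewhere.

$$\begin{aligned} \int _{\Omega }\sum _{j=1}^n (g_j)_{x_j}\, \frac{\varepsilon }{2}|\nabla \varphi _{\varepsilon }|^2\, dx&=-\varepsilon \int _{\Omega }\sum _{j,l=1}^n g_j(\varphi _{\varepsilon })_{x_l x_j}(\varphi _{\varepsilon })_{x_l}\, dx\\&=\varepsilon \int _{\Omega }\sum _{j,l=1}^n \big ( g_j(\varphi _{\varepsilon })_{x_l}\big )_{x_l}(\varphi _{\varepsilon })_{x_j}\, dx\\&=\varepsilon \int _{\Omega }\sum _{j,l=1}^n (g_j)_{x_l}(\varphi _{\varepsilon })_{x_l}(\varphi _{\varepsilon })_{x_j} +g_j(\varphi _{\varepsilon })_{x_j}(\varphi _{\varepsilon })_{x_l x_l}\, dx. \end{aligned}$$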

Also by integration by parts,

$$\begin{aligned} \int _{\Omega \cap \{|\nabla \varphi _{\varepsilon }|\ne 0\}} \nabla g\cdot I\, \frac{W}{\varepsilon }\, dx= & {} -\int _{\Omega \cap \{|\nabla \varphi _{\varepsilon }|= 0\}} \nabla g\cdot I\, \frac{W}{\varepsilon }\, dx\nonumber \\&-\int _{\Omega }(g\cdot \nabla \varphi _{\varepsilon })\frac{W^{\prime }}{\varepsilon }\, dx. \end{aligned}$$
(6.45)
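
Two elementary facts enter here and in the substitution that follows. The first is \(\nabla (W(\varphi _{\varepsilon }))=W^{\prime }(\varphi _{\varepsilon })\nabla \varphi _{\varepsilon }\), which gives (6.45) after splitting \(\Omega \) into \(\{|\nabla \varphi _{\varepsilon }|\ne 0\}\) and \(\{|\nabla \varphi _{\varepsilon }|=0\}\):

$$\begin{aligned} \int _{\Omega }\nabla g\cdot I\, \frac{W}{\varepsilon }\, dx=-\int _{\Omega }(g\cdot \nabla \varphi _{\varepsilon })\frac{W^{\prime }}{\varepsilon }\, dx. \end{aligned}$$

The second is the pointwise identity on \(\{|\nabla \varphi _{\varepsilon }|\ne 0\}\), with \(\nu :=\nabla \varphi _{\varepsilon }/|\nabla \varphi _{\varepsilon }|\) and assuming, as we read (4.16), that \(\xi _{\varepsilon }=\frac{\varepsilon }{2}|\nabla \varphi _{\varepsilon }|^2-\frac{W}{\varepsilon }\):

$$\begin{aligned} \varepsilon \nabla \varphi _{\varepsilon }\otimes \nabla \varphi _{\varepsilon } -(\nu \otimes \nu )\left( \frac{\varepsilon }{2}|\nabla \varphi _{\varepsilon }|^2+\frac{W}{\varepsilon }\right) =(\nu \otimes \nu )\left( \frac{\varepsilon }{2}|\nabla \varphi _{\varepsilon }|^2-\frac{W}{\varepsilon }\right) =(\nu \otimes \nu )\, \xi _{\varepsilon }. \end{aligned}$$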

Now substituting (6.44) and (6.45) into (6.43), we obtain (6.42). \(\square \)

Proposition 6.1

For a.e. \(t\ge 0\), \(\mu _t\) is rectifiable, and any convergent subsequence \(\{V_t^{\varepsilon _{i_j}}\}_{j=1}^{\infty }\) with

$$\begin{aligned} \liminf _{j\rightarrow \infty } \int _{\Omega } \varepsilon _{i_j}\left( \Delta \varphi _{\varepsilon _{i_j}}(x,t)-\frac{W^{\prime }(\varphi _{\varepsilon _{i_j}} (x,t))}{\varepsilon _{i_j}^2}\right) ^2\, dx<\infty \end{aligned}$$
(6.46)

converges to the unique varifold associated with \(\mu _t\).

Proof

By Theorem 6.1 and by the dominated convergence theorem, we have

$$\begin{aligned} \lim _{i\rightarrow \infty } \int _{\Omega }|\xi _{\varepsilon _{i}}(\cdot ,t)|\, dx=0 \end{aligned}$$
(6.47)

for the full sequence and for a.e. \(t\ge 0\). By Lemma 4.4, we see that

$$\begin{aligned} \int _0^T\int _{\Omega } \varepsilon _i \left( \Delta \varphi _{\varepsilon _i}-\frac{W^{\prime }}{\varepsilon ^2_i}\right) ^2\, dxdt\le 2E_0. \end{aligned}$$

Thus, by Fatou’s lemma, we have

$$\begin{aligned} \liminf _{i\rightarrow \infty }\int _{\Omega } \varepsilon _i \left( \Delta \varphi _{\varepsilon _i}(x,t)-\frac{W^{\prime }(\varphi _{\varepsilon _i}(x,t))}{\varepsilon ^2_i}\right) ^2\, dx<\infty \end{aligned}$$
(6.48)

for a.e. \(t\ge 0\). Suppose \(t\ge 0\) satisfies both (6.47) and (6.48). Since \(\Vert V_t^{\varepsilon _i}\Vert (\Omega ) \le \mu _t^{\varepsilon _i}(\Omega )\) is uniformly bounded in i, by the weak compactness theorem for measures, there exists a convergent subsequence \(\{V_t^{\varepsilon _{i_j}}\}_{j=1}^{\infty }\) which satisfies (6.46) and which converges to a varifold \(V_t\). Due to Proposition and (6.47), we have

$$\begin{aligned} \Vert V_t\Vert =\mu _t. \end{aligned}$$
(6.49)

Next, a standard measure theoretic argument (see for example [41, 3.2(2)]) shows

$$\begin{aligned} \mu _t\left( \left\{ x\in \mathrm{spt}\, \mu _t\,:\, \limsup _{r\downarrow 0}\frac{\mu _t(B_r(x))}{\omega _{n-1}r^{n-1}}\le s\right\} \right) \le 2^{n-1}s{\mathcal H}^{n-1}(\mathrm{spt}\, \mu _t) \end{aligned}$$
(6.50)

for any \(s> 0\). By (6.17), \({\mathcal H}^{n-1}(\mathrm{spt}\, \mu _t)<\infty \), thus (6.50) shows

$$\begin{aligned} \mu _t(\{x\in \mathrm{spt}\, \mu _t\,:\, \lim _{r\downarrow 0}r^{1-n}\mu _t(B_r(x))=0\})=0. \end{aligned}$$
(6.51)

The two equalities (6.49) and (6.51) show that

$$\begin{aligned} V_t=V_t\lfloor _{\{x\in \Omega \, :\, \limsup _{r \downarrow 0}r^{1-n} \Vert V_t\Vert (B_r(x))>0\}\times \mathbf{G}(n,n-1)}. \end{aligned}$$
(6.52)

Next we use (6.42). For any fixed \(g\in C^1_c(\Omega ;{\mathbb R}^n)\), (6.47) shows that the limits of the last two terms of (6.42) are both 0. Thus we have

$$\begin{aligned} \lim _{j\rightarrow \infty } |\delta V_t^{\varepsilon _{i_j}}(g)|&\le \liminf _{j\rightarrow \infty } \left( \int _{\Omega }\varepsilon _{i_j} |\nabla \varphi _{\varepsilon _{i_j}}|^2\, dx\right) ^{1/2}\nonumber \\&\quad \times \left( \int _{\Omega } \varepsilon _{i_j} \left( \Delta \varphi _{\varepsilon _{i_j}}-\frac{W^{\prime }}{\varepsilon _{i_j}^2}\right) ^2\, dx\right) ^{1/2} \end{aligned}$$
(6.53)

for g with \(\sup \, |g|\le 1\). Since the right-hand side of (6.53) does not depend on g and since \(\delta V_t^{\varepsilon _{i_j}}(g)\rightarrow \delta V_t(g)\), we have

$$\begin{aligned} \sup _{g\in C^1_c(\Omega ;{\mathbb R}^n),\ \sup |g|\le 1} |\delta V_t(g)|<\infty \end{aligned}$$

which shows that the total variation \(\Vert \delta V_t\Vert \) is a Radon measure. Allard’s rectifiability theorem [1] shows that the right-hand side of (6.52) is rectifiable, and hence so is \(V_t\). Once we know that \(V_t\) is rectifiable, \(V_t\) is determined uniquely by \(\Vert V_t\Vert =\mu _t\). In particular, this shows that \(\mu _t\) is rectifiable. The argument up to this point is valid for any convergent subsequence with (6.46) and (6.47). On the other hand, note that \(\mu _t\) does not depend on the choice of subsequence \(\{V_t^{\varepsilon _{i_j}}\}_{j=1}^{\infty }\). Since \(\mu _t\) determines \(V_t\) uniquely, any converging subsequence of \(\{V_t^{\varepsilon _i}\}_{i=1}^{\infty }\) with (6.46) and (6.47) has the same limit \(V_t\). This completes the proof. \(\square \)

7 Integrality of limit measures

In this section we prove that the density function of \(\mu _t\) is integer-valued \(\mu _t\) a.e. modulo division by \(\sigma \).

7.1 Separating sheets

We prove in this subsection that, if a set of appropriate quantities is controlled, then we have a lower bound on a measure in terms of a sum of densities of vertically aligned points. As the name of the present subsection indicates, what one carries out in essence is to decompose the domain horizontally so that each separated domain contains approximately one sheet of diffused interface. The original idea comes from [1], and it was first used in the context of the diffused interface problem in [27].

Lemma 7.1

Suppose

  1. (1)

    \(N\in {\mathbb N}\), Y is a finite subset of \(\mathbb {R}^n , 0<R<\infty , 1<M<\infty , 0<a<\infty , 0<\varepsilon <1, 0<\varrho <\infty ,0<E_0 <\infty \) and \(-\infty \le l_1 <l_2 \le \infty \).

  2. (2)

    Y has no more than \(N+1\) elements, and \(Y\subset \{(0,\ldots ,0,x_n)\,:\, l_1+a<x_n<l_2-a\}\). Moreover \(|x-z|>3a\) for \(x,z \in Y \) with \( x\not = z\).

  3. (3)

    \((M+1)\,\mathrm{diam}\,Y<R\), and put \(\tilde{R} := M\, \mathrm{diam}\,Y\).

  4. (4)

    We have \(\varphi \in C^2(\{ y\in \mathbb {R}^n \, : \, \mathrm{dist}(y,Y)<R \})\).

  5. (5)

    For all \(x=(0,\ldots ,0,x_n)\in Y\),

    $$\begin{aligned} \int _a ^R \frac{d\tau }{\tau ^n}\int _{B_\tau (x) \cap \{ y_n =l_j \}} |e_\varepsilon (y_n -x_n)-\varepsilon \varphi _{x_n} (y-x) \cdot \nabla \varphi | \, d\mathcal {H} ^{n-1} (y) \le \varrho \nonumber \\ \end{aligned}$$
    (7.1)

    for \(j=1,2\), where \(e_{\varepsilon }\) is defined as in (4.16).

  6. (6)

    For all \(x\in Y\) and \(a\le r \le R\),

    $$\begin{aligned} \int _{B_r (x)} |\xi _\varepsilon | +(1-(\nu _n)^2) \varepsilon |\nabla \varphi |^2 +\varepsilon |\nabla \varphi |\left| \Delta \varphi -\frac{W^{\prime }(\varphi )}{\varepsilon ^2} \right| \, dy \le \varrho r^{n-1},\nonumber \\ \end{aligned}$$
    (7.2)

    where \(\xi _{\varepsilon }\) is defined as in (4.16) and \(\nu =(\nu _1,\ldots ,\nu _n)=\frac{\nabla \varphi }{|\nabla \varphi |}\).

  7. (7)

    For all \(x\in Y\),

    $$\begin{aligned} \int _a^R \frac{d\tau }{\tau ^n}\int _{B_{\tau }(x)}(\xi _{\varepsilon })_+\, dy\le \varrho . \end{aligned}$$
    (7.3)
  8. (8)

    For all \(x\in Y\) and \(a\le r\le R\),

    $$\begin{aligned} \int _{B_r (x)} \varepsilon |\nabla \varphi |^2 \, dy\le E_0 r^{n-1}. \end{aligned}$$
    (7.4)

Then we have the following:

  1. (A)

    With \(S:=\{x\,:\, l_1<x_n<l_2\}\) and for all \(x\in Y\) and \(a\le r<R\),

    $$\begin{aligned} \frac{1}{r^{n-1}}\int _{B_r(x)\cap S} e_{\varepsilon }\le \frac{1}{R^{n-1}}\int _{B_R(x)\cap S} e_{\varepsilon }+\varrho (3+R). \end{aligned}$$
    (7.5)
  2. (B)

    There exists \(l_3 \in (l_1,l_2)\) such that \(|x_n -l_3|\ge a\) and

    $$\begin{aligned}&\int _a ^{\tilde{R}} \frac{d\tau }{\tau ^n} \int _{B_\tau (x)\cap \{ y_n =l_3 \} } | e_\varepsilon (y_n -x_n )-\varepsilon \varphi _{x_n} (y-x) \cdot \nabla \varphi | \, d\mathcal {H}^{n-1} (y) \nonumber \\&\quad \le 3(N+1)NM \left( \varrho + E_0 ^{\frac{1}{2}} \varrho ^{\frac{1}{2}}\right) \end{aligned}$$
    (7.6)

    for any \(x=(0,\ldots ,0,x_n)\in Y\).

  3. (C)

    Put

    $$\begin{aligned}&\displaystyle Y_1 :=Y \cap \{ x \, : \, l_1 < x_n < l_3 \}, \quad Y_2 :=Y \cap \{ x \, : \, l_3 < x_n < l_2 \},\\&\displaystyle S_0 :=\{ x \, : \, l_1 < x_n < l_2 \ \text {and} \ {\mathrm{dist}} (Y,x) <R \}, \\&\displaystyle S_1 :=\{ x \, : \, l_1 < x_n < l_3 \ \text {and} \ {\mathrm{dist}} (Y_1,x) <{\tilde{R}} \},\\&\displaystyle S_2 :=\{ x \, : \, l_3 < x_n < l_2 \ \text {and} \ {\mathrm{dist}}(Y_2,x) < \tilde{R} \}. \end{aligned}$$

    Then \(Y_1\) and \(Y_2\) are non-empty,

    $$\begin{aligned} \mathrm{diam} Y_j\le \frac{N-1}{N}\mathrm{diam} Y\quad \text{ for } j=1,2 \end{aligned}$$
    (7.7)

    and

    $$\begin{aligned} \frac{1}{\tilde{R} ^{n-1}} \left( \int _{S_1}e_\varepsilon + \int _{S_2}e_\varepsilon \right) \le \left( 1+\frac{1}{M} \right) ^{n-1} \left\{ \frac{1}{R^{n-1}} \int _{S_0} e_\varepsilon +\varrho (3+R) \right\} .\nonumber \\ \end{aligned}$$
    (7.8)

Proof

For any \(x\in Y\), after a parallel translation, assume without loss of generality that \(x=0\) for the proof of \(\text {(A)}\). Let \(\zeta _1 (y)\) be a smooth approximation of the characteristic function \(\chi _{B_r (0)} \), where \(a\le r<R\). Let \(\zeta _2 (y)\) be a smooth approximation to the characteristic function of S which depends only on \(y_n\). Let us denote

$$\begin{aligned} h_{\varepsilon }:=\Delta \varphi -\frac{W^{\prime }(\varphi )}{\varepsilon ^2}. \end{aligned}$$
(7.9)

Multiply (7.9) by \((y\cdot \nabla \varphi ) \zeta _1 \zeta _2\). After integration by parts twice (as in the computation for (6.42)) and letting \(\zeta _1 \rightarrow \chi _{B_r (0)}\), we obtain

$$\begin{aligned} \frac{d}{dr} \left\{ \frac{1}{r^{n-1}}\int _{B_r} e_\varepsilon \zeta _2 \right\}&+\frac{1}{r^n} \int _{B_r} (\xi _\varepsilon + \varepsilon h_{\varepsilon }(y \cdot \nabla \varphi ))\zeta _2 -\frac{ \varepsilon }{r^{n+1}}\int _{\partial B_r} (y \cdot \nabla \varphi )^2 \zeta _2 \nonumber \\&-\frac{1}{r^n}\int _{B_r} \{ e_\varepsilon y_n -\varepsilon \varphi _{x_n} (y \cdot \nabla \varphi ) \}\zeta ^{\prime } _2 =0. \end{aligned}$$
(7.10)

We estimate the integral over \([r,R]\) (\(r\ge a\)) of the second term in (7.10) first. We let \(\zeta _2\rightarrow \chi _{S}\) and compute

$$\begin{aligned}&\int _r ^R \frac{d\tau }{\tau ^n} \int _{B_\tau \cap S} (\xi _\varepsilon +\varepsilon h_{\varepsilon }(y\cdot \nabla \varphi )) \le \int _r ^R \frac{d\tau }{\tau ^n}\left( \int _{B_\tau }(\xi _\varepsilon )_{+} \right) \nonumber \\&\quad + \int _r ^R \frac{d\tau }{\tau ^{n-1}} \left( \int _{B_\tau } \varepsilon |h_{\varepsilon }| |\nabla \varphi | \right) \le (1+R)\varrho \end{aligned}$$
(7.11)

where (7.2) and (7.3) are used. From (7.10), (7.11) and (7.1), we obtain (7.5), proving \(\text {(A)}\). Next, choose \(\tilde{y},\tilde{z} \in Y\) such that \(\tilde{z}_n -\tilde{y}_n \ge \frac{\mathrm{diam}\,Y}{N} \) and \(Y\cap \{ x \, : \, \tilde{y}_n < x_n < \tilde{z}_n \}=\emptyset \). Let \(\tilde{l}_1 = \tilde{y}_n +\frac{\tilde{z}_n -\tilde{y}_n}{3}\) and \(\tilde{l}_2 = \tilde{z}_n -\frac{\tilde{z}_n -\tilde{y}_n}{3}\). To choose an appropriate \(l_3\in (\tilde{l}_1, \tilde{l}_2)\) which satisfies (7.6), we first observe, for \(x\in Y\) and \(y\in B_r (x)\),

$$\begin{aligned} I&:=|e_\varepsilon (y_n -x_n ) -\varepsilon \varphi _{x_n} (y-x) \cdot \nabla \varphi | \nonumber \\&=|(-\xi _\varepsilon ) (y_n-x_n) +\varepsilon |\nabla \varphi |^2 ((y_n-x_n)-\nu _n(y-x)\cdot \nu ) | \nonumber \\&\le |\xi _\varepsilon |r + \varepsilon |\nabla \varphi |^2 r\Big ( 1-(\nu _n)^2 + \sqrt{1-(\nu _n)^2} \Big ). \end{aligned}$$
(7.12)
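
The last inequality in (7.12) may be verified as follows. We write \(y-x=(y^{\prime }-x^{\prime },y_n-x_n)\) and \(\nu =(\nu ^{\prime },\nu _n)\) with \(|\nu ^{\prime }|^2=1-(\nu _n)^2\), where the primed notation for the first \(n-1\) components is introduced only for this remark. Then, using \(|y-x|<r\) and \(|\nu _n|\le 1\),

$$\begin{aligned} |(y_n-x_n)-\nu _n(y-x)\cdot \nu |&=|(y_n-x_n)(1-(\nu _n)^2)-\nu _n(y^{\prime }-x^{\prime })\cdot \nu ^{\prime }|\\&\le r\Big (1-(\nu _n)^2+\sqrt{1-(\nu _n)^2}\Big ). \end{aligned}$$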

Thus by Fubini’s theorem, (7.12), (7.2) and (7.4), we obtain

$$\begin{aligned}&\int _{\tilde{l}_1} ^{\tilde{l}_2} \, dl \int _a^{\tilde{R}} \frac{d\tau }{\tau ^n} \int _{B_\tau (x)\cap \{ y_n =l \}} I \, d\mathcal {H}^{n-1}\nonumber \\&\quad =\int _{a}^{\tilde{R}} \frac{d\tau }{\tau ^n} \int _{B_\tau (x) \cap \{ \tilde{l}_1 <y_n< \tilde{l} _2 \}} I \, dy \le \tilde{R} \left( \varrho + E_0^{\frac{1}{2}} \varrho ^{\frac{1}{2}}\right) . \end{aligned}$$
(7.13)
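
The bound in (7.13) may be sketched as follows. For \(a\le \tau \le \tilde{R}<R\), apply (7.12) with \(r=\tau \), then (7.2) for the first two terms and the Cauchy-Schwarz inequality together with (7.2) and (7.4) for the last one:

$$\begin{aligned} \int _{B_{\tau }(x)}I\, dy&\le \tau \int _{B_{\tau }(x)}|\xi _{\varepsilon }|+(1-(\nu _n)^2)\varepsilon |\nabla \varphi |^2\, dy +\tau \left( \int _{B_{\tau }(x)}(1-(\nu _n)^2)\varepsilon |\nabla \varphi |^2\, dy\right) ^{\frac{1}{2}} \left( \int _{B_{\tau }(x)}\varepsilon |\nabla \varphi |^2\, dy\right) ^{\frac{1}{2}}\\&\le \tau ^n\left( \varrho +E_0^{\frac{1}{2}}\varrho ^{\frac{1}{2}}\right) . \end{aligned}$$

Dividing by \(\tau ^n\) and integrating over \(\tau \in [a,\tilde{R}]\) gives at most \((\tilde{R}-a)\big (\varrho +E_0^{\frac{1}{2}}\varrho ^{\frac{1}{2}}\big )\), which is bounded by the right-hand side of (7.13).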

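The inequality (7.13) holds for each \(x\in Y\). Since Y has at most \(N+1\) elements, summing (7.13) over \(x\in Y\) and averaging over \(l\in (\tilde{l}_1 , \tilde{l}_2)\) shows that there exists \(l_3 \in (\tilde{l}_1 , \tilde{l}_2)\) such that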

$$\begin{aligned} \int _a ^{\tilde{R}} \frac{d\tau }{\tau ^n} \int _{B_\tau (x)\cap \{ y_n =l_3 \}} I \, d\mathcal {H}^{n-1}(y) \le \frac{(N+1)\tilde{R} \left( \varrho + E_0^{\frac{1}{2}} \varrho ^{\frac{1}{2}}\right) }{\tilde{l}_2 - \tilde{l}_1} \end{aligned}$$

for each \(x\in Y\). Since \(\tilde{l}_2 - \tilde{l} _1 \ge \frac{\mathrm{diam}\, Y}{3N}\), we have \(\frac{\tilde{R}}{\tilde{l}_2 - \tilde{l}_1} \le 3MN\), and we obtain \(\text {(B)}\). We have \(S_1 \cup S_2\subset B_{(\tilde{R} +\mathrm{diam}\, Y)} (x) \cap S\) for \(x\in Y\) and \(S_1\cap S_2=\emptyset \). Thus, using also (3) and (7.5) with \(r=\tilde{R}+\mathrm{diam}\, Y<R\), we have

$$\begin{aligned} \frac{1}{\tilde{R} ^{n-1}} \left( \int _{S_1} e_\varepsilon + \int _{S_2} e_\varepsilon \right)&\le \frac{1}{\tilde{R} ^{n-1}} \int _{ B_{(\tilde{R} +\mathrm{diam}\, Y)} (x) \cap S} e_\varepsilon \nonumber \\&\le \left( 1+ \frac{1}{M} \right) ^{n-1} \left\{ \frac{1}{R^{n-1}}\int _{B_R (x) \cap S} e_\varepsilon +\varrho (3+R) \right\} . \end{aligned}$$
(7.14)

Since \(B_R (x) \cap S \subset S_0\), we obtain (7.8). One can check that \(\tilde{z}_n-\tilde{y}_n\ge \frac{\mathrm{diam}\, Y}{N}\) implies (7.7). This proves \(\text {(C)}\). \(\square \)

Proposition 7.1

Corresponding to \(0<R<\infty , \ 0<E_0<\infty , \ 0<s<1\) and \(N\in {\mathbb N}\), there exists \(0<\varrho <1\) with the following property: Assume \(Y\subset \mathbb {R}^n\) has no more than \(N+1\) elements and \(Y\subset \{(0,\ldots ,0,x_n)\,:\, x_n\in {\mathbb R}\}\). Assume further that, for some \(0<a<R\), we have \(|y-z|>3a\) for all \(y,\, z\in Y\) with \(y\ne z\), and that \(\mathrm{diam}\, Y \le \varrho R\). In addition we assume (4), (6), (7), (8) of Lemma 7.1. Then we have

$$\begin{aligned} \sum _{x\in Y} \frac{1}{a^{n-1}} \int _{B_a (x)} e_\varepsilon \le s + \frac{1+s}{R^{n-1}} \int _{ \{ x \, : \, \mathrm{dist}\,(Y,x)<R \} } e_\varepsilon . \end{aligned}$$
(7.15)

Proof

Denote the number of elements in Y by \(\# Y\). If \(\# Y=1\), the proof leading to the conclusion \(\text {(A)}\) of Lemma 7.1 (with \(l_1=-\infty \) and \(l_2=+\infty \)) gives (7.15) if \(\varrho (1+R)<s\). Note that M is irrelevant in this case since \(\mathrm{diam}\, Y=0\). If \(1<\# Y\le N+1\), we use Lemma 7.1 inductively. First, we choose \(M>1\) depending only on \(s,\, n,\,N\) so that

$$\begin{aligned} \Big (1+\frac{1}{M}\Big )^{(n-1)N}<1+s\quad \text{ and } \quad \frac{N-1}{N}<\frac{M}{M+1}. \end{aligned}$$
(7.16)
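A choice of M satisfying (7.16) can be made explicit. The following is a sketch of one sufficient condition, not necessarily the choice intended here. Since \(1+x\le e^x\),

$$\begin{aligned} \Big (1+\frac{1}{M}\Big )^{(n-1)N}\le \exp \Big (\frac{(n-1)N}{M}\Big )<1+s \quad \text{ if } \quad M>\frac{(n-1)N}{\log (1+s)}, \end{aligned}$$

and the second inequality in (7.16) is equivalent to \((N-1)(M+1)<NM\), that is, to \(M>N-1\). Hence any \(M>\max \{1,\, N-1,\, (n-1)N/\log (1+s)\}\) satisfies (7.16).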

Suppose \((M+1)\,\mathrm{diam}\, Y<R\). Then all the assumptions of Lemma 7.1 are satisfied, and we obtain \(Y_1\) and \(Y_2\) with the stated estimates. We apply Lemma 7.1 again to both \(Y_1\) and \(Y_2\) with R there replaced by \({\tilde{R}}=M\,\mathrm{diam}\, Y\). Due to (7.7) and (7.16), we have the assumption (3) satisfied:

$$\begin{aligned} (M+1)\,\mathrm{diam}\, Y_j<M\,\mathrm{diam}\, Y \end{aligned}$$

for \(j=1,2\). We have (7.1) with the right-hand side given by the right-hand side of (7.6). For each \(j=1,2\), if \(\# Y_j=1\), then we obtain (7.5) with \(r=a\). Otherwise, we separate \(Y_j\) into two non-empty sets. Each time, all the assumptions of Lemma 7.1 are satisfied. Thus, after \(\# Y -1\) applications of Lemma 7.1, we separate \({\mathbb R}^n\) into \(\# Y\) disjoint horizontal stacks, each having one element of Y. With (7.16), (7.8) and (7.5), we may choose a sufficiently small \(\varrho \) depending only on \(s,\, n,\, N, \, R,\, E_0\) so that (7.15) holds. \(\square \)

7.2 The \(\varepsilon \)-scale estimate

The next proposition is almost identical to the corresponding results in [27] and [45]. It shows that the energy behaves more or less like that of a simple one-dimensional ODE solution if certain quantities are controlled.

Proposition 7.2

Given \(0<s, \, b,\, \beta <1\), and \(1<c<\infty \), there exist and \(1<L<\infty \) (which also depend on n and W) with the following property:

Assume , \(\varphi \in C^2(B_{4\varepsilon L})\) and

$$\begin{aligned} \sup _{B_{4\varepsilon L}} \varepsilon |\nabla \varphi |\le c,\,\,\sup _{x,y\in B_{4\varepsilon L}} \varepsilon ^{\frac{3}{2}}\frac{ |\nabla \varphi (x)-\nabla \varphi (y)|}{|x-y|^{\frac{1}{2}}}\le c, \,\, |\varphi (0)|<1-b, \end{aligned}$$
(7.17)
$$\begin{aligned} \int _{B_{4\varepsilon L} } ( |\xi _\varepsilon | +(1-(\nu _n)^2) \varepsilon |\nabla \varphi |^2 )\,dx \le \varrho (4\varepsilon L)^{n-1} \end{aligned}$$
(7.18)

and

$$\begin{aligned} \sup _{B_{4\varepsilon L}} (\xi _\varepsilon )_{+} \le \varepsilon ^{-\beta }, \end{aligned}$$
(7.19)

where \(\nu \) and \(\xi _{\varepsilon }\) are as in (7.2) and (4.16). Then for \(J:= B_{3\varepsilon L}\cap \{x=(0,\ldots ,0,x_n)\}\),

$$\begin{aligned} \inf _{x\in J}\partial _{x_n}\varphi (x)>0, \ \ (\mathrm{or }~\sup _{x\in J} \partial _{x_n}\varphi (x)<0), \, \mathrm{and }~[-1+b,1-b]\subset \varphi (J).\qquad \end{aligned}$$
(7.20)

We also have

$$\begin{aligned} \left| \sigma - \frac{1}{\omega _{n-1} (L\varepsilon )^{n-1}} \int _{B_{\varepsilon L} } e_\varepsilon \right| \le s. \end{aligned}$$
(7.21)

Proof

Rescale the domain by \(x\mapsto \frac{x}{\varepsilon }\). The rescaled function defined on \(B_{4L}\) is denoted by \(\tilde{\varphi }\). Let \(\Psi :\mathbb {R} \rightarrow (-1,1)\) be the unique solution of the ODE

$$\begin{aligned} \left\{ \begin{array}{ll} \Psi ^{\prime }(t)=\sqrt{2W(\Psi (t))} &{} \text {for} \ t\in \mathbb {R}, \\ \Psi (0)=\tilde{\varphi }(0). &{} \\ \end{array} \right. \end{aligned}$$
(7.22)

We have

$$\begin{aligned} \int _{\mathbb {R}} \frac{1}{2} |\Psi ^{\prime }(t)|^2 \, dt =\int _\mathbb {R} \sqrt{\frac{W(\Psi (t))}{2} } \Psi ^{\prime }(t) \, dt = \int _{-1} ^1 \sqrt{\frac{W(s)}{2} } \, ds =\frac{\sigma }{2}.\qquad \end{aligned}$$
(7.23)
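The identity (7.23) also determines the normalized energy of the one-dimensional profile viewed as a function on \({\mathbb R}^n\). As a sketch, assuming only (7.22) and (7.23), note that \(\frac{1}{2}|\Psi ^{\prime }|^2=W(\Psi )\) by (7.22), so the energy density of \(x\mapsto \Psi (x_n)\) equals \(|\Psi ^{\prime }(x_n)|^2\). Slicing \(B_L\) by the hyperplanes \(\{x_n=t\}\), whose intersections with \(B_L\) are \((n-1)\)-dimensional balls of radius \(\sqrt{L^2-t^2}\), gives

$$\begin{aligned} \frac{1}{\omega _{n-1}L^{n-1}}\int _{B_L}|\Psi ^{\prime }(x_n)|^2\, dx =\int _{-L}^{L}|\Psi ^{\prime }(t)|^2\Big (1-\frac{t^2}{L^2}\Big )^{\frac{n-1}{2}}\, dt \longrightarrow \int _{{\mathbb R}}|\Psi ^{\prime }(t)|^2\, dt=\sigma \end{aligned}$$

as \(L\rightarrow \infty \), by the monotone convergence theorem and (7.23).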

Define \(\hat{\Psi }(x)=\hat{\Psi }(x_1,x_2,\ldots , x_{n}):= \Psi (x_n)\) for \(x\in \mathbb {R}^n\). Using (7.23), it is not difficult to check that \(\lim _{L\rightarrow \infty }\frac{1}{\omega _{n-1}L^{n-1}} \int _{B_L}\left( \frac{|\nabla \hat{\Psi }|^2}{2}+W(\hat{\Psi }) \right) =\sigma \). Thus, depending only on \(n,\, s,\, b,\, W\), we may choose a sufficiently large \(L>0\) such that

$$\begin{aligned} \Big |\sigma - \frac{1}{\omega _{n-1} L^{n-1}} \int _{B_L } \left( \frac{|\nabla \hat{\Psi }|^2}{2}+W(\hat{\Psi }) \right) \Big |\le \frac{s}{2} \end{aligned}$$
(7.24)

whenever \(|\hat{\Psi }(0)|=|\tilde{\varphi }(0)|\le 1-b\). After fixing such L, we next observe that, for a constant \(\tilde{c} =\tilde{c}(W)\),

$$\begin{aligned} \frac{|\nabla \tilde{\varphi }|^2}{2}-\tilde{c} (1\pm \tilde{\varphi })^2\le \frac{|\nabla \tilde{\varphi }|^2}{2}-W(\tilde{\varphi }) = \varepsilon \xi _{\varepsilon } \le \varepsilon (\xi _{\varepsilon })_+ \le \varepsilon ^{1-\beta } \quad \text {on} \ B_{4L} \end{aligned}$$
(7.25)

by (7.19). Some simple ODE argument combined with (7.25) shows that there exist \(0<\tilde{b}<b\) and depending only on \(b, \beta , L, W\) such that, whenever \(|\tilde{\varphi }(0)| \le 1-b\) and , we have \(|\tilde{\varphi }|\le 1-\tilde{b}\) on \(B_{4L}\).

Next, we define \(z:B_{4L}\rightarrow \mathbb {R}\) by \(z(x) = \Psi ^{-1} (\tilde{\varphi }(x))\), where \(\Psi ^{-1}\) is the inverse function of \(\Psi \). By \(\Psi ^{\prime } >0\) and \(|\tilde{\varphi }|\le 1-\tilde{b}\), \(\Psi ^{-1}\) and z are well-defined and

$$\begin{aligned} \Psi ^{\prime }(z(x))\ge \min _{|\tilde{\varphi }|\le 1-\tilde{b}} \sqrt{2W(\tilde{\varphi })} \end{aligned}$$
(7.26)

for \(x\in B_{4L}\). By (7.17), we have \(\Vert \tilde{\varphi }\Vert _{C^{1,\frac{1}{2}}(B_{4L})}\le 2c\). Since \(\Vert \Psi ^{-1}\Vert _{C^2(\{t\,:\, |t|\le 1-\tilde{b}\})}\) is bounded depending only on \(b, \beta , L, W\) due to (7.26), we have

$$\begin{aligned} \Vert z\Vert _{C^{1,\frac{1}{2}}(B_{4L})}\le C(b,\beta ,L,W,c). \end{aligned}$$
(7.27)

We next note that \(\tilde{\varphi }=\Psi \circ z\) and (7.22) give

$$\begin{aligned} \frac{|\nabla \tilde{\varphi }|^2 }{2}-W(\tilde{\varphi })&=\frac{1}{2} (\Psi ^{\prime } (z))^2 (|\nabla z|^2 -1), \nonumber \\ |\nabla \tilde{\varphi }|^2 (1-(\nu _n )^2 )&= (\Psi ^{\prime } (z))^2 (|\nabla z|^2 -(\partial _{x_n} z)^2 ) . \end{aligned}$$
(7.28)

After rescaling (7.18) and using (7.26) and (7.28), we obtain

$$\begin{aligned} \int _{B_{4L}}(||\nabla z|^2-1|+|\nabla z|^2-(\partial _{x_n} z)^2)\, dx \le \max _{|t|\le 1-\tilde{b}}W(t)^{-1} \varrho (4L)^{n-1}. \end{aligned}$$
(7.29)
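The passage from (7.18) to (7.29) can be sketched as follows, assuming that \(\xi _\varepsilon =\frac{\varepsilon |\nabla \varphi |^2}{2}-\frac{W(\varphi )}{\varepsilon }\) (the definition itself is given elsewhere in the paper; this form is consistent with (7.25)). With \(x=\varepsilon {\tilde{x}}\) we have \(\nabla {\tilde{\varphi }}=\varepsilon \nabla \varphi \) and \(dx=\varepsilon ^n\, d{\tilde{x}}\), so by (7.28)

$$\begin{aligned} \int _{B_{4\varepsilon L}} ( |\xi _\varepsilon | +(1-(\nu _n)^2) \varepsilon |\nabla \varphi |^2 )\,dx&=\varepsilon ^{n-1}\int _{B_{4L}}\Big ( \Big |\frac{|\nabla {\tilde{\varphi }}|^2}{2}-W({\tilde{\varphi }})\Big | +(1-(\nu _n)^2)|\nabla {\tilde{\varphi }}|^2\Big )\, d{\tilde{x}} \nonumber \\&=\varepsilon ^{n-1}\int _{B_{4L}}(\Psi ^{\prime }(z))^2\Big ( \frac{1}{2}||\nabla z|^2-1| +|\nabla z|^2-(\partial _{x_n}z)^2\Big )\, d{\tilde{x}}. \end{aligned}$$

Since \((\Psi ^{\prime }(z))^2=2W({\tilde{\varphi }})\ge 2\min _{|t|\le 1-\tilde{b}}W(t)\) and \(|\nabla z|^2-(\partial _{x_n}z)^2\ge 0\), the last integral is bounded below by \(\min _{|t|\le 1-\tilde{b}}W(t)\int _{B_{4L}}(||\nabla z|^2-1|+|\nabla z|^2-(\partial _{x_n}z)^2)\, d{\tilde{x}}\). Dividing (7.18) by \(\varepsilon ^{n-1}\min _{|t|\le 1-\tilde{b}}W(t)\) then gives (7.29).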

For a non-negative function \(f\in C^{\frac{1}{2}}(B_{4L})\), suppose \(\max _{\bar{B}_{3L}} f=f(\hat{x})>0\) for \(\hat{x}\in \bar{B}_{3L}\). Then it is easy to check that \(f(x)\ge f(\hat{x})/2\) as long as \(|x-\hat{x}|\le (f(\hat{x}))^2/(2\Vert f\Vert _{C^{\frac{1}{2}}(B_{4L})})^2=:r\). Then we have

$$\begin{aligned} f(\hat{x})\le \frac{1}{\omega _n r^n}\int _{B_r(\hat{x})}2f\, dx \le \frac{2\cdot 4^n \Vert f\Vert _{C^{\frac{1}{2}}(B_{4L})}^{2n}}{\omega _n (f(\hat{x}))^{2n}} \int _{B_{4L}}f\, dx \end{aligned}$$

and thus we obtain

$$\begin{aligned} (\max _{\bar{B}_{3L}}f)^{2n+1}\le \frac{2\cdot 4^n}{\omega _n} \Vert f\Vert ^{2n}_{C^{\frac{1}{2}} (B_{4L})} \int _{B_{4L}}f\, dx. \end{aligned}$$
(7.30)
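To see why the radius r works and how the exponent in the next estimate arises, here is a short sketch. We write \([f]\) for \(\Vert f\Vert _{C^{\frac{1}{2}}(B_{4L})}\), a shorthand used only in this remark. For \(x\in B_{4L}\) with \(|x-\hat{x}|\le r\),

$$\begin{aligned} f(x)\ge f(\hat{x})-[f]\,|x-\hat{x}|^{\frac{1}{2}}\ge f(\hat{x})-[f]\, r^{\frac{1}{2}} =f(\hat{x})-[f]\,\frac{f(\hat{x})}{2[f]}=\frac{f(\hat{x})}{2}, \end{aligned}$$

which is the claimed lower bound. Taking the \((2n+1)\)-th root in (7.30) then shows that \(\max _{\bar{B}_{3L}}f\) is bounded by a dimensional constant times \(\big ([f]^{2n}\int _{B_{4L}}f\, dx\big )^{\frac{1}{2n+1}}\), so an integral bound of order \(\varrho \) together with a uniform bound on \([f]\) yields a pointwise bound of order \(\varrho ^{\frac{1}{2n+1}}\).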

By (7.27), (7.29) and (7.30), we have

$$\begin{aligned} \max _{\bar{B}_{3L}}(||\nabla z|^2-1|+|\nabla z|^2-(\partial _{x_n}z)^2)\le C(b,\beta ,L,W,c)\varrho ^{\frac{1}{2n+1}}. \end{aligned}$$
(7.31)

Since \(\Psi (0)=\tilde{\varphi }(0)=\Psi (z(0))\), we have \(z(0)=0\). Note that (7.31) for sufficiently small \(\varrho \) shows that \(\nabla z\approx (0,\ldots ,0,\pm 1)\) uniformly on \(B_{3L}\). This shows that \(z\approx x_n\) or \(-x_n\) in \(C^1(B_{3L})\) when \(\varrho \) is small, and in particular, we have (7.20). For the former case, we have \(\tilde{\varphi }(x)=\Psi (z(x))\approx \Psi (x_n) =\hat{\Psi }(x)\), and (7.24) gives (7.21) for sufficiently small \(\varrho \) with the right dependence. In the case of \(-x_n\), we simply note that changing \(\hat{\Psi }\) to \(\Psi (-x_n)\) does not affect the proof. \(\square \)

7.3 Estimate on \(\{ |\varphi _{\varepsilon }| \ge 1-b \}\)

We need to show some uniform smallness of energy on \(\{|\varphi _{\varepsilon }|\ge 1-b\}\) for the final step of this section.

Lemma 7.2

Suppose \(\varphi _{\varepsilon }\) and \(u_{\varepsilon }\) are the solutions for (4.2) constructed in Sect. 5. Given \(0<\delta <T\), there exist and depending only on with the following property. Suppose for \((x_0,t_0)\in \Omega \times (\delta ,T)\) and \(0<\lambda \le 2/3\),

$$\begin{aligned} \varphi _{\varepsilon } (x_0,t_0)<1-\varepsilon ^\lambda \ \ \ (\text{ or } \ \varphi _{\varepsilon } (x_0,t_0)>-1+\varepsilon ^\lambda ), \end{aligned}$$
(7.32)

where \(\lambda \) additionally satisfies

(7.33)

Then

$$\begin{aligned} \inf _{B_{\varepsilon \tilde{r}}(x_0)\times (t_0 -\varepsilon ^2 \tilde{r} ^2,t_0)}\varphi _{\varepsilon } <\alpha \quad \left( \text {resp.} \ \sup _{B_{\varepsilon \tilde{r}}(x_0)\times (t_0 -\varepsilon ^2 \tilde{r} ^2,t_0)}\varphi _{\varepsilon } >-\alpha \right) \end{aligned}$$

if .

Proof

First note that \(B_{\varepsilon \tilde{r}}(x_0)\times (t_0-\varepsilon ^2 {\tilde{r}}^2,t_0) \subset \Omega \times (0,T)\) due to (7.33). Rescale the domain by \(x \mapsto \frac{x-x_0}{\varepsilon }\) and \(t\mapsto \frac{t-t_0}{\varepsilon ^2}\), so that we are concerned with the domain \(B_{\tilde{r}}\times (-{\tilde{r}}^2,0)\). Let \(\tilde{\varphi }_{\varepsilon } (x,t):=\varphi _{\varepsilon } (\varepsilon x+x_0,\varepsilon ^2 t+t_0)\) and \(\tilde{u}_{\varepsilon } (x,t):=u_{\varepsilon } (\varepsilon x+x_0,\varepsilon ^2 t+t_0)\). As a comparison function, we need a function \(\psi \) with the following property

(7.34)

for some . To find such a function, solve \(\Delta \tilde{\psi }=\kappa \tilde{\psi }/ 4\) with \(\tilde{\psi }(0)=1\) on \({\mathbb R}^n\) among radially symmetric functions. One can show that \(\tilde{\psi }\) grows exponentially as \(|x|\rightarrow \infty \) and \(\tilde{\psi }\) achieves its minimum at the origin, thus \(\tilde{\psi }\ge 1\) on \({\mathbb R}^n\) in particular. Then set \(\psi (x,t):= e^{-\kappa t/4} \tilde{\psi }(x)\). With a suitably large depending only on n and \(\kappa \), this \(\psi \) satisfies (7.34). Next set . We choose such \(\tilde{r}\) so that

(7.35)

Under the assumption of (7.32) which is equivalent to

$$\begin{aligned} {\tilde{\varphi }}_{\varepsilon }(0,0)<1-\varepsilon ^{\lambda }, \end{aligned}$$
(7.36)

for a contradiction, assume

$$\begin{aligned} \inf _{B_{\tilde{r}}\times (-\tilde{r}^2,0)}\tilde{\varphi }_{\varepsilon } \ge \alpha . \end{aligned}$$
(7.37)

Define \(\phi _{\varepsilon } :=1-\varepsilon ^\lambda \psi \). By (7.34) we have \(\partial _t \phi _{\varepsilon } = \Delta \phi _{\varepsilon } +\frac{\kappa }{2}(1-\phi _{\varepsilon }) \) on \(\mathbb {R}^n \times (-\infty ,0)\). Furthermore, on the parabolic boundary of \(B_{\tilde{r}}\times (-{\tilde{r}}^2,0)\), by \(\tilde{r}\ge 1\) and (7.34), hence

(7.38)

where (7.35) and (7.37) are used. On the other hand \(\phi _{\varepsilon } (0,0)= 1-\varepsilon ^{\lambda }\psi (0,0)=1-\varepsilon ^\lambda >\tilde{\varphi }_{\varepsilon } (0,0)\) by (7.34) and (7.36). Hence a positive maximum value of \(\phi _{\varepsilon }-{\tilde{\varphi }}_{\varepsilon }\) is achieved at a parabolic interior point \((x^{\prime },t^{\prime }) \in B_{\tilde{r}}\times (-\tilde{r}^2,0]\). We have \(\partial _t (\phi _{\varepsilon }-{\tilde{\varphi }}_{\varepsilon }) -\Delta ( \phi _{\varepsilon }-{\tilde{\varphi }}_{\varepsilon }) \ge 0\) at \((x^{\prime },t^{\prime })\) and \(\phi _{\varepsilon }(x^{\prime },t^{\prime })>{\tilde{\varphi }}_{\varepsilon } (x^{\prime },t^{\prime })\). The latter inequality combined with (7.37) and (3.3) implies \(W^{\prime }({\tilde{\varphi }}_{\varepsilon })<W^{\prime }(\phi _{\varepsilon })\). By substituting the equations satisfied by \(\phi _{\varepsilon }\) and \({\tilde{\varphi }}_{\varepsilon }\) into the former inequality, we obtain

$$\begin{aligned} 0&\le \frac{\kappa }{2}(1-\phi _{\varepsilon })+\varepsilon {\tilde{u}}_{\varepsilon }\cdot \nabla {\tilde{\varphi }}_{\varepsilon }+W^{\prime }({\tilde{\varphi }}_{\varepsilon }) < \frac{\kappa }{2}(1-\phi _{\varepsilon })+\varepsilon ^{\frac{3}{4}}\Vert \nabla {\tilde{\varphi }}_{\varepsilon }\Vert _{L^{\infty }}+W^{\prime }(\phi _{\varepsilon }) \\&\le -\frac{\kappa }{2}(1-\phi _{\varepsilon })+\varepsilon ^{\frac{3}{4}}\Vert \nabla {\tilde{\varphi }}_{\varepsilon }\Vert _{L^{\infty }} \le -\frac{\kappa }{2} \varepsilon ^{\lambda }+\varepsilon ^{\frac{3}{4}}\Vert \nabla {\tilde{\varphi }}_{\varepsilon }\Vert _{L^{\infty }}, \end{aligned}$$

where \(W^{\prime }(\phi _{\varepsilon })\le -\kappa (1-\phi _{\varepsilon })\) follows from (7.37) and (3.3) and \(|{\tilde{u}}_{\varepsilon }|\le \varepsilon ^{-\beta }=\varepsilon ^{-\frac{1}{4}}\) by (5.7) and (5.5). We also used \(\psi \ge {\tilde{\psi }}\ge 1\) in the last inequality. Since \(\Vert \nabla {\tilde{\varphi }}_{\varepsilon }\Vert _{L^{\infty }}\) is bounded uniformly in \(\varepsilon \) (see Lemma 4.1) and \(\lambda \le 2/3<3/4\), for sufficiently small \(\varepsilon \), this is a contradiction. The other case may be proved similarly. \(\square \)
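Two computations in the proof above can be made explicit. The following is a sketch in which C denotes the uniform bound on \(\Vert \nabla {\tilde{\varphi }}_{\varepsilon }\Vert _{L^{\infty }}\) provided by Lemma 4.1 (the letter C is our shorthand for this remark). First, since \(\Delta {\tilde{\psi }}=\kappa {\tilde{\psi }}/4\) and \(\psi (x,t)=e^{-\kappa t/4}{\tilde{\psi }}(x)\),

$$\begin{aligned} \partial _t\psi -\Delta \psi =-\frac{\kappa }{4}\psi -\frac{\kappa }{4}\psi =-\frac{\kappa }{2}\psi , \qquad \text{ hence } \qquad \partial _t\phi _{\varepsilon }-\Delta \phi _{\varepsilon }=\frac{\kappa }{2}\varepsilon ^{\lambda }\psi =\frac{\kappa }{2}(1-\phi _{\varepsilon }). \end{aligned}$$

Second, the chain of inequalities gives \(\frac{\kappa }{2}\varepsilon ^{\lambda }<C\varepsilon ^{\frac{3}{4}}\), so for \(0<\varepsilon <1\) and \(\lambda \le 2/3\),

$$\begin{aligned} \frac{\kappa }{2}<C\varepsilon ^{\frac{3}{4}-\lambda }\le C\varepsilon ^{\frac{3}{4}-\frac{2}{3}}=C\varepsilon ^{\frac{1}{12}}, \end{aligned}$$

which fails as soon as \(\varepsilon \le (\kappa /(2C))^{12}\).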

Lemma 7.3

Under the assumptions of Lemma 7.2, there exist and with the following property. For \(t_0 \in (\delta ,T)\) and \(0<r<1/2\) define

$$\begin{aligned} Z_{r,t_0}:=\left\{ x\in \Omega \, :\, \inf _{B_r(x)\times (t_0-r^2,t_0)} |\varphi _{\varepsilon }|<\alpha \right\} . \end{aligned}$$
(7.39)

If , then

(7.40)

Proof

For \(x_0\in Z_{r,t_0}\), we claim that there exist positive constants and such that

(7.41)

Once (7.41) is proved, the Besicovitch covering theorem and (4.40) prove (7.40) with an appropriate choice of . To prove (7.41), for each \(x_0\in Z_{r,t_0}\) we have \((x^{\prime },t^{\prime })\in B_r(x_0)\times (t_0-r^2,t_0)\) such that \(|\varphi _{\varepsilon }(x^{\prime },t^{\prime })|<\alpha \). Just as in the proof of Lemma 4.5, we have

(7.42)

By (4.90) with \(t_1\) and \(t_0\) there replaced by \(t^{\prime }\) and \(t_0-2r^2\), and restricting r and \(\varepsilon \) appropriately depending on constants appearing in the right-hand side of (4.90), we obtain

(7.43)

The inequalities (7.42) and (7.43) show that

(7.44)

Using the estimate (4.13), we may choose a large depending only on \(D_1\) so that

(7.45)

By (7.44) and (7.45) we obtain

(7.46)

Since \({\tilde{\rho }}_{(x^{\prime },t^{\prime }+\varepsilon ^2)}(x,t_0-2r^2)\le r^{1-n}\) and , by setting to be again , we obtain (7.41). We restricted r to be small, but when r does not satisfy the restriction, we may choose large so that (7.40) holds trivially. \(\square \)

Proposition 7.3

Suppose \(\varphi _{\varepsilon }\) and \(u_{\varepsilon }\) are the solutions for (4.2) constructed in Sect. 5. Given \(0<\delta <T\) and \(0<s<1\), there exist \(0<b<1\) and such that

$$\begin{aligned} \int _{ \{ x\in \Omega \ : \ |\varphi _{\varepsilon }(x,t) |\ge 1-b \}} \frac{W(\varphi _{\varepsilon }(x,t))}{\varepsilon }\, dx\le s \end{aligned}$$
(7.47)

for all \(t\in (\delta ,T)\) if .

Proof

In the following, we restrict \(b>0\) to be small, independently of \(\varepsilon \). Assume that

(7.48)

Choose \(J=J( \varepsilon ,b) \in \mathbb {N}\) such that

$$\begin{aligned} \varepsilon ^{\frac{1}{2^{J+1}}} \in (b,\sqrt{b}]. \end{aligned}$$
(7.49)
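The integer J in (7.49) can be identified explicitly. The following is a sketch, and the resulting smallness condition on \(\varepsilon \) is our own sufficient condition rather than the restriction stated in the text. For \(0<\varepsilon ,\, b<1\), taking logarithms in (7.49) and dividing by \(\log b<0\) gives

$$\begin{aligned} b<\varepsilon ^{\frac{1}{2^{J+1}}}\le \sqrt{b} \quad \Longleftrightarrow \quad 2^{J}\le \frac{\log \varepsilon }{\log b}<2^{J+1}, \end{aligned}$$

so \(J=\lfloor \log _2 (\log \varepsilon /\log b)\rfloor \), and \(J\ge 1\) precisely when \(\varepsilon \le b^2\).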

Restrict \(\varepsilon \) so that and . Note that, with this choice of b and J, we have by (7.49) and (7.48) that

(7.50)

Fix \(t_0 \in (\delta ,T)\) and define

$$\begin{aligned} A_j :=\left\{ x\in \Omega \ : \ 1-\varepsilon ^{\frac{1}{2^{j+1}}} \le |\varphi _{\varepsilon } (x,t_0)| \le 1-\varepsilon ^{\frac{1}{2^{j}}} \right\} \quad \text {for} \ j=1,\dots , J.\qquad \end{aligned}$$
(7.51)

For any point \(x_0\in A_j\), we apply Lemma 7.2 with \(\lambda =\frac{1}{2^j}\). Note that the condition (7.33) is satisfied due to (7.50). Thus setting , we obtain

$$\begin{aligned} \inf _{B_{\varepsilon \tilde{r}}(x_0)\times (t_0 -\varepsilon ^2 \tilde{r} ^2,t_0)}|\varphi _\varepsilon | <\alpha . \end{aligned}$$
(7.52)

With the notation of (7.39), (7.52) shows

(7.53)

and the application of Lemma 7.3 to (7.53) shows

(7.54)

for all \(j=1,\ldots ,J\). On \(A_j\), by \(|\varphi _\varepsilon |\ge 1-\varepsilon ^{\frac{1}{2^{j+1}}}\), we have

$$\begin{aligned} \frac{W(\varphi _\varepsilon )}{\varepsilon } \le \left( \max _{ [-1 , 1]}|W^{\prime \prime }|\right) \cdot \varepsilon ^{-1} \frac{\left( \varepsilon ^{\frac{1}{2^{j+1}}}\right) ^2}{2}\le c(W) \varepsilon ^{2^{-j}-1}. \end{aligned}$$
(7.55)
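The first inequality in (7.55) is a second-order Taylor estimate. As a sketch, assume that \(W(\pm 1)=W^{\prime }(\pm 1)=0\), which is standard for a double-well potential and which we take to be part of the assumptions on W. Expanding W around the nearer of the two points \(\pm 1\), for \(|\varphi _\varepsilon |\le 1\) we have

$$\begin{aligned} W(\varphi _\varepsilon )\le \frac{1}{2}\Big (\max _{[-1,1]}|W^{\prime \prime }|\Big )(1-|\varphi _\varepsilon |)^2, \qquad (1-|\varphi _\varepsilon |)^2\le \left( \varepsilon ^{\frac{1}{2^{j+1}}}\right) ^2=\varepsilon ^{2^{-j}} \quad \text{ on } \ A_j, \end{aligned}$$

which gives (7.55) with \(c(W)=\frac{1}{2}\max _{[-1,1]}|W^{\prime \prime }|\).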

Set \(Y:=\{x\in \Omega \ :\ 1-b \le |\varphi _{\varepsilon }(x,t_0) | \le 1-\sqrt{\varepsilon } \}\). By (7.51) and (7.49), we have

$$\begin{aligned} Y \subset \cup _{j=1} ^J A_j. \end{aligned}$$
(7.56)

Combining (7.54)–(7.56) and setting ,

(7.57)

where we used the fact that \(2^{-x}\varepsilon ^{2^{-x}}\) is monotone increasing for \(x\in [1,J+1]\) as long as \(\log \sqrt{b} \le -1\), and (7.49). We restrict b so that the right-hand side of (7.57) is less than \(s/2\). A similar estimate shows

(7.58)

Recalling that \(|\varphi _{\varepsilon }|\le 1\), we have

$$\begin{aligned} \int _{ \{ 1-\varepsilon ^{\frac{2}{3}} \le | \varphi _{\varepsilon } |\} } \frac{W(\varphi _{\varepsilon } )}{\varepsilon } \le c(W) (\varepsilon ^{\frac{2}{3}})^2 \cdot \frac{1}{\varepsilon } \le c(W)\varepsilon ^{\frac{1}{3}}. \end{aligned}$$
(7.59)

By (7.57)–(7.59) we restrict \(\varepsilon \) depending on s so that we have (7.47). \(\square \)

7.4 Proof of integrality

Finally we prove the integrality of \(\mu _t\).

Theorem 7.1

For a.e. \(t>0\), \(\mu _t=\theta {\mathcal H}^{n-1}\lfloor _{M_t}\), where \(M_t\) is countably \((n-1)\)-rectifiable and \(\theta (x,t) = N(x,t)\sigma \) for some \({\mathcal H}^{n-1}\)-measurable integer-valued function N, for \(\mu _t\) a.e. \(x\in \Omega \).

Proof

By the argument in the proof of Proposition 6.1, for a.e. \(t\ge 0\), we may choose a subsequence \(\{V_t^{\varepsilon _{i_j}}\}_{j=1}^{\infty }\) such that (6.47) and (with the notation of (7.9))

$$\begin{aligned} c_h(t):=\sup _{j} \int _{\Omega }\varepsilon _{i_j}|h_{\varepsilon _{i_j}}\nabla \varphi _{\varepsilon _{i_j}}|(x,t)\, dx<\infty \end{aligned}$$
(7.60)

hold while \(V_t^{\varepsilon _{i_j}}\rightarrow V_t\). Here \(V_t\) is the rectifiable varifold uniquely determined by \(\mu _t\) and recall that \(\mu _t=\Vert V_t\Vert \). In the following we fix any such t and show the claim of the theorem for \(\mu _t\). All functions are evaluated at the same t, and we do not write out the time variable (except for \(\mu _t\) and \(V_t\) with or without \({\varepsilon _i}\)) for simplicity. Moreover, though it is important to note that we are discussing a particular subsequence (or its further subsequence), we denote \(\varepsilon _{i_j}\) by \(\varepsilon _i\) for simplicity.

For any \(m\in {\mathbb N}\), we define

$$\begin{aligned} A_{i,m}:=\big \{x\in \Omega \ :\ \int _{B_r(x)}\varepsilon _i |h_{\varepsilon _i}\nabla \varphi _{\varepsilon _i}|\, dx\le m \mu _t^{\varepsilon _i}(B_r(x))\ \text{ for } \text{ all } 0<r<1/2\big \}.\nonumber \\ \end{aligned}$$
(7.61)

The Besicovitch covering theorem with (7.60) and (7.61) shows that

$$\begin{aligned} \mu _t^{\varepsilon _i}(\Omega {\setminus } A_{i,m})\le \frac{c(n) c_h(t) }{m}. \end{aligned}$$
(7.62)
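The covering argument behind (7.62) can be sketched as follows. Here \(\mathbf{B}(n)\) denotes the Besicovitch constant, a notation introduced only for this remark, and balls and integrals are understood in \(\Omega \) as in (7.61). Each \(x\in \Omega {\setminus } A_{i,m}\) admits a radius \(0<r_x<1/2\) with \(m\mu _t^{\varepsilon _i}(B_{r_x}(x))<\int _{B_{r_x}(x)}\varepsilon _i|h_{\varepsilon _i}\nabla \varphi _{\varepsilon _i}|\). Choosing from these balls a countable subfamily \(\{B_k\}\) that covers \(\Omega {\setminus } A_{i,m}\) with overlap at most \(\mathbf{B}(n)\), we obtain

$$\begin{aligned} \mu _t^{\varepsilon _i}(\Omega {\setminus } A_{i,m})\le \sum _k \mu _t^{\varepsilon _i}(B_k)\le \frac{1}{m}\sum _k\int _{B_k}\varepsilon _i|h_{\varepsilon _i}\nabla \varphi _{\varepsilon _i}| \le \frac{\mathbf{B}(n)}{m}\int _{\Omega }\varepsilon _i|h_{\varepsilon _i}\nabla \varphi _{\varepsilon _i}| \le \frac{\mathbf{B}(n)\, c_h(t)}{m}, \end{aligned}$$

where the last step uses (7.60).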

We then set

$$\begin{aligned} A_m:=\{x\in \Omega \ : \ \text{ there } \text{ exist } x_i\in A_{i,m} \text{ for } \text{ infinitely } \text{ many } i \text{ with } x_i\rightarrow x\}\qquad \end{aligned}$$
(7.63)

and

$$\begin{aligned} A:=\cup _{m=1}^{\infty }A_m. \end{aligned}$$
(7.64)

We claim

$$\begin{aligned} \mu _t(\Omega {\setminus } A)=0. \end{aligned}$$
(7.65)

Otherwise, we would have a compact set \(K\subset \Omega {\setminus } A\) such that \(\mu _t(K)\ge \frac{1}{2} \mu _t(\Omega {\setminus } A)\). For any \(m\in {\mathbb N}\) we have \(K\subset \Omega {\setminus } A_m\) by (7.64). For each point \(x\in K\), by (7.63), there exists a neighborhood of x which does not intersect with \(A_{i,m}\) for all sufficiently large i. Due to the compactness, thus, there exist \(i_0\) and an open set \(O_m\) such that \(K\subset O_m\) and \(O_m\cap A_{i,m}=\emptyset \) for all \(i\ge i_0\). Let \(\phi _m\in C_c(O_m;{\mathbb R}^+)\) such that \(0\le \phi _m\le 1\) and \(\phi _m=1\) on K. Then

$$\begin{aligned} \mu _t(K) \le \int _{\Omega }\phi _m\, d\mu _t&=\lim _{i\rightarrow \infty } \int _{\Omega } \phi _m \, d\mu _t^{\varepsilon _i} = \lim _{i\rightarrow \infty } \int _{\Omega {\setminus } A_{i,m}}\phi _m\, d\mu _t^{\varepsilon _i}\nonumber \\&\le \liminf _{i\rightarrow \infty }\mu _t^{\varepsilon _i}(\Omega {\setminus } A_{i,m}), \end{aligned}$$
(7.66)

where the second equality holds because \(\phi _m\) vanishes on \(A_{i,m}\) for all \(i\ge i_0\). Since the last quantity of (7.66) is at most \(c(n)c_h(t)/m\) by (7.62), and since m is arbitrary, we obtain \(\mu _t(K)=0\). This proves the claim (7.65).

Since \(\mu _t\) is rectifiable, \(\mu _t\) a.e. point x has an approximate tangent space. By (7.65), we may also assume that for \(\mu _t\) a.e. x there exists some \(m\in {\mathbb N}\) such that \(x\in A_m\). We fix any such point, and after a parallel translation, we may assume that \(x=0\). Furthermore, after a rotation, we may assume that the approximate tangent space is \(P:=\{x_n=0\}\). Denote \(\theta :=\lim _{r\downarrow 0}\frac{\Vert V_t\Vert (B_r(x))}{\omega _{n-1}r^{n-1}}\). We will be done if we prove that \(\sigma ^{-1}\theta \in {\mathbb N}\).

For any sequence \(r_i\downarrow 0\), we have \(\lim _{i\rightarrow \infty }(\Phi _{r_i})_{\#} V_t=\theta |P|\), where \(\Phi _{r_i}(x)=\frac{x}{r_i}\) and \((\Phi _{r_i})_{\#}\) is the usual push-forward of varifold. |P| is the unit density varifold naturally derived from P. Since \(0\in A_m\), there exists a subsequence (denoted by the same index) \(x_i\in A_{i,m}\) such that \(\lim _{i\rightarrow \infty }x_i=0\). After choosing a further subsequence, we may assume that

$$\begin{aligned} \lim _{i\rightarrow \infty }(\Phi _{r_i})_{\#} V_t^{\varepsilon _i}=\theta |P|, \end{aligned}$$
(7.67)
$$\begin{aligned} \lim _{i\rightarrow \infty }\frac{x_i}{r_i}=0 \end{aligned}$$
(7.68)

and

$$\begin{aligned} \lim _{i\rightarrow \infty }\frac{\varepsilon _i^{\beta ^{\prime }-\beta }|\log \varepsilon _i|}{r_i^{n-1}}=0. \end{aligned}$$
(7.69)

For such a choice, we also have \(\lim _{i\rightarrow \infty } \frac{\varepsilon _i}{r_i}=0\). Rescale the coordinates by \(\tilde{x}:=\frac{x}{r_i}\) and define \({\tilde{\varepsilon }}_i:=\frac{\varepsilon _i}{r_i}\rightarrow 0\). Define \(\tilde{\varphi }_{{\tilde{\varepsilon }}_i}(\tilde{x}):=\varphi _{\varepsilon _i}(r_i \tilde{x})\). We also define \({\tilde{\xi }}_{{\tilde{\varepsilon }}_i}\) and \({\tilde{h}}_{{\tilde{\varepsilon }}_i}\) as in (4.15) and (7.9) corresponding to \({\tilde{\varepsilon }}_i\) and \({\tilde{\varphi }}_{{\tilde{\varepsilon }}_i}\). Due to (6.47), we may choose a further subsequence so that

$$\begin{aligned} \lim _{i\rightarrow \infty } \int _{B_3} |{\tilde{\xi }}_{{\tilde{\varepsilon }}_i}|\, d{\tilde{x}}=0. \end{aligned}$$
(7.70)

Due to Corollary 4.1 and (7.69), for any \(y\in B_2\) and \(0<r<2\), we have

(7.71)

as \(i\rightarrow \infty \). For \({\tilde{h}}_{{\tilde{\varepsilon }}_i}\), we have

$$\begin{aligned} {\tilde{\varepsilon }}_i \int _{B_3}|{\tilde{h}}_{{\tilde{\varepsilon }}_i}\nabla {\tilde{\varphi }}_{{\tilde{\varepsilon }}_i}|\, d{\tilde{x}}&=\frac{ \varepsilon _i }{r_i^{n-2}}\int _{B_{3r_i}}|h_{\varepsilon _i} \nabla \varphi _{\varepsilon _i}|\, dx \le \frac{m}{r_i^{n-2}}\mu _t^{\varepsilon _i}(B_{4r_i}(x_i)) \nonumber \\&\le m 4^{n-1}\omega _{n-1}D_1 r_i\rightarrow 0 \end{aligned}$$
(7.72)

as \(i\rightarrow \infty \), where we used (7.68), \(x_i\in A_{i,m}\), (7.61) and (4.13). If one defines a varifold \({\tilde{V}}_t^{{\tilde{\varepsilon }}_i}\) corresponding to \({\tilde{\varphi }}_{{\tilde{\varepsilon }}_i}\) as in (6.41), then one can check that \({\tilde{V}}_t^{{\tilde{\varepsilon }}_i}=(\Phi _{r_i})_{\#} V_t^{\varepsilon _i}\). Next we claim

$$\begin{aligned} \int _{B_3} (1-(\nu _n)^2){\tilde{\varepsilon }}_i |\nabla {\tilde{\varphi }}_{{\tilde{\varepsilon }}_i}|^2\, d{\tilde{x}}\rightarrow 0 \end{aligned}$$
(7.73)

as \(i\rightarrow \infty \), where \(\nu =(\nu _1,\ldots ,\nu _n)=\frac{\nabla {\tilde{\varphi }}_{{\tilde{\varepsilon }}_i}}{|\nabla {\tilde{\varphi }}_{{\tilde{\varepsilon }}_i}|}\). Note first that \(G_{n-1}({\mathbb R}^n)\cong {\mathbb S}^{n-1}/\{\pm 1\}\) and a function defined by \(\psi \, : \, \pm \nu \in {\mathbb S}^{n-1}/\{\pm 1\}\longmapsto 1-\nu _n^2\) is continuous. Thus for any \(\phi \in C_c({\mathbb R}^n)\), we have by (7.67)

$$\begin{aligned} {\tilde{V}}_t^{{\tilde{\varepsilon }}_i}(\phi \psi )=\int \phi ({\tilde{x}})(1-(\nu _n)^2)\, d\Vert {\tilde{V}}_t^{{\tilde{\varepsilon }}_i}\Vert ({\tilde{x}}) \rightarrow \theta |P|(\phi \psi ) \end{aligned}$$
(7.74)

and since \(P=\{x_n=0\}\),

$$\begin{aligned} \theta |P|(\phi \psi )=\theta \int _{P} \phi ({\tilde{x}}) \psi ((0,\ldots ,0,\pm 1))\, d{\mathcal H}^{n-1}({\tilde{x}})=0. \end{aligned}$$
(7.75)

In particular, (7.74) and (7.75) prove (7.73). In the following we fix this subsequence and drop the tilde for simplicity.

Let N be the smallest positive integer greater than \(\sigma ^{-1} \theta \), that is,

$$\begin{aligned} \theta \in [(N-1)\sigma , N\sigma ). \end{aligned}$$
(7.76)

Let \(s>0\) be arbitrary. By Proposition 7.3 and (7.70), there exists \(0<b<1\) such that

$$\begin{aligned} \int _{B_3 \cap \{ |\varphi _{\varepsilon _i}|\ge 1-b \}} \left( \frac{\varepsilon _i |\nabla \varphi _{\varepsilon _i}|^2}{2} +\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i} \right) \le s \end{aligned}$$
(7.77)

for all sufficiently large i. Corresponding to s and b as well as c given by Lemma 4.1, by Proposition 7.2, we choose \(\varrho \) and L (with a restriction on \(\varepsilon _i\)). Then with \(R=2\), by Proposition 7.1, we restrict \(\varrho \) further if necessary. We use Proposition 7.1 with \(a=L\varepsilon _i\). For all large i we define

$$\begin{aligned} G_i&:= B_2 \cap \{ |\varphi _{\varepsilon _i}|\le 1-b \} \cap \left\{ x \ : \ \int _{B_r (x)} \varepsilon _i |h_{\varepsilon _i}\nabla \varphi _{\varepsilon _i}| + |\xi _{\varepsilon _i}|\right. \nonumber \\&\quad \left. + (1-(\nu _n)^2 ) \varepsilon _i |\nabla \varphi _{\varepsilon _i}|^2 \le \varrho \, \mu _{t} ^{\varepsilon _i} (B_r (x)) \ \text{ if } \ \varepsilon _i L\le r\le 1 \right\} . \end{aligned}$$
(7.78)

By the Besicovitch covering theorem, we obtain

$$\begin{aligned} \mu _{t} ^{\varepsilon _i} (B_2 \cap \{ |\varphi _{\varepsilon _i}|&\le 1-b \}{\setminus } G_i ) \le \frac{c(n)}{ \varrho } \int _{B_3} \varepsilon _i |h_{\varepsilon _i}\nabla \varphi _{\varepsilon _i}| + |\xi _{\varepsilon _i}|\nonumber \\&\quad + (1-(\nu _n)^2 ) \varepsilon _i |\nabla \varphi _{\varepsilon _i}|^2. \end{aligned}$$
(7.79)

The right-hand side goes to 0 as \(i\rightarrow \infty \) by (7.72), (7.70), (7.73). Next we claim the following lower bound for all sufficiently large i:

$$\begin{aligned} \mu _t^{\varepsilon _i}(B_r(x))\ge (\sigma -2s)\omega _{n-1} r^{n-1} \end{aligned}$$
(7.80)

for all \(L\varepsilon _i\le r\le 1\) and \(x\in G_i\). To see this, first note that the assumptions of Proposition 7.2 are all satisfied due to Lemma 4.1, (7.78) and (4.26). This proves the inequality (7.80) with \(r=L\varepsilon _i\) and with 2s replaced by s. Next the identity (7.10) with \(\zeta _2\equiv 1\), (7.11), (7.71) and (7.78) shows

$$\begin{aligned} \frac{1}{\tau ^{n-1}}\mu _t^{\varepsilon _i}(B_{\tau }(x))\Big |_{\tau =L\varepsilon _i}^r&\ge o(1)-\int _{L\varepsilon _i}^r \varrho \frac{\mu _t^{\varepsilon _i}(B_{\tau }(x))}{\tau ^{n-1}}\, d\tau \nonumber \\&\ge o(1)-\omega _{n-1}D_1 \varrho \end{aligned}$$
(7.81)

after integrating over \([L\varepsilon _i,r]\). We may restrict \(\varrho \) so that \(D_1\varrho <s\). Thus (7.81) gives (7.80) for all sufficiently large i. Since \(\mu _t^{\varepsilon _i}=\Vert V_t^{\varepsilon _i}\Vert \rightarrow \theta {\mathcal H}^{n-1}\lfloor _P\), (7.80) shows that points in \(G_i\) converge uniformly to P as \(i\rightarrow \infty \).

For any \(x\in P \cap B_1\) and \(|l| \le 1-b\), we next prove

$$\begin{aligned} \# (P^{-1} (x) \cap G_i \cap \{ \varphi _{\varepsilon _i}=l \})\le N-1. \end{aligned}$$
(7.82)

If the claim were not true, we could choose N such elements, call the resulting set Y, and apply Proposition 7.1 with \(R=1\), \(\varphi =\varphi _{\varepsilon _i}\) and \(a=L\varepsilon _i\). The property \(|y-z|>3L\varepsilon _i\) holds due to (7.20), \(\mathrm{diam}\, Y\le \varrho \) holds due to the uniform convergence of \(G_i\) to P, and (6), (7) are due respectively to (7.78) and (7.71). Thus all the assumptions of Proposition 7.1 are satisfied and we have

$$\begin{aligned} \sum _{y\in Y}\frac{1}{(L\varepsilon _i)^{n-1}}\mu _t^{\varepsilon _i}(B_{L\varepsilon _i}(y))\le s+(1+s)\mu _t^{\varepsilon _i}(\{z\,:\, \mathrm{dist}\,(Y,z)<1\}) \end{aligned}$$
(7.83)

for all sufficiently large i. Using \(\lim _{i\rightarrow \infty }\mu _t^{\varepsilon _i}(\{z\,:\, \mathrm{dist}\,(Y,z)<1\})=\theta \omega _{n-1}\), \(\#Y=N\) and (7.80), we obtain

$$\begin{aligned} N(\sigma -2s)\omega _{n-1}\le s+(1+s)\theta \omega _{n-1}. \end{aligned}$$
(7.84)

Since \(\sigma N>\theta \) by definition, (7.84) is a contradiction for sufficiently small s depending only on \(\sigma \), \(\theta \) and n. Thus we proved (7.82).
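Explicitly, (7.84) rearranges to

$$\begin{aligned} (N\sigma -\theta )\omega _{n-1}\le s\,\big (1+\theta \omega _{n-1}+2N\omega _{n-1}\big ), \end{aligned}$$

and the left-hand side is a fixed positive number by (7.76), so the inequality fails once \(s<\frac{(N\sigma -\theta )\omega _{n-1}}{1+(\theta +2N)\omega _{n-1}}\). This is a sketch of one admissible threshold; since N is determined by \(\sigma \) and \(\theta \) through (7.76), it is consistent with the stated dependence on \(\sigma \), \(\theta \) and n.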

To conclude the proof, we consider the push-forward of

$$\begin{aligned} {\hat{V}}_t^{\varepsilon _i}:=V_t^{\varepsilon _i}\lfloor _{\{ |x_n|\le 1\}\times \mathbf{G}(n,n-1)} \end{aligned}$$

by P, \(P_{\#} {\hat{V}}_t^{\varepsilon _i}\). For any \(\phi (x,S)\in C_c((P\cap B_2)\times \mathbf{G}(n,n-1))\), we have (for all sufficiently large i)

$$\begin{aligned} P_{\#} {\hat{V}}_t^{\varepsilon _i}(\phi )=\int _{\{|x_n|\le 1\}} \phi (P(x), P) |\Lambda _{n-1}P\circ (I-\nu \otimes \nu )|\, d\mu _t^{\varepsilon _i}. \end{aligned}$$
(7.85)

Here \(\Lambda _{n-1} A\) denotes the Jacobian of \(A\in \mathrm{Hom}({\mathbb R}^n;{\mathbb R}^n)\) ([1]). One can check that \(|\Lambda _{n-1} P\circ (I-\nu \otimes \nu )|=|\nu _n|=\frac{|\partial _{x_n}\varphi _{\varepsilon _i}|}{|\nabla \varphi _{\varepsilon _i}|}\). Due to the varifold convergence (7.67), we have \(P_{\#}{\hat{V}}_t^{\varepsilon _i}\rightarrow P_{\#}(\theta |P|)=\theta |P|\) as \(i\rightarrow \infty \). In the following we also use

$$\begin{aligned} \lim _{i\rightarrow \infty }\int _{B_3}\left| \frac{\varepsilon _i |\nabla \varphi _{\varepsilon _i}|^2}{2}+\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i}-|\nabla \varphi _{\varepsilon _i} | \sqrt{2W(\varphi _{\varepsilon _i})} \right| \, dx=0 \end{aligned}$$
(7.86)

which follows from (7.70). Now we have

$$\begin{aligned} \omega _{n-1} \theta&=\Vert \theta |P| \Vert (B_1)=\lim _{i\rightarrow \infty } \Vert P_{\#} {\hat{V}}_t^{\varepsilon _i}\Vert (B_1)=\lim _{i\rightarrow \infty } \int _{B_1 } |\nu _n|\, d\mu _t^{\varepsilon _i} \nonumber \\&\le \liminf _{i\rightarrow \infty } \int _{B_1 \cap \{ |\varphi _{\varepsilon _i} |\le 1-b \} \cap G_i} |\nu _n|\, d\mu _t^{\varepsilon _i}+2s \nonumber \\&\le \liminf _{i\rightarrow \infty } \int _{B_1\cap \{|\varphi _{\varepsilon _i} |\le 1-b\}\cap G_i}|\nu _n| |\nabla \varphi _{\varepsilon _i}|\sqrt{2W(\varphi _{\varepsilon _i})}\, dx +2s \end{aligned}$$
(7.87)

due to (7.77), (7.79) and (7.86). By the co-area formula [41, 10.6], we obtain

$$\begin{aligned}&\int _{B_1\cap \{|\varphi _{\varepsilon _i} |\le 1-b\}\cap G_i} |\nu _n||\nabla \varphi _{\varepsilon _i}|\sqrt{2W(\varphi _{\varepsilon _i})}\, dx\nonumber \\&\quad =\int _{-1+b}^{1-b}\, d\tau \int _{ \{\varphi _{\varepsilon _i}=\tau \}\cap B_1\cap G_i} |\nu _n| \sqrt{2W(\tau )}\, d{\mathcal H}^{n-1}. \end{aligned}$$
(7.88)

Then by the area formula [41, 12.4] applied to the map \(P\, :\, \{\varphi _{\varepsilon _i}=\tau \}\rightarrow \{x_n=0\}\), we have

$$\begin{aligned}&\int _{\{\varphi _{\varepsilon _i}=\tau \}\cap B_1\cap G_i} |\nu _n|\, d{\mathcal H}^{n-1} \nonumber \\&\quad =\int _{\{x_n=0\}} {\mathcal H}^0 (\{\varphi _{\varepsilon _i}=\tau \}\cap B_1\cap G_i\cap P^{-1}(x))\, d{\mathcal H}^{n-1}(x). \end{aligned}$$
(7.89)

Now the integrand of the right-hand side of (7.89) is \(\le N-1\) due to (7.82) for \(|x|\le 1\), and 0 otherwise. Combining (7.87)–(7.89), we finally obtain

$$\begin{aligned} \omega _{n-1}\theta&\le 2s+ \liminf _{i\rightarrow \infty } \omega _{n-1}(N-1)\int _{-1+b}^{1-b}\sqrt{2W(\tau )}\, d\tau \nonumber \\&\le 2s+\omega _{n-1}(N-1)\sigma . \end{aligned}$$
(7.90)

Since \(s>0\) is arbitrary, (7.90) shows \(\theta \le (N-1)\sigma \). By (7.76), we have \(\theta = (N-1)\sigma \). \(\square \)

8 Proof of the main theorem

We finally define a family of varifolds which will be a generalized solution of (1.2). To remove the multiple of \(\sigma \), we re-define \(V_t\) as follows.

Definition 8.1

For a.e. \(t\ge 0\) for which \(\mu _t\) is rectifiable and integral modulo division by \(\sigma \), let \(V_t\) be the integral varifold uniquely determined by \(\sigma ^{-1}\mu _t\). For any other \(t>0\), define \(V_t\) by \(V_t(\phi ):=\sigma ^{-1}\int _{U} \phi (x,P_0)\, d\mu _t(x)\) for \(\phi \in C_c(G_{n-1}(U))\), where \(P_0\in \mathbf{G}(n,n-1)\) is an arbitrary fixed element.

With this definition, we have \(\Vert V_t\Vert =\sigma ^{-1}\mu _t\) for all \(t\ge 0\), and \(V_t\in \mathbf{IV}_{n-1}(\Omega )\) for a.e. \(t\ge 0\) by Theorem 7.1. Thus (a) of Definition 2.1 is satisfied. The condition (b) is satisfied due to (4.13). Let us consider (c). The \(L^2\) integrability of u, \(\int _0^T\int _{\Omega } |u|^2\, d\Vert V_t\Vert dt<\infty \), may be proved as in (4.42) and (4.43) once (b) is established. For h, we prove the following.

Proposition 8.1

For a.e. \(t\ge 0\), \(V_t\) has a generalized mean curvature \(h(V_t)\) and we have

$$\begin{aligned} \int _\Omega \phi |h(V_t)|^2 \, d\Vert V_t\Vert \le \sigma ^{-1}\liminf _{i\rightarrow \infty } \int _\Omega \varepsilon _i \phi \left( \Delta \varphi _{\varepsilon _i} -\frac{W'(\varphi _{\varepsilon _i} )}{\varepsilon _i ^2} \right) ^2 \, dx<\infty \qquad \end{aligned}$$
(8.1)

for any \(\phi \in C_c(\Omega \,;\,{\mathbb R}^+)\).

Proof

Just as in the proof of Proposition 6.1, for a.e. \(t\ge 0\), we may assume (6.47) and (6.48) and there exists a subsequence \(\{V_t^{\varepsilon _{i_j}}\}_{j=1}^{\infty }\) converging to \(\sigma V_t\) (note that we re-defined \(V_t\)) with (6.46). By arguing as in the proof of Proposition 6.1, for any \(g\in C_c^1 (\Omega \,;\, {\mathbb R}^n)\), we have

$$\begin{aligned} |\delta V_t(g)|\le \sigma ^{-1}\left( \int _{\Omega } |g|^2\, d\mu _t\right) ^{1/2}\liminf _{j\rightarrow \infty }\left( \int _{\Omega } \varepsilon _{i_j}\left( \Delta \varphi _{\varepsilon _{i_j}}-\frac{W'}{\varepsilon _{i_j}^2}\right) ^2\, dx\right) ^{1/2}.\nonumber \\ \end{aligned}$$
(8.2)

The inequality and (6.46) show that the total variation \(\Vert \delta V_t\Vert \) of \(\delta V_t\) is absolutely continuous with respect to \(\mu _t=\sigma \Vert V_t\Vert \). Thus by the Radon-Nikodym theorem there exists a \(\Vert V_t\Vert \) measurable vector field \(h(V_t)\) (generalized mean curvature vector) such that

$$\begin{aligned} \delta V_t(g)=-\int _{\Omega } g\cdot h(V_t)\, d\Vert V_t\Vert . \end{aligned}$$
(8.3)

Since \(V_t\) is rectifiable, going back to the definition of a countably \((n-1)\)-rectifiable set, one can show that \(C_c^1(\Omega )\) is dense in \(L^2(\Vert V_t\Vert )\). Then a standard approximation argument shows \(h(V_t)\in L^2(\Vert V_t\Vert )\) and (8.1) with \(\phi =1\). Next, given \(\phi \in C_c(\Omega \,;\,{\mathbb R}^+)\), let \(\psi _k\in C_c^1(\Omega \,;\,{\mathbb R}^+)\) be a sequence such that \(\lim _{k\rightarrow \infty }\Vert \phi - \psi _k\Vert _{C^0(\Omega )}=0\). Using \(\psi _k g\) in the proof of Proposition 6.1 and letting \(k\rightarrow \infty \), we obtain

$$\begin{aligned}&\left| \int _{\Omega }\phi g\cdot h(V_t)\, d\mu _t\right| \le \left( \int _{\Omega }\phi |g|^2\, d\mu _t\right) ^{1/2} \liminf _{j\rightarrow \infty }\left( \int _{\Omega } \varepsilon _{i_j}\phi \left( \Delta \varphi _{\varepsilon _{i_j}}-\frac{W'}{\varepsilon _{i_j}^2}\right) ^2\, dx\right) ^{1/2}.\nonumber \\ \end{aligned}$$
(8.4)

By approximation, we obtain (8.1) from (8.4). \(\square \)
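The duality step behind (8.1) with \(\phi =1\) can be sketched as follows. The shorthand \(A_t:=\liminf _{j\rightarrow \infty }\int _{\Omega }\varepsilon _{i_j}\big (\Delta \varphi _{\varepsilon _{i_j}}-W'/\varepsilon _{i_j}^2\big )^2\, dx\) is introduced here only for illustration. Using (8.2), (8.3) and \(\mu _t=\sigma \Vert V_t\Vert \), we have for \(g\in C_c^1(\Omega \,;\,{\mathbb R}^n)\)

$$\begin{aligned} \Big |\int _{\Omega } g\cdot h(V_t)\, d\Vert V_t\Vert \Big |=|\delta V_t(g)|&\le \sigma ^{-1}\Big (\sigma \int _{\Omega }|g|^2\, d\Vert V_t\Vert \Big )^{1/2}A_t^{1/2} \\&=\sigma ^{-1/2}A_t^{1/2}\,\Vert g\Vert _{L^2(\Vert V_t\Vert )}. \end{aligned}$$

By the density of \(C_c^1\) in \(L^2(\Vert V_t\Vert )\), taking the supremum over g with \(\Vert g\Vert _{L^2(\Vert V_t\Vert )}\le 1\) gives \(\int _{\Omega }|h(V_t)|^2\, d\Vert V_t\Vert \le \sigma ^{-1}A_t\), which is the form of (8.1) for \(\phi =1\).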

Now Proposition 8.1 combined with Lemma 4.4 and Fatou’s lemma proves (c). For the proof of (d), one point which we need to be careful about is that we may not have the whole sequence \(\{V_t^{\varepsilon _i}\}_{i=1}^{\infty }\) converging to \(V_t\) as varifold for a.e. \(t\ge 0\) even though \(\{\Vert V_t^{\varepsilon _i}\Vert \}_{i=1}^{\infty }\) converges to \(\sigma \Vert V_t\Vert =\mu _t\) for all \(t\ge 0\).

Proposition 8.2

The family of varifolds \(\{V_t\}_{t\ge 0}\) defined in Definition 8.1 is a generalized solution of (1.2).

Proof

We prove (2.10) for \(\phi \in C_c^2 (\Omega \times [0,\infty )\,;\, {\mathbb R}^+)\). For \(\phi \in C_c^1\), one can approximate \(\phi \) by a sequence of \(C_c^2\) functions and obtain the same result in the limit. First by modifying (5.11) we obtain (with the notation (7.9))

$$\begin{aligned} \mu _{t} ^{\varepsilon _i }(\phi (\cdot ,t))\Big |_{t=t_1}^{t_2}= & {} \int _{t_1} ^{t_2} \int _\Omega -\varepsilon _i \phi h_{\varepsilon _i} ^2 -\varepsilon _i h_{\varepsilon _i}\nabla \phi \cdot \nabla \varphi _{\varepsilon _i} +\varepsilon _i \phi h_{\varepsilon _i} u_{\varepsilon _i}\cdot \nabla \varphi _{\varepsilon _i} \nonumber \\&+\, \varepsilon _i (\nabla \varphi _{\varepsilon _i} \cdot \nabla \phi ) (u_{\varepsilon _i}\cdot \nabla \varphi _{\varepsilon _i} ) \, dxdt+\int _{t_1}^{t_2} \int _{\Omega }\frac{\partial \phi }{\partial t}\, d\mu _t^{\varepsilon _i}dt.\qquad \qquad \end{aligned}$$
(8.5)

Modulo division by \(\sigma \), the left-hand side of (8.5) converges to that of (2.10) due to Proposition . The same is true for the last term of (8.5). Hence we focus on the middle 4 terms. First we approximate \(u_{\varepsilon _i}\) by a fixed smooth \({\tilde{u}}\) as follows. Given \(\epsilon >0\), we choose a large j so that \(t_2<T_j\) and

$$\begin{aligned} \Vert u-u_{\varepsilon _j}\Vert _{L^q([0,T_j];W^{1,p}(\Omega ))}<\epsilon \ \ \text{ and }\ \ \Vert u_{\varepsilon _j}-u_{\varepsilon _i}\Vert _{L^q([0,T_j];W^{1,p}(\Omega ))}<\epsilon \end{aligned}$$
(8.6)

for all \(i\ge j\). This is possible since \(u_{\varepsilon _i}\) converges to u in this norm. Set \({\tilde{u}}:=u_{\varepsilon _j}\). Then we have

$$\begin{aligned}&\left| \int _{t_1}^{t_2}\int _{\Omega } \varepsilon _i\phi h_{\varepsilon _i}(u_{\varepsilon _i}-{\tilde{u}})\cdot \nabla \varphi _{\varepsilon _i} +\varepsilon _i (\nabla \varphi _{\varepsilon _i}\cdot \nabla \phi )((u_{\varepsilon _i}-{\tilde{u}})\cdot \nabla \varphi _{\varepsilon _i})\, \right| \nonumber \\&\quad \le \left( \int _{t_1}^{t_2}\int _{\Omega } 2\varepsilon _i \left( \phi ^2 h_{\varepsilon _i}^2+|\nabla \phi |^2 |\nabla \varphi _{\varepsilon _i}|^2\right) \right) ^{1/2} \left( \int _{t_1}^{t_2}\int _{\Omega } |u_{\varepsilon _i}-{\tilde{u}}|^2\, d\mu _t^{\varepsilon _i}dt\right) ^{1/2}.\nonumber \\ \end{aligned}$$
(8.7)

As in the proof of Lemma 4.4, and by (4.13) and (8.6), we have

$$\begin{aligned} \int _{t_1}^{t_2}dt\int _{\Omega }|u_{\varepsilon _i}-{\tilde{u}}|^2\, d\mu _t^{\varepsilon _i}&\le c(n) D_1 (t_2-t_1)^{1-\frac{2}{q}}\Vert u_{\varepsilon _i}\nonumber \\&\quad -{\tilde{u}}\Vert _{ L^q([t_1,t_2];W^{1,p}(\Omega ))}^2<c\epsilon ^2. \end{aligned}$$
(8.8)

By (8.7) and (8.8), and since the first factor on the right-hand side of (8.7) is bounded uniformly in i, replacing \(u_{\varepsilon _i}\) by \({\tilde{u}}\) in (8.5) produces an error of at most \(c\epsilon \). Similarly we have

$$\begin{aligned} \Big |\int _{t_1}^{t_2}\int _{\Omega }( -h\phi +\nabla \phi )\cdot ( (u-{\tilde{u}})\cdot \nu )\nu \, d\mu _tdt\Big | \le c'\epsilon . \end{aligned}$$
(8.9)

Thus we will finish the proof if we prove

$$\begin{aligned}&\liminf _{i\rightarrow \infty }\int _{t_1}^{t_2}\int _{\Omega } -\varepsilon _i \phi h_{\varepsilon _i}^2-\varepsilon _i h_{\varepsilon _i}\nabla \phi \cdot \nabla \varphi _{\varepsilon _i} +\varepsilon _i\phi h_{\varepsilon _i}{\tilde{u}}\cdot \nabla \varphi _{\varepsilon _i} \nonumber \\&\quad \, +~\varepsilon _i (\nabla \varphi _{\varepsilon _i}\cdot \nabla \phi )({\tilde{u}}\cdot \nabla \varphi _{\varepsilon _i})\, dxdt \le \int _{t_1}^{t_2}{\mathcal B}(\mu _t,{\tilde{u}}(\cdot ,t),\phi (\cdot ,t))\, dt, \quad \end{aligned}$$
(8.10)

where we denote

$$\begin{aligned} {\mathcal B}(\mu _t,{\tilde{u}}(\cdot ,t),\phi (\cdot ,t)):=\int _{\Omega } (\nabla \phi -h\phi )\cdot (h+(\tilde{u}\cdot \nu )\nu )\, d\mu _t. \end{aligned}$$

By the Cauchy-Schwarz inequality, we have

$$\begin{aligned} {\hat{a}}_i(t)&:=\varepsilon _i\int _{\Omega } -\phi h_{\varepsilon _i}^2- h_{\varepsilon _i}\nabla \phi \cdot \nabla \varphi _{\varepsilon _i} +\phi h_{\varepsilon _i}{\tilde{u}}\cdot \nabla \varphi _{\varepsilon _i} + (\nabla \varphi _{\varepsilon _i}\cdot \nabla \phi )({\tilde{u}}\cdot \nabla \varphi _{\varepsilon _i}) \nonumber \\&\le \int _{\Omega }\frac{\varepsilon _i}{2}|\nabla \varphi _{\varepsilon _i}|^2\left( \frac{|\nabla \phi |^2}{\phi }+\phi |{\tilde{u}}|^2 +2 |{\tilde{u}}||\nabla \phi |\right) \nonumber \\&\le \int _{\Omega }\frac{\varepsilon _i}{2}|\nabla \varphi _{\varepsilon _i}|^2( \hat{\phi }+\phi |{\tilde{u}}|^2 +2 |{\tilde{u}}||\nabla \phi |)=: {\hat{b}}_i(t), \end{aligned}$$
(8.11)
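For the reader's convenience, the first inequality in (8.11) can be checked pointwise on \(\{\phi >0\}\) with the elementary bound \(ab\le \frac{1}{2}a^2+\frac{1}{2}b^2\). The following is a sketch of that computation, not a new estimate.

$$\begin{aligned} -h_{\varepsilon _i}\nabla \phi \cdot \nabla \varphi _{\varepsilon _i}&\le \frac{\phi h_{\varepsilon _i}^2}{2}+\frac{|\nabla \phi |^2}{2\phi }|\nabla \varphi _{\varepsilon _i}|^2, \\ \phi h_{\varepsilon _i}{\tilde{u}}\cdot \nabla \varphi _{\varepsilon _i}&\le \frac{\phi h_{\varepsilon _i}^2}{2}+\frac{\phi |{\tilde{u}}|^2}{2}|\nabla \varphi _{\varepsilon _i}|^2, \\ (\nabla \varphi _{\varepsilon _i}\cdot \nabla \phi )({\tilde{u}}\cdot \nabla \varphi _{\varepsilon _i})&\le |{\tilde{u}}||\nabla \phi ||\nabla \varphi _{\varepsilon _i}|^2. \end{aligned}$$

Adding these three bounds to \(-\phi h_{\varepsilon _i}^2\) cancels the terms containing \(h_{\varepsilon _i}^2\); multiplying by \(\varepsilon _i\) and integrating then gives the second line of (8.11). On \(\{\phi =0\}\) all four terms vanish, because \(\nabla \phi =0\) at a minimum of the non-negative function \(\phi \).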

where \(\hat{\phi }\in C_c(\Omega ;{\mathbb R}^+)\) is chosen so that \(\frac{|\nabla \phi |^2}{\phi }\le \hat{\phi }\). This in particular shows \(\hat{b}_i(t)-\hat{a}_i(t)\ge 0\) for \(t_1\le t\le t_2\). Using the general fact that \(\liminf _{i\rightarrow \infty } (a_i+b_i)\le \limsup _{i\rightarrow \infty }a_i+\liminf _{i\rightarrow \infty }b_i\) and Fatou’s lemma, we have

$$\begin{aligned} \liminf _{i\rightarrow \infty }\int _{t_1}^{t_2} \hat{a}_i(t)\, dt\le & {} - \liminf _{i\rightarrow \infty } \int _{t_1}^{t_2} (\hat{b}_i(t)-\hat{a}_i(t))\, dt +\liminf _{i\rightarrow \infty } \int _{t_1}^{t_2} \hat{b}_i(t)\, dt\nonumber \\\le & {} - \int _{t_1}^{t_2} \liminf _{i\rightarrow \infty } (\hat{b}_i(t)-\hat{a}_i(t))\, dt +\liminf _{i\rightarrow \infty } \int _{t_1}^{t_2} \hat{b}_i(t)\, dt.\nonumber \\ \end{aligned}$$
(8.12)

Since \({\hat{b}}_i(t)\) converges to \(\frac{1}{2}\int _{\Omega }( \hat{\phi }+\phi |{\tilde{u}}|^2 +2 |{\tilde{u}}||\nabla \phi |)\, d\mu _t\) for all \(t_1\le t\le t_2\) and bounded uniformly, from (8.12) and the dominated convergence theorem we have

$$\begin{aligned} \liminf _{i\rightarrow \infty }\int _{t_1}^{t_2} \hat{a}_i(t)\, dt\le -\int _{t_1}^{t_2}\liminf _{i\rightarrow \infty }(-\hat{a}_i(t))\, dt. \end{aligned}$$
(8.13)
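The passage from (8.12) to (8.13) can be spelled out as follows, writing \({\hat{b}}(t):=\lim _{i\rightarrow \infty }{\hat{b}}_i(t)\) (a notation used only in this remark). The first line of (8.12) is the general fact applied to \(a_i=-\int _{t_1}^{t_2}({\hat{b}}_i-{\hat{a}}_i)\, dt\) and \(b_i=\int _{t_1}^{t_2}{\hat{b}}_i\, dt\), together with \(\limsup (-x_i)=-\liminf x_i\). Since \({\hat{b}}_i(t)\) converges and is uniformly bounded,

$$\begin{aligned} \liminf _{i\rightarrow \infty }\big ({\hat{b}}_i(t)-{\hat{a}}_i(t)\big )&={\hat{b}}(t)+\liminf _{i\rightarrow \infty }\big (-{\hat{a}}_i(t)\big ), \\ \lim _{i\rightarrow \infty }\int _{t_1}^{t_2}{\hat{b}}_i(t)\, dt&=\int _{t_1}^{t_2}{\hat{b}}(t)\, dt. \end{aligned}$$

Substituting both identities into the last line of (8.12), the two integrals of \({\hat{b}}\) cancel, and what remains is exactly (8.13).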

Thus we may finish the proof of (8.10) via (8.13) if we prove

$$\begin{aligned} -\liminf _{i\rightarrow \infty } (-\hat{a}_i(t))\le {\mathcal B}(\mu _t, {\tilde{u}}(\cdot ,t),\phi (\cdot ,t)) \end{aligned}$$
(8.14)

for a.e. \(t\in [t_1,t_2]\). Fix t such that the claim of Proposition 8.1 holds. Let \(\{\varepsilon _{i_j}\}_{j=1}^{\infty }\) be a subsequence such that

$$\begin{aligned} \liminf _{i\rightarrow \infty }(-\hat{a}_i(t))=\lim _{j\rightarrow \infty } (-\hat{a}_{i_j}(t)). \end{aligned}$$
(8.15)

We may choose a further subsequence (denoted by the same index) such that \(V_t^{\varepsilon _{i_j}}\rightarrow \sigma {\tilde{V}}_t\) as varifold. By the Cauchy-Schwarz inequality,

$$\begin{aligned} -\hat{a}_i(t)\ge \int _{\Omega }\frac{1}{2} \varepsilon _i \phi h_{\varepsilon _i}^2 -\left( \frac{|\nabla \phi |^2}{\phi }+|\tilde{u}|^2+|\tilde{u}||\nabla \phi |\right) \varepsilon _i |\nabla \varphi _{\varepsilon _i}|^2\, dx \end{aligned}$$
(8.16)

where the last negative term is bounded uniformly. If \(\liminf _{j\rightarrow \infty }\int _{\Omega }\varepsilon _{i_j}\phi h_{\varepsilon _{i_j}}^2\, dx\) is infinity, we have (8.14) with the left-hand side \(=-\infty \). Thus we may assume otherwise. At this point, arguing just as in the proof of Proposition 6.1, we may prove that \({\tilde{V}}_t\lfloor _{\{\phi >0\}}\) is rectifiable and \({\tilde{V}}_t\lfloor _{\{\phi >0\}} =V_t\lfloor _{\{\phi >0\}}\). Then the argument in the proof of Proposition 8.1 shows (8.1). For the remaining three terms in \(\hat{a}_{i_j}(t)\), since \(V_t^{\varepsilon _{i_j}}\lfloor _{\{\phi >0\}}\rightarrow \sigma V_t\lfloor _{\{\phi >0\}}\) as varifold and by (6.41), we have for any \(\tilde{\phi }\in C^2_c (\{\phi >0\}\,;\, {\mathbb R}^+)\)

$$\begin{aligned}&\lim _{j\rightarrow \infty } \varepsilon _{i_j}\int _{\Omega } h_{\varepsilon _{i_j}}\nabla \tilde{\phi }\cdot \nabla \varphi _{\varepsilon _{i_j}} - \tilde{\phi }h_{\varepsilon _{i_j}} \tilde{u}\cdot \nabla \varphi _{\varepsilon _{i_j}} - (\nabla \varphi _{\varepsilon _{i_j}}\cdot \nabla \tilde{\phi }) (\tilde{u}\cdot \nabla \varphi _{\varepsilon _{i_j}})\, dx \nonumber \\&\quad =\sigma \delta V_t(\nabla \tilde{\phi }-\tilde{u}\tilde{\phi })-\int _{\Omega }(\nabla \tilde{\phi }\cdot \nu )(\tilde{u}\cdot \nu )\, d\mu _t \nonumber \\&\quad =\int _{\Omega } -h\cdot (\nabla \tilde{\phi }-\tilde{u}\tilde{\phi })-(\nabla \tilde{\phi }\cdot \nu )(\tilde{u}\cdot \nu )\, d\mu _t. \end{aligned}$$
(8.17)

We may construct a sequence of approximation \(\{\tilde{\phi }_{k}\}_{k=1}^{\infty }\) such that \(\lim _{k\rightarrow \infty } \,\Vert \phi -\tilde{\phi }_k\Vert _{C^2}=0\), \(\phi \ge \tilde{\phi }_k\) and \(\mathrm{spt}\,\tilde{\phi }_k\subset \{\phi >0\}\). For such approximating sequence,

$$\begin{aligned}&\left| \int _{\Omega } \varepsilon _{i_j}h_{\varepsilon _{i_j}}\nabla (\phi -\tilde{\phi }_k)\cdot \nabla \varphi _{\varepsilon _{i_j}}\right| \nonumber \\&\le \left( \int _{\Omega }\varepsilon _{i_j} h_{\varepsilon _{i_j}}^2 \phi \right) ^{1/2}\left( \int _{\Omega } \frac{|\nabla (\phi -\tilde{\phi }_k)|^2}{\phi -\tilde{\phi }_k} \varepsilon _{i_j}|\nabla \varphi _{\varepsilon _{i_j}}|^2\right) ^{1/2}\nonumber \\&\le \left( \int _{\Omega }\varepsilon _{i_j} h_{\varepsilon _{i_j}}^2 \phi \right) ^{1/2}\left( 2 \Vert \phi -\tilde{\phi }_k\Vert _{C^2}\right) ^{1/2} (2\mu _{t}^{\varepsilon _{i_j}}(\Omega ))^{1/2}\rightarrow 0 \end{aligned}$$
(8.18)

as \(k\rightarrow \infty \) uniformly in j. The error of replacing \(\tilde{\phi }=\tilde{\phi }_k\) in (8.17) by \(\phi \) can be estimated similarly. Thus (8.17) holds also for \(\phi \) instead of \(\tilde{\phi }\). Recall that we have taken a subsequence so that (8.15) holds. Combining this with (8.1) and (8.17) with \(\tilde{\phi }=\phi \), and recalling that \(h\cdot \tilde{u}=h\cdot (\tilde{u}\cdot \nu )\nu \) for \(\mu _t\) a.e. by Brakke’s perpendicularity theorem [6, Ch.5], we obtain (8.14). This concludes the proof. \(\square \)

We next discuss the proof of Theorem 2.2 (2).

Proposition 8.3

There exists a further subsequence (denoted by the same index) \(\{ \varphi _{\varepsilon _i} \}_{i=1}^{\infty }\) and a function \(\varphi \in BV_{loc}(\Omega \times [0,\infty ))\cap C^{\frac{1}{2}}_{loc}([0,\infty );L^1(\Omega ))\) such that for all \(t\ge 0\),

$$\begin{aligned} w_{\varepsilon _i}(\cdot ,t)\rightarrow \varphi (\cdot ,t) \end{aligned}$$
(8.19)

strongly in \(L^1_{loc}(\Omega )\) and \(\varphi \) satisfies the properties of Theorem 2.2 (2). Here \(w_{\varepsilon _i}\) is defined by

$$\begin{aligned} w_{\varepsilon _i} := \Phi \circ \varphi _{\varepsilon _i} \text{ with } \Phi (s): = \sigma ^{-1}\int _{-1} ^s \sqrt{2W(y)} \, dy. \end{aligned}$$

Proof

Note that \(\Phi (1)=1\) and \(\Phi (-1)=0\). We compute

$$\begin{aligned} |\nabla w_{\varepsilon _i}|=\sigma ^{-1} |\nabla \varphi _{\varepsilon _i} | \sqrt{2W(\varphi _{\varepsilon _i})} \le \sigma ^{-1} \left( \frac{\varepsilon _i |\nabla \varphi _{\varepsilon _i} |^2}{2} +\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i} \right) . \end{aligned}$$
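The computation above combines the chain rule with Young's inequality. As a brief sketch, since \(\Phi '(s)=\sigma ^{-1}\sqrt{2W(s)}\), we have \(\nabla w_{\varepsilon _i}=\sigma ^{-1}\sqrt{2W(\varphi _{\varepsilon _i})}\,\nabla \varphi _{\varepsilon _i}\), and with \(a=\sqrt{\varepsilon _i}\,|\nabla \varphi _{\varepsilon _i}|\) and \(b=\sqrt{2W(\varphi _{\varepsilon _i})/\varepsilon _i}\),

$$\begin{aligned} |\nabla \varphi _{\varepsilon _i}|\sqrt{2W(\varphi _{\varepsilon _i})}=ab\le \frac{a^2}{2}+\frac{b^2}{2}=\frac{\varepsilon _i|\nabla \varphi _{\varepsilon _i}|^2}{2}+\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i}. \end{aligned}$$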

Fix \(T>0\). For all sufficiently large i, by (4.13) we have

$$\begin{aligned} \int _\Omega |\nabla w_{\varepsilon _i}(\cdot ,t )| \, dx \le \int _\Omega \sigma ^{-1} \left( \frac{\varepsilon _i |\nabla \varphi _{\varepsilon _i} |^2}{2} +\frac{W(\varphi _{\varepsilon _i} )}{\varepsilon _i} \right) \, dx \le \sigma ^{-1}D_1 \end{aligned}$$
(8.20)

for any \(t\in [0,T]\). By a similar argument we have

$$\begin{aligned}&\int _0 ^T \int _\Omega |\partial _t w_{\varepsilon _i} | \, dxdt \le \sigma ^{-1} \int _0 ^T \int _\Omega \left( \frac{\varepsilon _i |\partial _t \varphi _{\varepsilon _i} |^2}{2} +\frac{W(\varphi _{\varepsilon _i} )}{\varepsilon _i} \right) \, dxdt \nonumber \\&\quad \le \sigma ^{-1} \int _0 ^T \int _\Omega \varepsilon _i \left\{ (u_{\varepsilon _i}\cdot \nabla \varphi _{\varepsilon _i} )^2 +\left( \Delta \varphi _{\varepsilon _i} -\frac{W^{\prime }(\varphi _{\varepsilon _i})}{\varepsilon _i^2} \right) ^2 \right\} \, dxdt \nonumber \\&\quad +\,\, \sigma ^{-1} \int _0 ^T \int _\Omega \frac{W(\varphi _{\varepsilon _i} )}{\varepsilon _i} \, dxdt, \end{aligned}$$
(8.21)

and the last quantity is uniformly bounded due to Lemma 4.4. By (8.20) and (8.21) \(\{w_{\varepsilon _i}\} _{i= 1}^{\infty }\) is bounded in \(BV_{loc}(\Omega \times [0,T])\). By the standard compactness theorem and a diagonal argument, there exists a subsequence (denoted by the same index) \(\{w_{\varepsilon _i}\} _{i=1}^{\infty }\) and \(w \in BV_{loc}(\Omega \times [0,\infty ))\) such that

$$\begin{aligned} w_{\varepsilon _i} \rightarrow w \quad \text {strongly in }L^1_{loc}(\Omega \times [0,\infty )) \end{aligned}$$
(8.22)

and a.e. pointwise. We set \(\varphi :=(1+\Phi ^{-1} \circ w)/2\). We have

$$\begin{aligned} \varphi _{\varepsilon _i} \rightarrow 2\varphi -1 \quad \text {a.e. in }\Omega \times [0,\infty )\end{aligned}$$

and by this with \(|\varphi _{\varepsilon _i}|\le 1\) we obtain

$$\begin{aligned} \varphi _{\varepsilon _i} \rightarrow 2\varphi -1 \quad \text {in }L^1_{loc}( \Omega \times [0,\infty )). \end{aligned}$$

Due to the uniform bound on \(\int _{\Omega }\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i}\, dx\), one can prove by Fatou’s lemma that \(\varphi _{\varepsilon _i}\rightarrow \pm 1\) for a.e. \((x,t)\) and hence \(\varphi =1\) or \(=0\) a.e. on \(\Omega \times [0,\infty )\). In particular, since \(\varphi =1\iff w=1\) and \(\varphi =0\iff w=0\), we have \(w=\varphi \) on \(\Omega \times [0,\infty )\). This in particular proves the \(BV_{loc}(\Omega \times [0,\infty ))\) property of \(\varphi \). For a.e. \(0\le t_1 < t_2 \le T\) and any open set \(U\subset \subset \Omega \), we have

$$\begin{aligned}&\int _{U} |\varphi (\cdot ,t_1 )-\varphi (\cdot , t_2 )| \, dx = \lim _{i\rightarrow \infty } \int _{U} |w_{\varepsilon _i}(\cdot ,t_1 )- w_{\varepsilon _i}(\cdot ,t_2)| \, dx \\&\quad \le \liminf _{i\rightarrow \infty } \int _{U} \int _{t_1} ^{t_2} |\partial _t w_{\varepsilon _i}| \, dtdx \\&\quad \le \liminf _{i\rightarrow \infty } \sigma ^{-1} \int _\Omega \int _{t_1} ^{t_2} \left( \frac{\varepsilon _i |\partial _t \varphi _{\varepsilon _i} |^2}{2} \sqrt{t_2-t} +\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i \sqrt{t_2-t}} \right) \, dtdx. \end{aligned}$$

Note that the right-hand side does not depend on U. Thus, by an argument similar to (8.21), we have

$$\begin{aligned} \int _\Omega |\varphi (\cdot , t_1)-\varphi (\cdot ,t_2)| \, dx \le c \sqrt{t_2-t_1} . \end{aligned}$$
(8.23)
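The role of the weight \(\sqrt{t_2-t}\) in the estimate leading to (8.23) can be sketched as follows. Here \(E_T\) denotes a hypothetical uniform bound for \(\int _0^T\int _{\Omega }\varepsilon _i|\partial _t\varphi _{\varepsilon _i}|^2\, dxdt\) of the kind provided by (8.21) and Lemma 4.4, and we assume \(\sup _{t\in [0,T]}\int _{\Omega }W(\varphi _{\varepsilon _i})/\varepsilon _i\, dx\le D_1\) as in (8.20). Young's inequality with the weight gives

$$\begin{aligned} |\partial _t w_{\varepsilon _i}|=\sigma ^{-1}|\partial _t\varphi _{\varepsilon _i}|\sqrt{2W(\varphi _{\varepsilon _i})}&\le \sigma ^{-1}\left( \frac{\varepsilon _i|\partial _t\varphi _{\varepsilon _i}|^2}{2}\sqrt{t_2-t}+\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i\sqrt{t_2-t}}\right) , \\ \int _{t_1}^{t_2}\frac{dt}{\sqrt{t_2-t}}&=2\sqrt{t_2-t_1},\qquad \sqrt{t_2-t}\le \sqrt{t_2-t_1}\ \ \text{ for } t_1\le t\le t_2. \end{aligned}$$

Under these assumptions, integrating over \(\Omega \times [t_1,t_2]\) yields (8.23) with, for instance, \(c=\sigma ^{-1}(E_T/2+2D_1)\).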

Since \((1+\varphi _{\varepsilon _i}(\cdot ,0))/2\rightarrow \chi _{\Omega _0}\) by (5.6), we have (2c). We assumed that \(\Omega _0\) is a bounded domain, hence, (8.23) shows that \(\varphi (\cdot ,t)\in L^1(\Omega )\) for a.e. \(t\ge 0\). Moreover, we may define \(\varphi (\cdot ,t)\) as a characteristic function for all \(t\ge 0\) so that \(\varphi \in C^{\frac{1}{2}}_{loc} ([0,\infty ) ;L^1 (\Omega ))\) due to (8.23). This proves (2a) and \(C^{\frac{1}{2}}_{loc}\) property for \(\varphi \). From (8.22), for a.e. \(t\ge 0\), \(w_{\varepsilon _i}(\cdot ,t)\rightarrow \varphi (\cdot ,t)\) in \(L^1_{loc}(\Omega )\) strongly. Using (8.23), one can show by a simple telescopic argument that the convergence is true for all \(t\ge 0\) instead of a.e. t, which proves (8.19). By the standard lower semicontinuity property of BV norm, for any \(\phi \in C_c(\Omega ;{\mathbb R}^+)\) and \(0\le t<\infty \), we have

$$\begin{aligned}&\int _\Omega \phi \, d\Vert \nabla \varphi (\cdot ,t)\Vert \le \liminf _{i\rightarrow \infty } \int _\Omega \phi |\nabla w_{\varepsilon _i} | \, dx \\&\quad \le \lim _{i \rightarrow \infty } \sigma ^{-1} \int _\Omega \left( \frac{\varepsilon _i |\nabla \varphi _{\varepsilon _i} |^2}{2} +\frac{W(\varphi _{\varepsilon _i})}{\varepsilon _i} \right) \phi \, dx = \int _\Omega \phi \, d\Vert V_t\Vert . \end{aligned}$$

This proves (2b).

To prove (2d), we consider the a.e. \(t\ge 0\) for which we have proved the integrality of \(V_t\). Writing \(\Vert V_t\Vert =\theta {\mathcal H}^{n-1}\lfloor _{M_t}\), we already know that \(\theta \) is integer-valued \(\Vert V_t\Vert \) a.e. and that \(M_t\) is countably \((n-1)\)-rectifiable. In addition, by (2.8), we have \(1\le \theta \le N(t)\), \({\mathcal H}^{n-1}\) a.e. on \(M_t\) for some integer N(t). The latter shows in particular that

$$\begin{aligned} {\mathcal H}^{n-1}\lfloor _{M_t}\le \Vert V_t\Vert \le N(t){\mathcal H}^{n-1}\lfloor _{M_t}. \end{aligned}$$
(8.24)

By (2a) and (2b), we know that \(\Vert \nabla \varphi (\cdot ,t)\Vert ={\mathcal H}^{n-1}\lfloor _{\tilde{M}_t}\) for some countably \((n-1)\)-rectifiable set \(\tilde{M}_t\) by De Giorgi’s theorem (see [24, 4.4]). To prove (2.16), assume the contrary. Then by the standard argument (see [41, 3.5]), there would be a point \(x\in \tilde{M}_t{\setminus } M_t\) with \(\lim _{r\downarrow 0}{\mathcal H}^{n-1}(B_r(x)\cap \tilde{M}_t)/(\omega _{n-1}r^{n-1})=1\) while \(\lim _{r\downarrow 0}{\mathcal H}^{n-1}(B_r(x)\cap M_t)/(\omega _{n-1}r^{n-1})=0\). Using also (8.24), one would then have a contradiction to Theorem 2.2 (2b). Thus we have (2.16).

To prove (2.17), we closely follow the proof of integrality again. We already know that for \(\Vert V_t\Vert \) a.e. x, we have the properties described in the proof of Theorem 7.1. By the well-known property of sets of finite perimeter ([24, 3.8]), for \({\mathcal H}^{n-1}\) a.e. \(x\in \tilde{M}_t\), the blow-up limit of \(\varphi \) centered at x is supported by a half-space. For \({\mathcal H}^{n-1}\) a.e. \(x\in \Omega {\setminus } \tilde{M}_t\) (in particular on \(M_t{\setminus } \tilde{M}_t\)), the blow-up limit centered at x is a constant function with value either 0 or 1. By (8.19), up to an \({\mathcal H}^{n-1}\) null set, we may assume in addition to the properties of \(\{V_t^{\varepsilon _i}\}_{i=1}^{\infty }\) in the proof of Theorem 7.1 that \(\tilde{w}_{\varepsilon _i}(\tilde{x}):=w_{\varepsilon _i}(r_i \tilde{x})\) converges strongly in \(L^1_{loc}({\mathbb R}^n)\) and pointwise \({\mathcal L}^{n}\) a.e. to \(\chi _{\{x_n\ge 0\}}\) (or \(\chi _{\{x_n\le 0\}}\)) if \(x=0\) is in \(\tilde{M}_t\), or to 1 (or 0) if \(x=0\) is in \(M_t{\setminus } \tilde{M}_t\). Since the proof for the other cases is similar, we only discuss the case of \(\tilde{M}_t\) and \(\lim _{i\rightarrow \infty } \tilde{w}_{\varepsilon _i}=\chi _{\{x_n\ge 0\}}\) in the following. In terms of \(\varphi _{\varepsilon _i}\) (which is the relabeling of \(\tilde{\varphi }_{\varepsilon _i}\)), note that this means that \(\varphi _{\varepsilon _i}\) converges a.e. to \(\chi _{\{x_n\ge 0\}}-\chi _{\{x_n<0\}}\).

As one follows the proof of Theorem 7.1, the difference occurs at (7.76), where we already know that \(\theta \) is an integer multiple of \(\sigma \). So let \(N-1:=\sigma ^{-1} \theta (\ge 1)\). We want to conclude that N is an even integer. We follow the proof until (7.89), and at this point, define for \(i\in {\mathbb N}\) (and writing \(Y(\tau ,x):=\{\varphi _{\varepsilon _i}=\tau \}\cap B_1\cap G_i\cap P^{-1}(x)\))

$$\begin{aligned}&\tilde{A}_i := \{x\in B_1^{n-1}\,:\, \forall \tau \in (-1+b,1-b)\Rightarrow {\mathcal H}^0 (Y(\tau ,x)) \le N-2\},\nonumber \\&A_i:=\{x\in B_1^{n-1}\,:\, \exists \tau \in (-1+b,1-b)\Rightarrow {\mathcal H}^0 (Y(\tau ,x)) = N-1\}.\qquad \end{aligned}$$
(8.25)

We know from (7.82) that \({\mathcal H}^0(Y(\tau ,x))\) has to be \(\le N-1\), thus, \(B_1^{n-1} =\tilde{A}_i\cup A_i\) and

$$\begin{aligned} {\mathcal H}^{n-1}(\tilde{A}_i)=\omega _{n-1}-{\mathcal H}^{n-1}(A_i) \end{aligned}$$
(8.26)

for all sufficiently large i. In (7.90), we have

$$\begin{aligned}&\omega _{n-1}\sigma (N-1) \le 2s+\liminf _{i\rightarrow \infty } \int _{-1+b}^{1-b} \sqrt{2W(\tau )} \{(N-2) {\mathcal H}^{n-1} (\tilde{A}_i) \nonumber \\&\quad +(N-1){\mathcal H}^{n-1}(A_i)\}\, d\tau \le 2s+(N-2)\sigma \omega _{n-1}+\sigma \liminf _{i\rightarrow \infty } {\mathcal H}^{n-1}(A_i)\qquad \qquad \end{aligned}$$
(8.27)

where we used (8.26). Thus we have from (8.27)

$$\begin{aligned} \omega _{n-1}-2\sigma ^{-1} s\le \liminf _{i\rightarrow \infty } {\mathcal H}^{n-1}(A_i). \end{aligned}$$
(8.28)
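The step from (8.27) to (8.28) is elementary; as a sketch, using (8.26) and \(\int _{-1+b}^{1-b}\sqrt{2W(\tau )}\, d\tau \le \sigma \), one has

$$\begin{aligned} (N-2){\mathcal H}^{n-1}(\tilde{A}_i)+(N-1){\mathcal H}^{n-1}(A_i)&=(N-2)\omega _{n-1}+{\mathcal H}^{n-1}(A_i), \\ \omega _{n-1}\sigma (N-1)-(N-2)\sigma \omega _{n-1}&=\omega _{n-1}\sigma \le 2s+\sigma \liminf _{i\rightarrow \infty }{\mathcal H}^{n-1}(A_i), \end{aligned}$$

and dividing the last inequality by \(\sigma \) gives (8.28).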

By (7.20), for all sufficiently large i and any point \(x\in A_i\), the image \(\varphi _{\varepsilon _i}(B_1\cap P^{-1}(x))\) covers \([-1+b,1-b]\) at least \(N-1\) times. Each covering is monotone; thus \(\varphi _{\varepsilon _i}(y)\), as y moves from \(P^{-1}(x)\cap \{x_n=-s\}\) to \(P^{-1}(x)\cap \{x_n=s\}\) along \(P^{-1}(x)\), has to go up and down between \(-1+b\) and \(1-b\) at least \(N-1\) times. Next, since \(\varphi _{\varepsilon _i}\) converges a.e. pointwise to \(\chi _{\{x_n\ge 0\}}- \chi _{\{x_n<0\}}\), by Egoroff’s Theorem and then Fubini’s Theorem, there exist \(s_1\in [s,2s]\), \(s_2\in [-2s,-s]\), \(C_1\subset B_1^{n-1}\) and \(C_2\subset B_1^{n-1}\) such that \(\varphi _{\varepsilon _i}\) converges uniformly to 1 on \(C_1\times \{s_1\}\) and to \(-1\) on \(C_2\times \{s_2\}\) while

$$\begin{aligned} {\mathcal H}^{n-1}(C_i)\ge \omega _{n-1} -s\ \ \text{ for } i=1,2. \end{aligned}$$
(8.29)

Set \(C_3=C_1\cap C_2\) so that, by (8.29),

$$\begin{aligned} {\mathcal H}^{n-1}(C_3)\ge \omega _{n-1}-2s. \end{aligned}$$
(8.30)

Now, for a contradiction, assume that N is odd. For \(x\in A_i\cap C_3\), consider the image of \(\varphi _{\varepsilon _i}\) on \(\{(x,x_n)\,:\, x_n\in [s_2,s_1]\}\). By the uniform convergence and \(x\in C_3\), for sufficiently large i, \(\varphi _{\varepsilon _i}(x,s_2)<-1+b\) and \(\varphi _{\varepsilon _i}(x,s_1)>1-b\). Since \(\varphi _{\varepsilon _i}\) is continuous and passes from below \(-1+b\) to above \(1-b\) along this segment, it covers \([-1+b,1-b]\) an odd number of times. As \(N-1\) is even and there are at least \(N-1\) coverings, there has to be at least one more covering of \([-1+b,1-b]\). Thus, for each \(\tau \in [-1+b,1-b]\) and \(x\in A_i\cap C_3\), we have

$$\begin{aligned} {\mathcal H}^{0}(\{x_n\in [s_2,s_1]\,:\, \varphi _{\varepsilon _i}(x,x_n)=\tau \})\ge N. \end{aligned}$$
(8.31)

Then by the coarea formula and (8.31), we have

$$\begin{aligned}&\int _{s_2}^{s_1} \sqrt{2W(\varphi _{\varepsilon _i}(x,x_n))}|\partial _{x_n}\varphi _{\varepsilon _i}(x,x_n)|\, dx_n\nonumber \\&\quad =\int _{-1}^1 \sqrt{2W(\tau )} {\mathcal H}^0 (\{x_n\in [s_2,s_1]\,:\,\varphi _{\varepsilon _i}(x,x_n)=\tau \})\, d\tau \nonumber \\&\quad \ge N \int _{-1+b}^{1-b} \sqrt{2W(\tau )}\, d\tau . \end{aligned}$$
(8.32)

Note that by (8.28) and (8.30), we have for sufficiently large i

$$\begin{aligned} {\mathcal H}^{n-1}(A_i\cap C_3)\ge \omega _{n-1}-(3+2\sigma ^{-1})s. \end{aligned}$$
(8.33)
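The bound (8.33) follows from inclusion-exclusion inside \(B_1^{n-1}\). As a sketch, for all i large enough that \({\mathcal H}^{n-1}(A_i)\ge \omega _{n-1}-2\sigma ^{-1}s-s\) (which (8.28) permits),

$$\begin{aligned} {\mathcal H}^{n-1}(A_i\cap C_3)&\ge {\mathcal H}^{n-1}(A_i)+{\mathcal H}^{n-1}(C_3)-\omega _{n-1} \\&\ge (\omega _{n-1}-2\sigma ^{-1}s-s)+(\omega _{n-1}-2s)-\omega _{n-1}=\omega _{n-1}-(3+2\sigma ^{-1})s. \end{aligned}$$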

Integrating (8.32) over \(A_i\cap C_3\) and (8.33) give

$$\begin{aligned} \int _{B_1}\sqrt{2W(\varphi _{\varepsilon _i})}|\nabla \varphi _{\varepsilon _i}|&\ge \int _{(A_i\cap C_3)\times [s_2,s_1]}\sqrt{2W(\varphi _{\varepsilon _i})}|\partial _{x_n}\varphi _{\varepsilon _i}| \nonumber \\&\ge (\omega _{n-1}-(3+2\sigma ^{-1})s) N \int _{-1+b}^{1-b}\sqrt{2W(\tau )}\,d\tau .\qquad \end{aligned}$$
(8.34)

We may choose b so that \(\int _{-1+b}^{1-b}\sqrt{2W(\tau )}\, d\tau \ge \sigma -s\). On the other hand, by (7.67), we have

$$\begin{aligned} \int _{B_1}\sqrt{2W(\varphi _{\varepsilon _i})}|\nabla \varphi _{\varepsilon _i}|\, dx\le \int _{B_1} \frac{\varepsilon _i |\nabla \varphi _{\varepsilon _i}|^2}{2}+\frac{W}{\varepsilon _i}\, dx\rightarrow \omega _{n-1}(N-1)\sigma .\quad \end{aligned}$$
(8.35)

For sufficiently small s depending only on n, N and \(\sigma \), (8.34) and (8.35) lead to a contradiction. This proves N has to be even. As we mentioned, other cases of \(\varphi \) being constant (either 0 or 1) can be similarly proved. This concludes the proof of (2.17) and (2d). \(\square \)
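The contradiction between (8.34) and (8.35) in the proof above can be made quantitative. The following sketch combines them with the choice of b satisfying \(\int _{-1+b}^{1-b}\sqrt{2W(\tau )}\, d\tau \ge \sigma -s\), and drops a non-negative term of order \(s^2\):

$$\begin{aligned} \omega _{n-1}(N-1)\sigma&\ge \big (\omega _{n-1}-(3+2\sigma ^{-1})s\big )N(\sigma -s) \\&\ge \omega _{n-1}N\sigma -sN(\omega _{n-1}+3\sigma +2). \end{aligned}$$

This would force \(\omega _{n-1}\sigma \le sN(\omega _{n-1}+3\sigma +2)\), which is false once \(s<\omega _{n-1}\sigma /\big (N(\omega _{n-1}+3\sigma +2)\big )\), a threshold depending only on n, N and \(\sigma \).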

We next verify

Proposition 8.4

The function u satisfies the property of Theorem 2.2 (3).

Proof

Consider the case \(p<n\) and fix \(T>0\). Since \(\lim _{i\rightarrow \infty }\Vert u_{\varepsilon _i}-u\Vert _{L^q ([0,T];(W^{1,p})^n )}=0\), \(\{u_{\varepsilon _i}\}\) is a Cauchy sequence in this norm. By (2.11) with \(s=\frac{p(n-1)}{n-p}\), we have

$$\begin{aligned} \int _0^T\, dt\left( \int _{\Omega } |u_{\varepsilon _i}-u_{\varepsilon _j}|^{s}\, d\Vert V_t\Vert \right) ^{\frac{q}{s}} \le c(n,p,q,D_1) \Vert u_{\varepsilon _i}-u_{\varepsilon _j}\Vert _{L^q([0,T];(W^{1,p}(\Omega ))^n)}^q.\nonumber \\ \end{aligned}$$
(8.36)

By a standard argument, we may extract a subsequence \(\{u_{\varepsilon _{i_j}}\}_{j=1}^{\infty }\) which converges pointwise \(\Vert V_t\Vert \times dt\) a.e. on \(\Omega \times [0,T]\) to an element of \(L^q([0,T];(L^s (\Vert V_t\Vert ))^n)\). This limit function is uniquely determined by u, independently of the approximating sequence, and (2.18) holds. For \(p=n\), we apply the same argument locally for \(p^{\prime }<n\), which gives (2.18) with any \(2\le s<\infty \). For \(p>n\), the standard Sobolev inequality and the Hölder inequality prove the claim immediately. \(\square \)

To conclude the proof of Theorem 2.2 we prove

Proposition 8.5

We have \(T_1>0\) with the property described in Theorem 2.2 (4).

Proof

By integrality, we already know that \(\Vert V_t\Vert =\theta {\mathcal H}^{n-1}\lfloor _{M_t}\) for a.e. \(t\ge 0\), where \(\theta \) is integer-valued \({\mathcal H}^{n-1}\) a.e. on \(M_t\). Thus we should prove that \(\mathcal {H}^{n-1} (\{\theta (\cdot ,t) \ge 2 \})=0\) for a.e. \(0< t< T_1\) for some \(T_1>0\). We will determine the lower bound of \(T_1\) in the following. Assume there exist \(0<\hat{t}<T_1\) and \(\hat{x} \in M_{\hat{t}}\) such that \(M_{\hat{t}}\) has the approximate tangent space at \(\hat{x}\) and the density \(\theta (\hat{x},\hat{t})\ge 2\). Then it is not difficult to check that

$$\begin{aligned} \lim _{r\rightarrow 0} \int _\Omega \tilde{\rho }_{(\hat{x},\hat{t}+r^2)} \, d\Vert V_{\hat{t}}\Vert =\theta (\hat{x},\hat{t})\ge 2. \end{aligned}$$
(8.37)

Since \(\Vert V_0\Vert ={\mathcal H}^{n-1}\lfloor _{M_0}\) and \(M_0\) is \(C^1\), we have

$$\begin{aligned} \int _\Omega \tilde{\rho }_{(x,t)} \,d\Vert V_0\Vert \le 3/2 \end{aligned}$$
(8.38)

for any \((x,t)\in \Omega \times (0,T_1]\), where \(T_1\) depends only on \(M_0\). We then use (4.90) with \(\varepsilon \rightarrow 0\). We then have

(8.39)

and the right-hand side of (8.39) may be made smaller than 1/2 by restricting \(T_1\). We would then have a contradiction, since the left-hand side is \(\ge 1/2\) due to (8.37) and (8.38). This proves the first part of (4). We next prove \(\Vert \nabla \varphi (\cdot ,t)\Vert =\Vert V_t\Vert \) for a.e. \(t\in [0,T_1]\). With the notation of (2d), for a.e. \(t\in [0,T_1]\), we have \(\Vert V_t\Vert ={\mathcal H}^{n-1}\lfloor _{M_t}\) since \(\theta =1\) a.e. from the first part. But then, by (2.17), \({\mathcal H}^{n-1}(M_t{\setminus } \tilde{M}_t)=0\) since \(\theta =1\) is odd. Thus, combined with (2.16), \(\tilde{M}_t=M_t\) modulo a null set, and this shows the claim. We may take \(T_1\) to be \(\sup \{t>0\,:\, V_{\tau } \text{ is unit density for a.e. } \tau \in [0,t]\}\). \(\square \)

As for the proof of Theorem 2.3, (1) and (3) follow from [30] and [46], respectively, which give criteria for partial \(C^{1,\zeta }\) and \(C^{2,\alpha }\) regularity. For (1), we check that [30, Sect. 3.1 (A1)–(A4)] are all satisfied. Namely, (A1) asks \(V_t\) to be unit density for a.e. t, (A2) is on the uniform density ratio upper bound which follows from (2.8), (A3) is on the integrability of u which is given by (2.18) and (A4) is the flow equation which is (2.10). If \(p<n\), the exponent of integrability of u in (2.18) has to satisfy \(\zeta :=1-(n-1)/s-2/q=2-n/p-2/q>0\), and this follows from (2.14). If \(p\ge n\), we may choose any \(s>(n-1)q/(q-2)\) in (2.18) so that we have \(0<\zeta \), and we may take sufficiently large s so that \(0<\zeta <1-2/q\) can be arbitrarily close to \(1-2/q\). This proves (1). The conclusion for \(C^{2,\alpha }\) is precisely the claim of [46]. Thus we only need to prove (2) and (4).
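As a check of the exponent identity used above for \(p<n\), one may substitute \(s=\frac{p(n-1)}{n-p}\) directly. The computation below only restates the definitions in the text and assumes \(q>2\), as the expressions require:

$$\begin{aligned} \frac{n-1}{s}=\frac{n-p}{p}=\frac{n}{p}-1, \qquad \text{hence}\qquad \zeta =1-\frac{n-1}{s}-\frac{2}{q}=2-\frac{n}{p}-\frac{2}{q}. \end{aligned}$$

Similarly, for \(p\ge n\), the condition \(s>\frac{(n-1)q}{q-2}\) is equivalent to \(\frac{n-1}{s}<1-\frac{2}{q}\), that is, to \(\zeta >0\), and \(\zeta \rightarrow 1-\frac{2}{q}\) as \(s\rightarrow \infty \).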

Proposition 8.6

The family of varifolds \(\{V_t\}_{t\ge 0}\) satisfies the properties of Theorem 2.3 (2) and (4).

Proof

For a.e. \(0\le t<T_1\), we have proved that \(V_t\) has the unit density property, thus we may use the results in [30] for \(\{V_t\}_{0\le t<T_1}\). We first claim that there exists \(0<T_3\le T_1\) depending only on \(D_1,n,p,q,\Vert u\Vert _{L^q([0,T_1]; (W^{1,p}(\Omega ))^n)}\) (\(D_1\) corresponding to \(T_1\)) and such that

(8.40)

for a.e. \(0\le t\le T_3\). For the proof, we use [30, Proposition 6.2]. Citing the result for the convenience of the reader, we have for \(x\in \Omega \) and \(0<r<1\)

$$\begin{aligned} \int _{B_r(x)} \hat{\rho }_{(x,t+\epsilon )}(\cdot ,t)\, d\Vert V_t\Vert&-\int _{B_r(x)}\hat{\rho }_{(x,t+\epsilon )}(\cdot ,0)\, d\Vert V_0\Vert \nonumber \\&\le c(n,s,q) \Vert u\Vert ^{2}_{L^{s,q}} D_1^{1-\frac{2}{s}}t^{\zeta }+c(n)D_1 r^{-2} t,\qquad \end{aligned}$$
(8.41)

where \(s:=\frac{p(n-1)}{n-p}\) if \(p<n\) and any \(\frac{(n-1)q}{q-2}< s<\infty \) if \(p\ge n\), \(\zeta =1-(n-1)/s-2/q\) and \(\Vert u\Vert _{L^{s,q}}:=(\int _0^t (\int _{B_r(x)} |u|^{s}\, d\Vert V_{\lambda }\Vert )^{q/s}\,d\lambda )^{1/q}\). \(\hat{\rho }_{(x,t+\epsilon )}\) is \(\rho _{(x,t+\epsilon )}\) times a radially symmetric cut-off function with support in \(B_{14r/15}(x)\) and \(=1\) near x. Note that \(\Vert u\Vert _{L^{s,q}}\) may be bounded in terms of \(D_1\) and \(\Vert u\Vert _{L^q([0,T_1];(W^{1,p}(\Omega ))^n)}\) as was done for the proof of (2.18). By restricting \(T_3\) small, we may conclude from (8.41) that

$$\begin{aligned} \int _{B_r(x)}\hat{\rho }_{(x,t+\epsilon )}(\cdot ,t)\, d\Vert V_{t}\Vert -\int _{B_r(x)} \hat{\rho }_{(x,t+\epsilon )}(\cdot , 0)\, d\Vert V_0\Vert \le \frac{1}{2} +c(n)D_1 r^{-2}t.\qquad \quad \end{aligned}$$
(8.42)
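One admissible way to quantify the restriction on \(T_3\) is sketched here. The symbol K is a hypothetical notation, not used in the original, for a positive upper bound of \(\Vert u\Vert _{L^{s,q}}\) on \([0,T_1]\) obtained as in the proof of (2.18). Since \(\zeta >0\), the first term on the right-hand side of (8.41) is nondecreasing in t, so that

$$\begin{aligned} c(n,s,q)K^{2}D_1^{1-\frac{2}{s}}t^{\zeta }\le \frac{1}{2}\quad \text{ for } \text{ all } 0\le t\le T_3 \quad \text{ whenever } \quad T_3\le \min \Big \{T_1,\ \big (2c(n,s,q)K^{2}D_1^{1-\frac{2}{s}}\big )^{-1/\zeta }\Big \}. \end{aligned}$$

With this choice, \(T_3\) depends only on the quantities listed before (8.40), through K, \(D_1\) and the constant in (8.41).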

Let be a constant to be fixed shortly and assume that there exists \(x\in \mathrm{spt}\, \Vert V_t\Vert \) such that and \(0<t\le T_3\). We may assume that \(V_t\) is unit density and has approximate tangent space with multiplicity 1 at x, since such time and point are generic. In particular, one can check that \( \lim _{\epsilon \rightarrow 0+}\int _{B_r(x)}\hat{\rho }_{(x,t+\epsilon )}(\cdot ,t)\, d\Vert V_t\Vert =1\) and (8.42) thus shows

$$\begin{aligned} \frac{1}{2}-\int _{B_r(x)} \hat{\rho }_{(x,t)}(\cdot , 0)\, d\Vert V_0\Vert \le c(n)D_1 r^{-2}t . \end{aligned}$$
(8.43)

We now choose . Since \(B_r(x)\cap M_0=\emptyset \), the integral in (8.43) is 0. Hence we obtain . If we choose a sufficiently large depending only on n and \(D_1\), we obtain a contradiction. This proves (8.40).

Next, since \(\mathrm{spt}\,\Vert \nabla \varphi (\cdot ,t)\Vert \subset \mathrm{spt}\,\Vert V_t\Vert \) by Theorem 2.2 (2b), (8.40) shows that \(\varphi (\cdot ,t)\) is a constant function on each connected component of for a.e. \(0\le t\le T_3\). Since \(\varphi (\cdot ,t)\) is a characteristic function and is continuous in \(L^1\) norm with respect to time, one sees that

(8.44)

for all \(0\le t\le T_3\). We now estimate the location of \(\mathrm{spt}\,\Vert V_t\Vert \) during the short initial time. Since \(M_0\) is assumed to be \(C^1\), there exists \(r_1>0\) such that, for each \(x\in M_0\) (we may assume that x is the origin and \(T_x M_0={\mathbb R}^{n-1}\times \{0\}\) after parallel translation and orthogonal rotation), \(M_0\) is locally represented as a \(C^1\) graph \(g:B_{r_1}^{n-1}\rightarrow {\mathbb R}\) on \(B_{r_1}^{n-1}\times (-r_1,r_1)\). We take the coordinate system so that \(\Omega _0\) is located on the upper side, above the graph of g. We may also restrict \(r_1\) (uniformly on \(M_0\)) so that for all \(r\le r_1\), we have

$$\begin{aligned} \sup _{x\in B_{r}^{n-1}} |g(x)|\le \frac{r}{10}. \end{aligned}$$
(8.45)

For , (8.44) and (8.45) show that

$$\begin{aligned} \varphi (\cdot ,t)&=1\ \ \text{ on } B_{9r/10}^{n-1}\times [r/5,r_1),\nonumber \\ \varphi (\cdot ,t)&=0 \ \ \text{ on } B_{9r/10}^{n-1}\times (-r_1,-r/5]. \end{aligned}$$
(8.46)

Next we use [30, Theorem 8.7]. Using the notation there, corresponding to \(1\le E_1<\infty \), \(0<\nu <1\), \(p,q\) with \(1-(n-1)/p-2/q>0\), there exist four constants (\(\varepsilon _6,\sigma ,\Lambda _3,c_{19}\) in [30]) with the stated properties. Here, we use \(E_1=D_1\), \(\nu =1/2\), \(p=s\) above and the same q. The condition \(1-(n-1)/p-2/q>0\) is then satisfied. To avoid confusion in the following, we denote the constants in [30] corresponding to these choices by \(\varepsilon _{6,KT}, \sigma _{KT}, \Lambda _{3,KT},c_{19,KT}\). In the following, let \(P\in \mathbf{G}(n,n-1)\) be the projection \({\mathbb R}^n\rightarrow {\mathbb R}^{n-1}\times \{0\}\) and \(P^{\perp }\) be its orthogonal complement. We then use [30, Proposition 6.5] with

$$\begin{aligned} \Lambda = \Lambda _{3,KT}/18 \end{aligned}$$
(8.47)

to obtain \(c_{6,KT}\) with the property that

$$\begin{aligned}&\frac{1}{r^{n+1}} \int _{B_r} |P^{\perp }(x)|^2\, d\Vert V_t\Vert \le \exp (1/(4\Lambda )) \frac{1}{r^{n+1}}\int _{B_{Lr}} |P^{\perp }(x)|^2\, d\Vert V_0\Vert \nonumber \\&\quad +\,c_{6,KT}\{(r^{2\zeta } \Vert u\Vert _{L^{s,q}}^2+r^{\zeta }\Vert u\Vert _{L^{s,q}})L^2+L^{n+1}\exp (-(L-1)^2/(8\Lambda ))\}\qquad \quad \end{aligned}$$
(8.48)

for all \(t\in [0,\Lambda r^2]\) provided \(2\le L<\infty \) and \(rL\le 1\). Here \(c_{6,KT}\) depends only on \(s,q, D_1, \Lambda _{3,KT}\) but not on L. Given \(1>\varepsilon >0\), we may choose \(L\ge 2\) so that

$$\begin{aligned} c_{6,KT}L^{n+1}\exp (-(L-1)^2/(8\Lambda ))<\varepsilon \end{aligned}$$
(8.49)

and then choose \(r_2\le L^{-1}\) uniformly on \(M_0\) so that (using \(M_0\) is \(C^1\))

$$\begin{aligned}&\exp (1/(4\Lambda )) \sup _{0<r\le r_2} \frac{1}{r^{n+1}} \int _{B_{Lr}} |P^{\perp }(x)|^2\, d\Vert V_0\Vert < \varepsilon , \nonumber \\&\quad c_{6,KT}\left( r_2^{2\zeta }\Vert u\Vert _{L^{s,q}}^2+r_2^{\zeta }\Vert u\Vert _{L^{s,q}}\right) L^2<\varepsilon . \end{aligned}$$
(8.50)

The inequalities (8.48)–(8.50) give for \(r\le r_2\) and \(t\in [0,\Lambda r^2]\)

$$\begin{aligned} \frac{1}{r^{n+1}}\int _{B_r}|P^{\perp }(x)|^2\, d\Vert V_t\Vert \le 3\varepsilon . \end{aligned}$$
(8.51)

We next use [30, Proposition 6.4] on \(B_r\times [0,\Lambda r^2]\) with a slight modification. Instead of obtaining result on the time interval \([R^2/5,\Lambda ]\) as in [30], we modify the proof so that we obtain the similar estimate on the time interval . This is achieved by a simple replacement of the cut-off function. We have a different constants which depends also on . Citing the result from [30, Proposition6.4], we obtain

(8.52)

where

$$\begin{aligned} \mu ^2:= \frac{c_{5,KT}}{r^{n+3}}\int _0^{\Lambda r^2}\int _{B_r} |P^{\perp }(x)|^2\, d\Vert V_t\Vert dt +c_{2,KT} \Vert u\Vert _{L^{s,q}}^2D_1^{1-\frac{2}{s}} \Lambda ^{\zeta } r^{2\zeta } (2+\Lambda )\nonumber \\ \end{aligned}$$
(8.53)

and where \(c_{5,KT}\) and \(c_{2,KT}\) depend only on nsq and . If we restrict \(r_2\) further so that the second term of (8.53) is smaller than \(\varepsilon \), (8.51)-(8.53) with sufficiently small \(\varepsilon \) gives

(8.54)

Combining (8.46) and (8.54), and using the \(L^1\) continuity of \(\varphi (\cdot ,t)\), we obtain

$$\begin{aligned}&\varphi (\cdot ,t)=1 \ \ \text{ on } B_{4r/5}\cap \{P^{\perp }(x)\ge r/5\},\nonumber \\&\varphi (\cdot ,t)=0 \ \ \text{ on } B_{4r/5}\cap \{P^{\perp }(x)\le -r/5\} \end{aligned}$$
(8.55)

for \(t\in [0,\Lambda r^2]\). Since \(B_{r/2}^{n-1}\times [-r/2,r/2]\subset B_{4r/5}\), (8.55) shows

$$\begin{aligned}&\varphi (\cdot , t)=1 \ \ \text{ on } B_{r/2}^{n-1}\times [r/5,r/2],\nonumber \\&\varphi (\cdot , t)=0 \ \ \text{ on } B_{r/2}^{n-1}\times [-r/2,-r/5],\nonumber \\&\mathrm{spt}\, \Vert V_t\Vert \cap (B_{r/2}^{n-1}\times [-r/2,r/2])\subset B_{r/2}^{n-1}\times [-r/5,r/5] \end{aligned}$$
(8.56)

for \(t\in [0,\Lambda r^2]\) and \(r\le r_2\). At this point, because of the third claim of (8.56), by setting \(V_t=0\) on \(B_{r/2}^{n-1}\times ({\mathbb R}{\setminus } [-r/2,r/2])\), we may assume that \(\{V_t\}_{0\le t\le \Lambda r^2}\) satisfies (2.10) on \((B_{r/2}^{n-1} \times {\mathbb R})\times [0,\Lambda r^2]\). We next want to apply [30, Theorem 8.7] with \(R:=r/6\). For the application, we need to check the conditions (8.83)–(8.86) of [30]. The first condition (8.83), the smallness of space-time \(L^2\)-height may be achieved due to (8.51), (8.56) and by restricting \(\varepsilon \) depending on \(\varepsilon _{6,KT}\) and \(\Lambda _{3,KT}\). The second condition (8.84), the smallness of \(\Vert u\Vert \), may be achieved by simply restricting \(r_2\). Thus we need to check the last two conditions, (8.85) and (8.86) of [30]. Let \(\phi _{P,R}\) and \(\mathbf{c}\) be defined as in [30, Definition 5.1]. We need to show that (recall that we have set \(\nu =1/2\))

$$\begin{aligned} \exists t_1\in (3R^2 /2,2R^2 )\ \ : \ \ R^{-(n-1)} \Vert V_{t_1}\Vert (\phi _{P,R}^2)<\frac{3}{2} \mathbf{c} \end{aligned}$$
(8.57)

and

$$\begin{aligned} \exists t_2\in ((2 \Lambda _{3,KT}-2)R^2 , (2\Lambda _{3,KT}-3/2)R^2 ) \ \ :\ \ R^{-(n-1)}\Vert V_{t_2}\Vert (\phi _{P,R}^2)>\frac{1}{2} \mathbf{c}.\nonumber \\ \end{aligned}$$
(8.58)

First we show (8.57). Since \(M_0\) is \(C^1\), we may restrict \(r_2\) uniformly in x so that for all \(R=r/6\le r_2/6\), we have

$$\begin{aligned} R^{-(n-1)}\Vert V_0\Vert (\phi _{P,R}^2)\le R^{-(n-1)} \int _{P} \phi _{P,R}^2\, d{\mathcal H}^{n-1}+\frac{1}{10} \mathbf{c} =\frac{11}{10}\mathbf{c}. \end{aligned}$$
(8.59)

By (2.10), we have for \(t_1\in (3R^2/2,2R^2)\)

$$\begin{aligned} \Vert V_{t}\Vert (\phi _{P,R}^2)\Big |_{t=0}^{t_1}\le \int _0^{t_1}\int (-h\phi _{P,R}^2+\nabla \phi _{P,R}^2)\cdot (h+(u\cdot \nu )\nu )\, d\Vert V_t\Vert dt.\qquad \end{aligned}$$
(8.60)

By (2.5) and (2.6), we may replace \(\nabla \phi _{P,R}^2\) by \(S^{\perp }(\nabla \phi _{P,R}^2)\) for \(\Vert V_t\Vert \) a.e., where S is the approximate tangent space at the point. Since \(\nabla \phi _{P,R} =P(\nabla \phi _{P,R})\) (note \(\phi _{P,R}(x)=\phi _{P,R}(P(x))\) by definition), we have

$$\begin{aligned} S^{\perp }(\nabla \phi _{P,R}^2)=(I-S)\circ (P(\nabla \phi _{P,R}^2))=(P-S)\circ (P(\nabla \phi _{P,R}^2)). \end{aligned}$$
(8.61)

Thus, by applying the Cauchy-Schwarz inequality to (8.60) and using (8.61), we obtain

$$\begin{aligned} \Vert V_{t}\Vert (\phi _{P,R}^2)\Big |_{t=0}^{t_1}&\le \int _0^{t_1} \int -\frac{1}{2} |h|^2\phi _{P,R}^2+2|u|^2\phi _{P,R}^2 \nonumber \\&\quad +8\Vert S-P\Vert ^2 |\nabla \phi _{P,R}|^2\, dV_t(\cdot ,S)dt. \end{aligned}$$
(8.62)
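For the reader's convenience, here is a sketch of one way to arrive at (8.62) pointwise; the particular weights in Young's inequality are our choice for illustration and need not be the ones intended. Write \(\phi =\phi _{P,R}\ge 0\), and assume that \(\Vert \cdot \Vert \) dominates the operator norm, so that (8.61) gives \(|S^{\perp }(\nabla \phi ^2)|\le 2\phi \Vert S-P\Vert \, |\nabla \phi |\). Since h and \((u\cdot \nu )\nu \) are perpendicular to S for \(\Vert V_t\Vert \) a.e. point, the integrand of (8.60) is bounded by

$$\begin{aligned} -|h|^2\phi ^2+|h||u|\phi ^2+2\phi \Vert S-P\Vert \, |\nabla \phi |\,(|h|+|u|), \end{aligned}$$

and the three mixed terms may be estimated as

$$\begin{aligned} |h||u|\phi ^2&\le \frac{1}{4}|h|^2\phi ^2+|u|^2\phi ^2,\\ 2\phi |h|\,\Vert S-P\Vert \, |\nabla \phi |&\le \frac{1}{4}|h|^2\phi ^2+4\Vert S-P\Vert ^2|\nabla \phi |^2,\\ 2\phi |u|\,\Vert S-P\Vert \, |\nabla \phi |&\le |u|^2\phi ^2+\Vert S-P\Vert ^2|\nabla \phi |^2. \end{aligned}$$

Summing up, the integrand is at most \(-\frac{1}{2}|h|^2\phi ^2+2|u|^2\phi ^2+5\Vert S-P\Vert ^2|\nabla \phi |^2\), which is bounded by the integrand on the right-hand side of (8.62).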

The first term on the right-hand side of (8.62) can be dropped. The second term can be estimated using the Hölder inequality as

$$\begin{aligned} \int _0^{t_1}\int 2|u|^2\phi _{P,R}^2\, d\Vert V_t\Vert dt&\le \int _0^{t_1}\left( \int |u|^s\, d\Vert V_t\Vert \right) ^{\frac{2}{s}}\,dt \cdot \sup _{t\in [0,t_1]}\Vert V_t\Vert (\phi _{P,R}^2)^{1-\frac{2}{s}} \nonumber \\&\le \Vert u\Vert _{L^{s,q}}^2 t_1^{1-\frac{2}{q}}\cdot \sup _{t\in [0,t_1]}\Vert V_t\Vert (\phi _{P,R}^2)^{1-\frac{2}{s}}. \end{aligned}$$
(8.63)

Due to the third claim of (8.56), \(\mathrm{spt}\, \Vert V_t\Vert \cap \mathrm{spt}\, \phi _{P,R}\subset B_{3R}\), for example. Thus we have \(\Vert V_t\Vert (\phi _{P,R}^2)\le D_1 \omega _{n-1}(3R)^{n-1}\). Since \(t_1\le 2R^2\), we obtain from (8.63)

$$\begin{aligned} \int _0^{t_1}\int 2|u|^2\phi _{P,R}^2\, d\Vert V_t\Vert dt\le c(D_1,n,s,q) \Vert u\Vert _{L^{s,q}}^2 R^{n-1+2\zeta }. \end{aligned}$$
(8.64)
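The power of R in (8.64) can be verified by a direct count of exponents; the following lines only unpack (8.63). First, the Hölder inequality in time with exponents \(\frac{q}{2}\) and \(\frac{q}{q-2}\) gives

$$\begin{aligned} \int _0^{t_1}\left( \int |u|^s\, d\Vert V_t\Vert \right) ^{\frac{2}{s}}dt\le \left( \int _0^{t_1}\left( \int |u|^s\, d\Vert V_t\Vert \right) ^{\frac{q}{s}}dt\right) ^{\frac{2}{q}}t_1^{1-\frac{2}{q}}\le \Vert u\Vert _{L^{s,q}}^2\, t_1^{1-\frac{2}{q}}. \end{aligned}$$

Then, using \(t_1\le 2R^2\) and \(\Vert V_t\Vert (\phi _{P,R}^2)\le D_1\omega _{n-1}(3R)^{n-1}\),

$$\begin{aligned} t_1^{1-\frac{2}{q}}\big (D_1\omega _{n-1}(3R)^{n-1}\big )^{1-\frac{2}{s}}\le c(D_1,n,s,q)\, R^{2-\frac{4}{q}+(n-1)-\frac{2(n-1)}{s}}=c(D_1,n,s,q)\, R^{n-1+2\zeta }, \end{aligned}$$

since \(2\zeta =2-\frac{2(n-1)}{s}-\frac{4}{q}\).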

For the third term of (8.62), we use [30, Lemma 11.2] (or [1, 8.13]), namely, for \(\phi =\phi _{P,R}\)

$$\begin{aligned}&\int \Vert S-P\Vert ^2\, |\nabla \phi |^2\, dV_t(\cdot ,S)\le 16 \int |P^{\perp }(x)|^2|\nabla |\nabla \phi ||^2\, d\Vert V_t\Vert \nonumber \\&\quad + 4\big (\int |h|^2|\nabla \phi |^2\, d\Vert V_t\Vert \big )^{\frac{1}{2}}\big (\int |P^{\perp }(x)|^2 |\nabla \phi |^2\,d\Vert V_t\Vert \big )^{\frac{1}{2}}. \end{aligned}$$
(8.65)

By repeating a similar argument leading to (8.62) with slightly larger test function which is 1 on \(\mathrm{spt}\, \phi _{P,R}\), one can obtain

$$\begin{aligned} \int _0^{t_1}\int |h|^2 |\nabla \phi _{P,R}|^2\, d\Vert V_t\Vert dt\le c(n) R^{n-3}. \end{aligned}$$
(8.66)

Since we have \(\mathrm{spt}\,\Vert V_t\Vert \cap \mathrm{spt}\,\phi _{P,R}\subset B_{3R}\) and by (8.51), we obtain

$$\begin{aligned} \int _0^{t_1}\int |P^{\perp }(x)|^2 |\nabla \phi _{P,R}|^2\, d\Vert V_t\Vert dt\le 3\varepsilon (3R)^{n+1} t_1 \sup |\nabla \phi _{P,R}|^2 \le c(n)\varepsilon R^{n+1}.\nonumber \\ \end{aligned}$$
(8.67)

Thus, by (8.66) and (8.67) and similarly estimating the last term, we obtain from (8.65) that

$$\begin{aligned} \int _0^{t_1}\int \Vert S-P\Vert ^2|\nabla \phi _{P,R}|^2\, dV_t(\cdot ,S)dt\le c(n)(\sqrt{\varepsilon }+\varepsilon )R^{n-1}. \end{aligned}$$
(8.68)
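The passage from (8.65) to (8.68) may be sketched as follows. We assume the scaling bounds \(\sup |\nabla \phi _{P,R}|\le c(n)R^{-1}\) and \(\sup |\nabla |\nabla \phi _{P,R}||\le c(n)R^{-2}\), which are natural for a cut-off at scale R but are not restated here from [30, Definition 5.1]. Integrating (8.65) over \([0,t_1]\) and arguing as in (8.67), the first term on the right-hand side is bounded by

$$\begin{aligned} 16\int _0^{t_1}\int |P^{\perp }(x)|^2|\nabla |\nabla \phi _{P,R}||^2\, d\Vert V_t\Vert dt\le c(n)R^{-4}\cdot 3\varepsilon (3R)^{n+1}t_1\le c(n)\varepsilon R^{n-1}, \end{aligned}$$

while the Cauchy-Schwarz inequality in time together with (8.66) and (8.67) bounds the second term by

$$\begin{aligned} 4\big (c(n)R^{n-3}\big )^{\frac{1}{2}}\big (c(n)\varepsilon R^{n+1}\big )^{\frac{1}{2}}=c(n)\sqrt{\varepsilon }\, R^{n-1}. \end{aligned}$$

Adding the two bounds gives the right-hand side of (8.68).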

Combining (8.59), (8.62), (8.64) and (8.68), we obtain

$$\begin{aligned} R^{-(n-1)}\Vert V_{t_1}\Vert (\phi _{P,R}^2)\le \frac{11}{10}\mathbf{c}+c(D_1,n,s,q)\Vert u\Vert _{L^{s,q}}^2 R^{2\zeta }+c(n)(\sqrt{\varepsilon } +\varepsilon ).\quad \quad \end{aligned}$$
(8.69)

Thus, by restricting \(r<r_2\) and \(\varepsilon \) in (8.69), we can guarantee that (8.57) holds. To see (8.58) holds, we use the first two claims of (8.56). Due to the unit density property, recall that for a.e. t, we have \(\Vert V_t\Vert =\Vert \nabla \{\varphi (\cdot ,t)=1\}\Vert ={\mathcal H}^{n-1}\lfloor _{\partial ^*\{\varphi (\cdot ,t)=1\}}\), where \(\partial ^{*}A\) denotes the reduced boundary of A (see [24]). Let \(\nu _n\) be the \(x_n\) component of the inward pointing unit normal vector of \(\partial ^*\{\varphi (\cdot ,t)=1\}\). We apply the generalized divergence theorem valid for sets of finite perimeter, in this case, \(\{\varphi (\cdot ,t)=1\}\cap \{x_n\le r/3\}\). Then we have for a.e. \(t\in [0,\Lambda r^2]\)

$$\begin{aligned}&\int \phi _{P,R}^2\, d\Vert V_t\Vert \ge \int _{\partial ^*\{\varphi (\cdot ,t)=1\}} \nu _n \phi _{P,R}^2\, d{\mathcal H}^{n-1} \nonumber \\&\quad =-\int _{\{\varphi (\cdot ,t)=1\}\cap \{x_n\le r/3\}}\partial _{x_n}\phi _{P,R}^2\, dx+\int _{\{x_n=r/3\}} \phi _{P,R}^2\, d{\mathcal H}^{n-1}=R^{n-1}\mathbf{c}\qquad \quad \end{aligned}$$
(8.70)

since \(\phi _{P,R}^2\) does not depend on \(x_n\) and by the definition of \(\mathbf{c}\). In particular, we have proved (8.58). Now we are ready to apply [30, Theorem 8.7]. For all sufficiently small \(\varepsilon >0\), we have seen that we may choose \(r_2\) independent of \(x\in M_0\) such that all the assumptions of [30, Theorem 8.7] hold on \((B_{r/2}^{n-1}\times {\mathbb R})\times [0,\Lambda r^2]\) for all \(r\le r_2\). The conclusion is that in \(B_{\sigma _{KT} R}^{n-1}\times {\mathbb R}\) and for \(t\in ( (\Lambda _{3,KT}-1/4)R^2,(\Lambda _{3,KT}+1/4)R^2)\), \(\mathrm{spt}\,\Vert V_t\Vert \) is represented as the graph of a \(C^{1,\zeta }\) function \(F(\cdot ,t)\) which is \(C^{(1+\zeta )/2}\) in time, with \(|\nabla F|+R^{-1}|F|\) bounded by a constant multiple of \(\varepsilon \) (see (8.89) of [30]). The argument up to this point can be carried out uniformly for each point \(x\in M_0\), and \(\mathrm{spt}\,\Vert V_t\Vert \) can be covered by such graphs. This shows that for all small \(t>0\), \(\mathrm{spt}\, \Vert V_t\Vert \) is \(C^{1,\zeta }\) everywhere. We have the local graph representation as claimed in (4) and \(t^{-1/2}\mathrm{dist}\, (\mathrm{spt}\,\Vert V_t\Vert , M_0) \rightarrow 0\) as \(t\rightarrow 0\). It is possible that \(\mathrm{spt}\,\Vert V_t\Vert \) remains \(C^{1,\zeta }\) for some more time, and we let \(T_2\) be the maximal time without non-\(C^{1,\zeta }\) regular point. In case that u is \(\alpha \)-Hölder continuous, the regularity criteria are the same (see [46, Theorem 3.6]) except that the constant corresponding to \(\varepsilon _{6,KT}\) may need to be smaller there. Thus, in this case, there is a short initial time interval such that \(\mathrm{spt}\,\Vert V_t\Vert \) is a \(C^{2,\alpha }\) hypersurface. This ends the proof of (2) and (4). \(\square \)

9 Final remarks

9.1 Non-uniqueness

The solution may be non-unique without having singularities of \(M_t\), as a simple example demonstrates. An example such as \(M_0=\{x_2=0\}\subset {\mathbb T}^2\) and \(u(x_1,x_2)=(0,\sqrt{|x_2|})\in (W^{1,p}({\mathbb T}^2))^2\) (\(p<2\)) has an obvious ODE-level non-uniqueness. Thus, on top of the non-uniqueness issues generally associated with singularity occurrences of the MCF, one has a far richer source of possible non-uniqueness with irregular u, even though we have a local regularity theory. It is interesting to investigate how generically uniqueness may hold for the flow in this paper with respect to the initial data and the transport term. We mention that there is a nice generic property for the MCF besides the existence of a unique viscosity solution. If \(M_0\) is \(C^2\) and \(d_0\) is the signed distance function to \(M_0\), then the viscosity solution for the MCF starting from \(\{d_0=s\}\) in the sense of [11, 16] is a unit density Brakke MCF for a.e. \(s\in (-r_0,r_0)\), where \(r_0>0\) is some small number depending on \(M_0\) [18]. In particular, the phenomenon called fattening does not occur for such level sets. It is interesting to see if there is some generalization of this type to the setting of this paper.
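To make the ODE-level non-uniqueness in the example above explicit, one may look for solutions of (1.2) of the form \(M_t=\{x_2=y(t)\}\). This is only a sketch: it considers a neighborhood of \(\{x_2=0\}\) and a short time interval, and ignores the modification of u away from \(\{x_2=0\}\) needed for periodicity on \({\mathbb T}^2\). Such \(M_t\) is flat, so \(h=0\), and with \(\nu =(0,1)\) the equation (1.2) reduces to

$$\begin{aligned} y'(t)=\sqrt{|y(t)|},\qquad y(0)=0. \end{aligned}$$

Besides \(y\equiv 0\), for every \(a\ge 0\) the function

$$\begin{aligned} y_a(t)=0 \ \ \text{ for } 0\le t\le a,\qquad y_a(t)=\frac{(t-a)^2}{4}\ \ \text{ for } t>a \end{aligned}$$

is a \(C^1\) solution, since \(y_a'(t)=\frac{t-a}{2}=\sqrt{y_a(t)}\) for \(t>a\). Hence there is a one-parameter family of smooth flows starting from the same \(M_0\). Note also that \(|\nabla u|\) behaves like \(|x_2|^{-1/2}\) near \(\{x_2=0\}\), which is \(L^p\) integrable precisely when \(p<2\), in agreement with the stated range of p.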

9.2 Structure of singularities

There has been intensive effort to understand the nature of singularities for the MCF in recent years. A particular emphasis has been placed on the mean convex flow, and we mention the names of Andrews, Huisken, Sinestrari and White, who analyzed the structure of singularities in depth. We mention a recent work by Haslhofer and Kleiner [25] for a streamlined treatment of the regularity theory of mean convex flows as well as up-to-date references. Note that many of the techniques used by White, such as the dimension reducing and stratification of singularities [47], may be used for the flow in this paper. While there may be some limitations compared to the mean convex flow, it is an interesting and challenging problem to investigate the singularities in the setting of the present paper.