1 Introduction

1.1 Modeling Background

It has long been established that body temperature is an indicator of health. Abnormalities in local body surface temperature have been recognized as a sign of disease for centuries. The modeling of heat related phenomena such as bio heat transfer is of great importance for the development of biomedical technologies, such as thermotherapy for treating tumors and injuries involving skin tissue. Heating to temperatures higher than those required to treat the diseased tissue can result in inadmissible damage to the adjoining healthy regions, and insufficient heating can lead to under-treatment. Among the many bio heat transfer models, the Pennes [27] bio heat model is the most commonly used, owing to its simplicity and validity. The Pennes bio heat transfer equation is based on the classical Fourier’s law, and it therefore assumes that any thermal disturbance produced at a certain location is felt throughout the medium at that instant. In fact, biological tissue, along with a number of other common materials, exhibits a relatively long thermal lag time (e.g., [7, 22, 23, 33]). Because of this relaxation time, heat conduction in biological media is generally not described by Fourier’s law, but rather by the Maxwell–Cattaneo law, known as the thermal wave model [10, 11]. Although the Maxwell–Cattaneo model accounts for the thermal relaxation time, the validity of the thermal wave model becomes debatable in view of the fast-transient response with microstructural interaction effects [32]. In order to consider the effect of micro-structural interaction in the fast transient process of heat transport, a phase lag for the temperature gradient, \(\tau _T,\) which is absent in the Maxwell–Cattaneo model, has been introduced in [20, 32, 34, 38, 39]. The corresponding model is called the dual-phase-lag (DPL) model. Mathematically, the DPL model is described by a time-dependent equation [38, 39]

$$\begin{aligned} \tau _q\rho c\frac{\partial ^2 T}{\partial t^2}= & {} k\nabla ^2 T+\tau _Tk\nabla ^2\frac{\partial T}{\partial t}-\omega _b\rho _b c_bT-(\tau _q\omega _b\rho _b c_b+\rho c)\frac{\partial T}{\partial t}\nonumber \\&+\big (\omega _b\rho _b c_bT_a+q_{\text{ met }}+q_{\text{ ext }}+\tau _q\frac{\partial q_{\text{ met }}}{\partial t}+\tau _q\frac{\partial q_{\text{ ext }}}{\partial t}\big ) \end{aligned}$$
(1.1)

where \(\rho , c, k \) are the density, specific heat and thermal conductivity of skin tissue, respectively; \(\rho _b, c_b\) are the density and specific heat of blood, \(\omega _b\) is the blood perfusion rate; \(T_a\) and T are the temperatures of arterial blood and skin tissue respectively; \(q_{\text{ met }}\) is the metabolic heat generation in the skin tissue and \(q_{\text{ ext }}\) is the heat source due to external heating, and \(\tau _q\) is defined as the thermal relaxation time.

1.2 Basic Notations

Throughout the work, we will follow the usual notation for Sobolev spaces and norms. For any domain \(\mathcal{M}\subset \Omega \subset \mathbb {R}^2,\) each integer \(k\ge 0\) and real p with \(1\le p \le \infty \), \(W^{k,p}(\mathcal{M})\) denotes the standard Sobolev space of functions with their weak derivatives of order up to k in the Lebesgue space \(L^p(\mathcal{M})\). When \(p=2\), we write \(H^k(\mathcal{M})\) for \(W^{k,2}(\mathcal M)\). We use \(\Vert \cdot \Vert _{s, \mathcal{M}}\) and \(|\cdot |_{s, \mathcal{M}}\) to denote the norm and seminorm in the Sobolev space \(H^s(\mathcal{M})\) for any \(s\ge 1\), respectively. The inner product in \(H^s(\mathcal{M})\) is denoted by \((\cdot , \cdot )_{s, \mathcal{M}}\). The space \(H^0(\mathcal{M})\) coincides with \(L^2(\mathcal{M})\), for which the norm and the inner product are denoted by \(\Vert \cdot \Vert _\mathcal{M}\) and \((\cdot , \cdot )_\mathcal{M}\), respectively. For simplicity of notation, we omit the subscript \(\mathcal{M}\) in the norm and inner product notation when \(\mathcal{M}=\Omega .\) The space \(H_{0}^{1}(\Omega )\) is a closed subspace of \(H^1(\Omega )\), namely the closure of \(C_0^{\infty } (\Omega )\) (the set of all \(C^{\infty }\) functions with compact support in \(\Omega \)) with respect to the norm of \(H^1(\Omega )\) (cf. [1]).

We also define the standard Bochner spaces \(L^2(J;\mathcal{B})\) and \(L^{\infty }(J; \mathcal{B})\), where \(\mathcal B\) is a real Banach space with norm \(\Vert \cdot \Vert _\mathcal{B}\) and \(J=[0,T]\), consisting of all measurable functions \(\phi :J\rightarrow \mathcal{B}\) for which

$$\begin{aligned} \Vert \phi \Vert _{L^2(J; \mathcal{B})}:= & {} \Big (\int _0^T\Vert \phi (t)\Vert _\mathcal{B}^2dt\Big )^{\frac{1}{2}}< \infty \;\;\text{ and }\\ \Vert \phi \Vert _{L^{\infty }(J; \mathcal{B})}:= & {} \text{ ess }\sup _{t\in [0,T]}\Vert \phi (t)\Vert _\mathcal{B}< \infty , \end{aligned}$$

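respectively. We denote by \(H^{m}(J;\mathcal{B})\), \(1\le m <\infty \), the space of all measurable functions \(\phi :J\rightarrow \mathcal{B}\) for which

$$\begin{aligned} \Vert \phi \Vert _{H^m(J; \mathcal{B})}:=\Big (\sum _{j=0}^{m}\int _0^T\Big \Vert \frac{\partial ^j\phi (t)}{\partial t^j}\Big \Vert _\mathcal{B}^2dt\Big )^{\frac{1}{2}}< \infty . \end{aligned}$$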

For our notational convenience, we will be using \(\frac{\partial \phi }{\partial t}\) or \(\phi _t\) or \(\phi ^{\prime }\) interchangeably to denote time differentiation of \(\phi \). Similar remarks hold for other higher order time derivatives.

When no risk of confusion exists, we shall write \(L^{2}(\mathcal{B})\) for \(L^{2}(J;\mathcal{B})\), \(L^{\infty }(\mathcal{B})\) for \(L^{\infty }(J;\mathcal{B})\) and \(H^{m}(\mathcal{B})\) for \(H^{m}(J;\mathcal{B})\).
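As an illustration of the Bochner norms defined above, the following minimal sketch approximates \(\Vert \phi \Vert _{L^2(J;\mathcal{B})}\) and \(\Vert \phi \Vert _{L^{\infty }(J;\mathcal{B})}\) from time samples. It assumes that the spatial norms \(\Vert \phi (t_n)\Vert _\mathcal{B}\) have already been computed elsewhere; the function name `bochner_norms`, the trapezoidal quadrature and the replacement of the essential supremum by a maximum over samples are choices made for this example only.

```python
import numpy as np

def bochner_norms(times, space_norms):
    """Approximate the L^2(J;B) and L^inf(J;B) norms from samples.

    times       -- increasing sample times t_0 < ... < t_N in J = [0, T]
    space_norms -- values ||phi(t_n)||_B at those times (computed elsewhere)
    """
    t = np.asarray(times, dtype=float)
    g = np.asarray(space_norms, dtype=float)
    # Composite trapezoidal rule for the integral of ||phi(t)||_B^2 over J.
    sq = g ** 2
    integral = np.sum(0.5 * (sq[1:] + sq[:-1]) * np.diff(t))
    l2_norm = np.sqrt(integral)
    # The essential supremum is replaced by the maximum over the samples.
    linf_norm = np.max(g)
    return l2_norm, linf_norm

# Example: ||phi(t)||_B = t on J = [0, 1]; the exact norms are 1/sqrt(3) and 1.
t = np.linspace(0.0, 1.0, 1001)
l2, linf = bochner_norms(t, t)
```

The sampled maximum only bounds the essential supremum from below, so a finer time grid may be needed when \(\Vert \phi (t)\Vert _\mathcal{B}\) varies rapidly.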

Fig. 1 Domain \(\Omega \) and its subdomains \(\Omega _1\), \(\Omega _2\) with interface \(\Gamma \)

1.3 Problem Description

The goal of the present work is to study the following general linear second order hyperbolic equation

$$\begin{aligned} u^{\prime \prime }+\sigma u^\prime +\delta u-\nabla .(\epsilon \nabla u^\prime +\beta \nabla u)=f(x,t)\;\;\; \text{ in }\;\Omega \times (0,T],\; T<\infty \end{aligned}$$
(1.2)

with initial and boundary conditions

$$ \begin{aligned} u(x,0)=u_0,\;\;u^\prime (x,0)=v_0\;\;\text{ in }\; \Omega \;\; \& \;\; u(x,t)=0 \;\;\;\text{ on } ~\partial \Omega \times (0,T] \end{aligned}$$
(1.3)

where \(\Omega \) is a convex polygonal domain in \(\mathbb {R}^2\) with a Lipschitz boundary \(\partial \Omega \). Here, \(\sigma =\sigma (x),\;\delta =\delta (x),\;\epsilon =\epsilon (x),\;\beta =\beta (x)\) are non-negative real valued functions defined on \(\Omega \) and f denotes the source. In this work, it is implicitly assumed that initial data \((u_0, v_0)\) and the source function f are sufficiently smooth so that solution belongs to desired Sobolev spaces.

In realistic applications, the underlying medium is often heterogeneous. In particular, media parameters may have jump discontinuities across interfaces in the domain of interest. As a model, we consider the DPL bio heat transfer model (cf. [14, 20, 23, 33, 34, 38, 39] and references therein) in multi-layered media. Since the thermal properties of biological media vary between different layers, heterogeneity of the underlying media arises naturally. Of special interest is the case when the domain \(\Omega \) consists of two open subdomains \(\Omega _1\) and \(\Omega _2\) with a \(C^2\) smooth interface \(\Gamma \), and the physical coefficients are discontinuous and piecewise constant in \(\Omega \) (see Fig. 1). We write

$$\begin{aligned} (\sigma , \delta , \epsilon , \beta )=\left\{ \begin{array}{ll} (\sigma _1,\delta _1,\epsilon _1,\beta _1) &{} \hbox {in}\quad \Omega _1, \\ (\sigma _2, \delta _2, \epsilon _2, \beta _2) &{} \hbox {in}\quad \Omega _2. \end{array} \right. \end{aligned}$$
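One possible way to relate the DPL equation (1.1) to the general form (1.2) within a single homogeneous layer is to divide (1.1) by \(\tau _q\rho c\), assuming that k is constant in that layer. With \(u=T\), this gives \(\sigma =(\tau _q\omega _b\rho _b c_b+\rho c)/(\tau _q\rho c)\), \(\delta =\omega _b\rho _b c_b/(\tau _q\rho c)\), \(\epsilon =\tau _Tk/(\tau _q\rho c)\) and \(\beta =k/(\tau _q\rho c)\). The sketch below evaluates these expressions; the names `Layer` and `dpl_to_general` are hypothetical, and the numerical values are non-physical placeholders chosen only to make the arithmetic easy to check.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    rho: float    # tissue density
    c: float      # tissue specific heat
    k: float      # tissue thermal conductivity (assumed constant in the layer)
    tau_q: float  # thermal relaxation time
    tau_T: float  # phase lag of the temperature gradient
    w_b: float    # blood perfusion rate
    rho_b: float  # blood density
    c_b: float    # blood specific heat

def dpl_to_general(layer):
    """Coefficients of (1.2) obtained by dividing (1.1) by tau_q * rho * c."""
    scale = layer.tau_q * layer.rho * layer.c
    perfusion = layer.w_b * layer.rho_b * layer.c_b
    return {
        "sigma": (layer.tau_q * perfusion + layer.rho * layer.c) / scale,
        "delta": perfusion / scale,
        "epsilon": layer.tau_T * layer.k / scale,
        "beta": layer.k / scale,
    }

# Placeholder (non-physical) values, used only to illustrate the mapping.
layer = Layer(rho=1.0, c=1.0, k=1.0, tau_q=2.0, tau_T=0.5,
              w_b=0.1, rho_b=1.0, c_b=1.0)
coeffs = dpl_to_general(layer)
```

When \(\tau _q\rho c\) differs between layers, this per-layer normalization also rescales the flux appearing in the interface condition, so the identification should be read as a single-layer illustration rather than as the scaling used in this work.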

The problem (1.2)–(1.3) is completed with the following physical interface conditions (cf. [20])

$$\begin{aligned}{}[u]=0,\;\;\Biggl [\epsilon (x){\partial u^\prime \over {\partial \mathbf{n}}}+\beta (x){\partial u \over {\partial \mathbf{n}}}\Biggr ] =0\;\;\;\text{ along }\; \Gamma \times [0,T], \end{aligned}$$
(1.4)

where \([u]=u_1|_\Gamma -u_2|_\Gamma \) and \( \Bigl [\epsilon (x){\partial u^\prime \over {\partial \mathbf{n}}}+\beta (x){\partial u \over {\partial \mathbf{n}}}\Bigr ]= \epsilon _1{\partial u^\prime _1\over {\partial \mathbf{n}_1}}+\epsilon _2{\partial u^\prime _2\over {\partial \mathbf{n}_2}} +\beta _1{\partial u_1\over {\partial \mathbf{n}_1}}+\beta _2{\partial u_2\over {\partial \mathbf{n}_2}}\) on \( \Gamma \). Here \(u_i\) stands for the restriction of u to \(\Omega _i\) and \({\partial \over \partial \mathbf{n}_i}\) denotes the outer normal derivative with respect to \(\Omega _i,\;i=1,2.\) The present work regards the temperature and the heat flux at the interface of two regions as continuous. In other words, the heat contact resistance at the interface between the two different media is neglected.

Interface problems are frequently encountered in scientific computing and many applied sciences. Typical examples are the elliptic, parabolic and hyperbolic equations with discontinuous coefficients. Due to the practical relevance of interface problems in many engineering and industrial applications, numerical methods for interface problems have been investigated widely. The finite element method (FEM) is an important class of approaches for interface problems, and a wide variety of FEM approaches have been proposed in the literature. Classical finite element methods for interface problems are mainly based on interface-fitted discretizations. The performance of such interface-fitted FEMs depends on the quality of the underlying finite element partition and on how well the interface is resolved by the finite element mesh (cf. [17]). In this work, a fitted finite element method is proposed and analyzed for the interface problem (1.2)–(1.4). The main contribution of the current work is to derive the optimal order of convergence of the finite element solution of the BVP (1.2)–(1.4) in the \(L^{\infty }\)-in-time/\(L^2\)-in-space norm. The fully discrete scheme can be reinterpreted as the Crank–Nicolson discretization of the reformulation of the governing equation as a first-order system, as in Baker [4]. The derivation of the a priori error bound depends heavily on the approximation properties (cf. Lemma 3.6) of a newly introduced non-standard projection operator along with some new analytical tools and techniques, including a \(\lambda \)-strip argument for quantifying the error near the interface in terms of the mismatch parameter \(\lambda \). There is plenty of literature available on the numerical study of the DPL bio heat model with discontinuous coefficients; see [20, 33, 34, 39] and references therein.
However, to the best of our knowledge, finite element analysis for the general linear second order hyperbolic equation with discontinuous coefficients has not been studied earlier. In this work, we provide both a mathematical and a numerical framework for the study of the BVP (1.2)–(1.4). Convergence analysis, without the interface, for the general linear second order hyperbolic equation via finite element algorithms has been well studied in the literature (cf. [5, 13, 16, 26], to name just a few). More recently, the spatial discretization of Westervelt’s quasi-linear strongly damped wave equation by piecewise linear finite elements has been discussed in [25]. The a priori error analysis in [25] depends heavily on a general linear wave model with time dependent coefficients. Optimal convergence in the \(L^{\infty }(L^2)\) norm is obtained for sufficiently smooth solutions. Fully discrete error analysis is still open for such problems. Our results are intended to enhance the numerical analysis of strongly damped linear wave equations where the physical domain consists of heterogeneous media.
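The first-order reformulation with Crank–Nicolson time stepping mentioned above can be sketched generically as follows. After a spatial discretization, (1.2) takes the matrix form \(Mv^\prime +Cv+Ku=F\), \(u^\prime =v\), where, as an assumption of this example, M stands for the mass matrix, C for the matrix of \(\mathcal{B}_\sigma +\mathcal{A}_\epsilon \) and K for the matrix of \(\mathcal{B}_\delta +\mathcal{A}_\beta \). The function name `crank_nicolson_step` is hypothetical, and the sketch is not claimed to coincide with the fully discrete scheme analyzed in Sect. 5.

```python
import numpy as np

def crank_nicolson_step(M, C, K, u, v, F_mid, dt):
    """One Crank-Nicolson step for M v' + C v + K u = F, u' = v.

    F_mid is the load vector evaluated at the midpoint of the time step.
    """
    # Eliminate u_new = u + dt/2 (v_new + v) from the averaged equation.
    lhs = M + 0.5 * dt * C + 0.25 * dt**2 * K
    rhs = (M - 0.5 * dt * C - 0.25 * dt**2 * K) @ v - dt * (K @ u) + dt * F_mid
    v_new = np.linalg.solve(lhs, rhs)
    u_new = u + 0.5 * dt * (v_new + v)
    return u_new, v_new

# Scalar check: u'' + u = 0 (no damping), for which u^2 + v^2 is conserved.
M = np.array([[1.0]])
C = np.array([[0.0]])
K = np.array([[1.0]])
u = np.array([1.0])
v = np.array([0.0])
for _ in range(100):
    u, v = crank_nicolson_step(M, C, K, u, v, np.zeros(1), 0.1)
energy = float(u @ u + v @ v)
```

In the undamped scalar case the update is a rotation in the (u, v) plane, which is why the discrete energy stays at its initial value up to rounding.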

The rest of the paper is organized as follows. In Sect. 2, we discuss the existence, uniqueness and regularity for the solution to the interface problem. Finite element discretization and some important theoretical results to be used in this article are discussed in Sect. 3. Section 4 is devoted to the error estimates for the semidiscrete scheme. The error analysis for the fully discrete scheme is presented in Sect. 5. Section 6 focuses on numerical examples. Finally, results are summarized in Sect. 7 with a brief outline on future work.

2 Preliminaries

This section is devoted to existence, uniqueness and regularity for the solutions to the model interface problem (1.2)–(1.4) in a convex polygonal domain \(\Omega \subset \mathbb {R}^2\) with a Lipschitz boundary \(\partial \Omega \). The solution to the interface problem has a higher regularity in each individual region than in the entire domain. This regularity result is critical for our further numerical analysis.

For the sake of brevity, we write \(W=L^2(\Omega )\), \(V=H_0^1(\Omega )\) with its dual space \(V^{\prime }=H^{-1}(\Omega )\) and \(\mathcal{X}=L^2(\Omega )\cap H^1(\Omega _1)\cap H^1(\Omega _2)\) equipped with norm \(\Vert v\Vert _\mathcal{X}:=\Vert v\Vert +\Vert v\Vert _{1,\Omega _1}+\Vert v\Vert _{1,\Omega _2}\). We also introduce two bilinear forms \(\mathcal{A}_\epsilon (\cdot , \cdot )\) and \(\mathcal{A}_\beta (\cdot , \cdot )\) on V as follows

$$\begin{aligned} \mathcal{A}_\epsilon (w,v)=\int _{\Omega }\epsilon \nabla w\cdot \nabla v dx=\mathcal{A}_\epsilon ^1(w,v)+\mathcal{A}_\epsilon ^2(w,v)\;\;\forall w, v\in V \end{aligned}$$
(2.1)

and

$$\begin{aligned} \mathcal{A}_\beta (w,v)=\int _{\Omega }\beta \nabla w\cdot \nabla v dx=\mathcal{A}_\beta ^1(w,v)+\mathcal{A}_\beta ^2(w,v)\;\;\forall w, v\in V. \end{aligned}$$
(2.2)

Here, \(\mathcal{A}_\epsilon ^l, \;\mathcal{A}_\beta ^l:H^1(\Omega _l)\times H^1(\Omega _l)\rightarrow \mathbb {R},\;l=1,\;2,\) are the local bilinear forms defined by

$$ \begin{aligned} \mathcal{A}_\epsilon ^l(w,v)=\int _{\Omega _l}\epsilon _l\nabla w\cdot \nabla v dx\;\; \& \;\; \mathcal{A}_\beta ^l(w,v)=\int _{\Omega _l}\beta _l\nabla w\cdot \nabla v dx\;\;\forall w, v\in H^1(\Omega _l). \end{aligned}$$

Further, we define bilinear forms \(\mathcal{B}_\sigma (\cdot ,\cdot ), \;\mathcal{B}_\delta (\cdot ,\cdot ):L^2(\Omega )\times L^2(\Omega )\rightarrow \mathbb {R}\) as

$$ \begin{aligned} \mathcal{B}_\sigma (w, v)=\int _{\Omega }^{}\sigma wvdx\;\; \& \;\; \mathcal{B}_\delta (w, v)=\int _{\Omega }^{}\delta wvdx\;\;\forall w, v\in L^2(\Omega ). \end{aligned}$$

Next, we define the weak form of our model problem (1.2)–(1.4). We adopt the following notion of a weak solution.

Definition 2.1

A function \(u\in H^1(V)\cap H^2(W)\) is called a weak solution of (1.2)–(1.3) if \(u(0)=u_0\) and \(u^\prime (0)=v_0\), and it satisfies the following weak formulation

$$\begin{aligned} (u^{\prime \prime },v)+\mathcal{B}_\sigma ( u^\prime ,v)+\mathcal{B}_\delta (u,v)+\mathcal{A}_\epsilon (u^\prime ,v)+\mathcal{A}_\beta (u,v)=\langle f(t,\cdot ),v\rangle _{V^{\prime }\times V} \end{aligned}$$
(2.3)

for all \(v\in H^1_0(\Omega )\) and a.e. \(t\in (0,T].\) Here, \(\langle \cdot ,\cdot \rangle _{V^{\prime }\times V}\) denotes the standard duality product.

Existence and uniqueness of a solution to the variational problem (2.3) are proved in [12, 31, 35, 36]; see, for instance, ([35], Theorem 3). For suitable initial data \((u_0, v_0)\) and forcing function f, we assume that the weak solution satisfies \(u\in C^1([0, T]; V)\cap C^2([0, T]; W)\).

Remark 2.1

Apart from bio heat modeling, a substantial number of articles deal with model problem (2.3) and it can be applied to any system where elastic bodies interact, provided that the model problem is linear. Numerous examples can be found in [15, 18, 36], for example, viscous wave equation, networks of linked beams, hybrid chimney etc.

To deal with the strong solution to the interface problem, we introduce a Banach space

$$\mathcal{Y}:=H_0^1(\Omega )\cap H^2(\Omega _1) \cap H^2(\Omega _2)$$

equipped with the norm

$$\begin{aligned} \Vert v\Vert _\mathcal{Y}:=\Vert v\Vert _1+\Vert v\Vert _{2, \Omega _1}+\Vert v\Vert _{2, \Omega _2}. \end{aligned}$$

Definition 2.2

A function \(u\in H^1(\mathcal{Y})\cap H^2(W)\) is called a strong solution of (1.2)–(1.4) if \(u(0)=u_0\) and \(u^\prime (0)=v_0\) with jump conditions (1.4), and the relation

$$\begin{aligned}&u^{\prime \prime }(x,t)+\sigma (x)u^\prime (x,t)+\delta (x)u(x,t)-\nabla \cdot (\epsilon (x)\nabla u^\prime (x,t)+\beta (x)\nabla u(x,t))\nonumber \\&\quad =f(x,t) \end{aligned}$$
(2.4)

holds for a.e. \(t\in (0,T]\) and a.e. \(x\in \Omega _i\) \((i=1,\;2).\)

Before proving the existence of a strong solution to the interface problem, we first establish the following result.

Lemma 2.1

Let u be the weak solution of (1.2)–(1.3). If \(u\in H^1(\mathcal{Y})\cap H^2(W)\), then u is a strong solution of (1.2)–(1.4).

Proof

For \(u\in H^1(\mathcal{Y})\cap H^2(W)\) and a.e. \(t\in (0,T]\), upon integration by parts, we obtain

$$\begin{aligned}&\int _{\Omega _i}-\nabla \cdot (\epsilon _i\nabla u^\prime +\beta _i\nabla u)v dx\nonumber \\&\quad =\int _{\Omega _i}\epsilon _i\nabla u^\prime \cdot \nabla vdx+\int _{\Omega _i}\beta _i\nabla u\cdot \nabla vdx\nonumber \\ \nonumber \\&\quad =(f-u^{\prime \prime }-\sigma _iu^\prime -\delta _i u,v)_{\Omega _i}\;\;\;\forall v\in H_0^1(\Omega _i), \end{aligned}$$
(2.5)

which implies that

$$\begin{aligned} -\nabla \cdot (\epsilon _i\nabla u^\prime (x,t)+\beta _i\nabla u(x,t))=f(x,t)-u^{\prime \prime }(x,t)-\sigma _iu^\prime (x,t)-\delta _i u(x,t) \end{aligned}$$

holds for a.e. \(t\in (0,T]\) and a.e. \(x\in \Omega _i\;(i=1,\;2).\) It remains to show that the weak solution also satisfies the jump conditions (1.4). Applying integration by parts, for a.e. \(t\in (0,T]\), we have

$$\begin{aligned} 0= & {} \sum _{i=1}^2\int _{\Omega _i}(u^{\prime \prime }+\sigma _i u^\prime +\delta _i u-f)v dx+\sum _{i=1}^2\int _{\Omega _i}-\nabla \cdot (\epsilon _i\nabla u^\prime +\beta _i\nabla u)v dx\nonumber \\= & {} \sum _{i=1}^2\int _{\Omega _i}(u^{\prime \prime }+\sigma _i u^\prime +\delta _i u-f)v dx+ \sum _{i=1}^2\int _{\Omega _i}(\epsilon _i\nabla u^\prime \cdot \nabla v+\beta _i\nabla u\cdot \nabla v)dx\nonumber \\&-\int _\Gamma \bigg [\epsilon \frac{\partial u^\prime }{\partial \mathbf{n}}+\beta \frac{\partial u}{\partial \mathbf{n}}\bigg ]vds\nonumber \\= & {} \sum _{i=1}^2(u^{\prime \prime }+\sigma _iu^\prime +\delta _i u-f,v)_{\Omega _i}+\sum _{i=1}^2(\mathcal{A}_\epsilon ^i(u^\prime ,v)+\mathcal{A}_\beta ^i(u,v))\nonumber \\&-\int _\Gamma \bigg [\epsilon \frac{\partial u^\prime }{\partial \mathbf{n}}+\beta \frac{\partial u}{\partial \mathbf{n}}\bigg ]vds\;\;\;\forall v\in V. \end{aligned}$$
(2.6)

From the above relation and the definition of the weak solution, it follows that

$$\begin{aligned} \int _\Gamma \bigg [\epsilon \frac{\partial u^\prime }{\partial \mathbf{n}}+\beta \frac{\partial u}{\partial \mathbf{n}}\bigg ]vds=0\;\;\;\forall v\in V. \end{aligned}$$

The arbitrariness of v shows that u satisfies the second jump condition in (1.4). The first condition in (1.4) is a direct consequence of the fact that \(u\in H^1(V)\). This completes the proof. \(\square \)

In general, the solution u of the problem (1.2)–(1.4) does not belong to \(H^1(H^2(\Omega ))\) due to the presence of discontinuous coefficients. We can get better local regularity using the local smoothness of the coefficients. From Lemma 2.1, it is clear that the existence of a strong solution depends on higher regularity of the weak solution, which is the main objective of Theorem 2.1. Further, a priori estimates for the solution to the problem (1.2)–(1.4) are presented in Lemma 2.2 under appropriate regularity conditions on the initial functions \(u_0, v_0\) and the source function f.

Theorem 2.1

Let \(u_0\in \mathcal{Y}\), \(v_0\in \mathcal{X}\) and \(f\in H^1(J;W)\), then the interface problem (1.2)–(1.4) admits a unique strong solution.

Proof

Let \(u\in C^1(J; V)\cap C^2(J; W)\) be a weak solution to the problem (1.2)–(1.3) satisfying (2.3). We consider the following auxiliary problem: Find \(w\in H^1(J;\mathcal{Y})\) such that

$$\begin{aligned} \mathcal{A}_\epsilon (w^\prime , v)+ \mathcal{A}_\beta (w,v) = (f-u^{\prime \prime }-\sigma u^\prime -\delta u, v)\;\;\; \forall v\in V, \end{aligned}$$
(2.7)

with \([w]=0\) and \(\big [{\beta }\frac{\partial w}{\partial \mathbf{n}}+{\epsilon }\frac{\partial w^{\prime }}{\partial \mathbf{n}}\big ]=0\) across the interface \(\Gamma \), and \(w(x,0)=u_0\). For the existence and uniqueness of a solution to the problem (2.7), we refer to [2]. Further, \(w\in H^1(\mathcal Y)\) satisfies the following a priori estimate

$$\begin{aligned} \Vert w\Vert _{H^1(\mathcal Y)}\le C(\Vert f-u^{\prime \prime }-\sigma u^\prime -\delta u\Vert _{L^2(J;W)}+\Vert u_0\Vert _\mathcal{Y}). \end{aligned}$$
(2.8)

Subtracting (2.7) from (2.3), we have

$$\begin{aligned} \mathcal{A}_\epsilon (u^\prime -w^\prime ,v)+\mathcal{A}_\beta (u-w,v)=0~~~\forall v\in V, \end{aligned}$$

which implies that \(w(x,t)=u(x,t)\) for a.e. \( t\in (0,T] \) and a.e. \( x\in \Omega \). Therefore \( u\in H^1(J;\mathcal Y) \) and due to Lemma 2.1 it is a strong solution to the interface problem (1.2)–(1.4). This completes the proof. \(\square \)

Remark 2.2

It is well known that the rate of convergence of finite element approximations depends on the smoothness of the solution. In Theorem 2.1, we have shown that the interface problem (1.2)–(1.4) admits a unique strong solution \(u\in H^1(\mathcal{Y})\cap H^2(W)\) for appropriate initial data and source function. In fact, the strong solution satisfies \(u\in H^1(J;\mathcal{Y})\cap C^1(J; V)\cap C^2(J; W)\). Convergence analyses of the finite element method for general second-order hyperbolic equations without interfaces typically assume the existence of higher order time derivatives of the solution (cf. [5, 13, 16, 25, 26]). Therefore, we will require additional regularity of u to guarantee the convergence results.

Lemma 2.2

Let \(u_0,\;v_0\in H^3(\Omega )\cap H_0^1(\Omega )\) and \(f\in H^1(J;H^1(\Omega ))\). Then the strong solution u to the interface problem (1.2)–(1.4) satisfies the following a priori estimate

$$\begin{aligned} \Vert u\Vert _{H^2(\mathcal{Y})}\le C(\Vert u_0\Vert _{3}+\Vert v_0\Vert _{3}+\Vert f\Vert _{H^1(J;H^1(\Omega ))}). \end{aligned}$$

Proof

Suppose \(z\in H^1(J;\mathcal{Y})\cap C^1(J; V)\cap C^2(J; W)\) satisfies the following variational formulation

$$\begin{aligned} (z^{\prime \prime }, v)+\mathcal{B}_\sigma (z^\prime , v)+\mathcal{B}_\delta (z, v)+\mathcal{A}_\epsilon (z^\prime , v)+\mathcal{A}_\beta (z, v)=(f^\prime , v)\;\;\forall v\in V \end{aligned}$$
(2.9)

with \(z(0)=v_0\) and \(z^\prime (0)=z_0\). Here, \(z_0\in \mathcal{X}\) is defined as

$$\begin{aligned} z_0=-\sigma _l v_0-\delta _l u_0+\nabla \cdot (\epsilon _l\nabla v_0 +\beta _l\nabla u_0)+f(0)\;\;\text{ in }\;\Omega _l,\;l=1, 2. \end{aligned}$$

Using the fact that \(u\in C^2(J; W)\), it is easy to verify that \((z_0-u^{\prime \prime }(0),v)=0\) for all \(v\in V.\)

Now, we define \(w(t)=u_0+\int _{0}^{t}zds,\;t\in [0, T]\) so that \(w(0)=u_0,\;w^\prime (0)=z(0)=v_0\) and \(w^{\prime \prime }(0)=z_0\). Further, since \(w^\prime =z\), it follows from (2.9) that w satisfies the following equation

$$\begin{aligned} (w^{\prime \prime \prime }, v)+\mathcal{B}_\sigma (w^{\prime \prime }, v)+\mathcal{B}_\delta (w^{\prime }, v)+\mathcal{A}_\epsilon (w^{\prime \prime }, v)+\mathcal{A}_\beta (w^{\prime }, v)=(f^\prime , v)\;\;\forall v\in V \end{aligned}$$
(2.10)

which can be written as

$$\begin{aligned} \frac{d}{dt}\Big \{(w^{\prime \prime }, v)+\mathcal{B}_\sigma (w^\prime , v)+\mathcal{B}_\delta (w, v)+\mathcal{A}_\epsilon (w^\prime , v)+\mathcal{A}_\beta (w, v)-(f,v)\Big \}=0. \end{aligned}$$
(2.11)

Now, we differentiate (2.3) with respect to t to obtain

$$\begin{aligned} \frac{d}{dt}\Big \{(u^{\prime \prime },v)+\mathcal{B}_\sigma ( u^\prime ,v)+\mathcal{B}_\delta (u,v)+\mathcal{A}_\epsilon (u^\prime ,v)+\mathcal{A}_\beta (u,v)-(f,v)\Big \}=0, \end{aligned}$$
(2.12)

for all \(v\in V.\) For similar arguments in the context of wave equations, we refer to ([19], pages 95-98). Then subtracting (2.11) from (2.12) yields

$$\begin{aligned} \frac{d}{dt}\Big \{ (p^{\prime \prime },v)+\mathcal{B}_\sigma (p^\prime ,v)+\mathcal{B}_\delta (p,v) +\mathcal{A}_\epsilon (p^\prime ,v)+\mathcal{A}_\beta (p,v)\Big \}=0\;\;\forall v\in V, \end{aligned}$$

where \(p(t)=u(t)-w(t).\) Integrating the above equation from 0 to t, we derive

$$\begin{aligned} (p^{\prime \prime },v)+\mathcal{B}_\sigma (p^\prime ,v)+\mathcal{B}_\delta (p,v)+\mathcal{A}_\epsilon (p^\prime ,v)+\mathcal{A}_\beta (p,v)=0\;\forall v\in V. \end{aligned}$$
(2.13)

Note that \(p(0)=0\) and \(p^\prime (0)=0\), which implies that \(u=w\). We then use the fact that \(u^{\prime }=w^{\prime }=z\in H^1(\mathcal{Y})\) to conclude that \(u\in H^2(\mathcal{Y})\). Further, for \(v=u^{\prime \prime }\), Eq. (2.3) yields

$$\begin{aligned} \int _{0}^{t}\Vert u^{\prime \prime }\Vert ^2ds \le C\Big (\sum _{l=1}^2\big \{\Vert u_0\Vert _{1, \Omega _l}^2+\Vert v_0\Vert _{1, \Omega _l}^2\big \}+\Vert f\Vert _{L^2(J;W)}^2\Big ), \end{aligned}$$

which together with (2.8) leads to the following a priori estimate

$$\begin{aligned} \Vert u\Vert _{H^1(\mathcal{Y})} \le C(\Vert u_0\Vert _\mathcal{Y}+\Vert v_0\Vert _\mathcal{X}+\Vert f\Vert _{L^2(J;W)}). \end{aligned}$$
(2.14)

Using estimate (2.14) for z satisfying (2.9), we obtain

$$\begin{aligned} \Vert z\Vert _{H^1(\mathcal{Y})}\le & {} C(\Vert v_0\Vert _\mathcal{Y}+\Vert z^{\prime }(0)\Vert _\mathcal{X}+\Vert f^{\prime }\Vert _{L^2(J;W)})\nonumber \\\le & {} C(\Vert u_0\Vert _{3}+\Vert v_0\Vert _{3}+\Vert f\Vert _{H^1(J;H^1(\Omega ))}). \end{aligned}$$
(2.15)

This, together with the fact that \(u^{\prime }=z\), leads to the desired estimate and completes the proof. \(\square \)

Remark 2.3

In the previous result, for \(u_0,\;v_0\in H^3(\Omega )\cap H_0^1(\Omega )\) and \(f\in H^1(J;H^1(\Omega ))\), we have shown that the strong solution u to the interface problem (1.2)–(1.4) belongs to \(H^2(J;\mathcal{Y})\cap C^2(J; V)\cap C^3(J; W)\). A similar argument, with the smoothness conditions changed to \(u_0,\;v_0\in H^4(\Omega )\cap H_0^1(\Omega )\) and \(f\in H^2(J;H^1(\Omega ))\), leads to an improvement in the regularity of the strong solution u. More precisely, for sufficiently smooth initial data and source function f, we assume \(u\in H^3(J; \mathcal{Y})\). The results of this section also hold true in a bounded and convex domain \(\Omega \subset \mathbb {R}^3\) with \(C^2\) smooth interface \(\Gamma \).

3 Finite Element Discretization

In this section, we describe a finite element discretization, introduce some auxiliary projections and prove their approximation properties.

For the purpose of finite element approximation of the problem (1.2)–(1.4), we now describe the discretization of the domain \(\Omega \). The following discussion is borrowed from [17]. We assume that the family of triangulations \(\{\mathcal {T}_h\}_{h\in (0, h_0)}\), for some fixed \(h_0>0\), is quasi-uniform. We first approximate the domain \(\Omega _1\) by a polyhedral domain \(\Omega _{1,h}\) using a quasi-uniform mesh \(\mathcal{T}_h^1\) such that all the boundary vertices of \(\Omega _{1,h}\) lie on the boundary of \(\Omega _{1}.\) Let \(\Omega _{2,h}\) be the approximation of the domain \(\Omega _2\) obtained from a quasi-uniform triangulation \(\mathcal{T}_h^2\) with simplicial elements of size h. The triangulation \(\mathcal{T}_h^2\) is constructed such that all the vertices of the outer polyhedral boundary \(\partial \Omega \) are also vertices of \(\Omega _{2,h}\), while all the vertices on the inner boundary of \(\Omega _{2,h}\) match the boundary vertices of \(\Omega _{1,h}\). More precisely, the triangulation \(\mathcal {T}_h:=\mathcal {T}_h^1\cup \mathcal {T}_h^2\) satisfies the following conditions:

  • \((\mathcal {A}1)\) \( \overline{\Omega }=\cup _{K\in \mathcal {T}_h}K \),

  • \((\mathcal {A}2)\) if \( K_1, K_2\in \mathcal {T}_h \) and \( K_1\ne K_2 \), then either \( K_1\cap K_2=\emptyset \) or \( K_1\cap K_2 \) is a common vertex, an edge or a face,

  • \((\mathcal {A}3)\) for each K, all its vertices are completely contained in either \( \displaystyle \overline{\Omega }_1 \) or \( \overline{\Omega }_2 \).

Let \(V_h\) be a finite dimensional subspace of \(H_0^1(\Omega )\) defined on \(\mathcal {T}_h\) consisting of piecewise linear functions vanishing on the boundary \(\partial \Omega \). We now define a tubular neighborhood \(S_\lambda \) of \(\Gamma \) by

$$\begin{aligned} S_\lambda =\{x\in \Omega :\;\text {dist}(x,\Gamma )<\lambda \} \end{aligned}$$

for some \(\lambda >0\) with \(\lambda =\text{ O }(h^2)\) (cf. [17]). Such a \(\lambda \) exists because the interface \(\Gamma \) is of class \(C^2\). A typical \(\lambda \)-strip is presented in Fig. 2. In fact, the mesh \(\mathcal{T}_h\) can be decomposed into three disjoint subsets \(\mathcal{T}_h={\dot{\mathcal{T}}}_h^1\cup {\dot{\mathcal{T}}}_h^2\cup \mathcal{T}_{\star },\) where

$$\begin{aligned} \dot{\mathcal {T}}_h^i=\{K\in \mathcal {T}_h:\; K\subset \overline{\Omega }_i\backslash S_\lambda \},\;i=1,\;2, \end{aligned}$$

and \(\mathcal {T}_{\star }=\mathcal {T}_h\backslash (\dot{\mathcal {T}}_h^1\cup \dot{\mathcal {T}}_h^2).\) An element \(K\in \mathcal {T}_{\star }\) is called an interface element and \( K\in \mathcal {T}_h\backslash \mathcal {T}_{\star }\) is called a non-interface element. Further, for \(i=1,\;2,\) we define the following disjoint collections of interface elements

$$\begin{aligned} \mathcal {T}_{\star }^i:=\{K\in \mathcal {T}_{\star }:\;K\subset \bar{\Omega }_{i}\cup S_{\lambda }\}. \end{aligned}$$

With the above notation, we have

$$\begin{aligned} \Omega _{i,h}=\cup \{\overline{K}:\; K\in \dot{\mathcal {T}}_h^i\cup \mathcal {T}_{\star }^i\}, \end{aligned}$$

so that for any \(K\in \mathcal{{T}}_h\), either \(K\subseteq \Omega _{1,h}\) or \(K\subseteq \Omega _{2,h}\).
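The scaling \(\lambda =\text{ O }(h^2)\) can be illustrated in the simplest case of a circular interface, which is an assumption of this example and not a restriction made in the text. If two neighbouring mesh vertices lie on a circle of radius r and are joined by a straight edge of length h, the largest gap between the edge and the circle is \(r-\sqrt{r^2-(h/2)^2}\), which behaves like \(h^2/(8r)\) as \(h\rightarrow 0\). The helper name `chord_gap` is hypothetical.

```python
import math

def chord_gap(radius, h):
    """Largest distance between a circle and a chord of length h."""
    return radius - math.sqrt(radius**2 - (h / 2.0) ** 2)

radius = 1.0
# The ratio gap / h^2 approaches 1 / (8 * radius) as h is reduced.
ratios = [chord_gap(radius, h) / h**2 for h in (0.2, 0.1, 0.05, 0.025)]
```

Halving h therefore reduces the mismatch between the fitted polygonal interface and \(\Gamma \) by roughly a factor of four in this model situation.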

Fig. 2 An illustrative example of interface triangles K and S with \(\lambda \)-strip \(S_{\lambda }\)

In order to approximate \(\mathcal{A}_\epsilon (\cdot ,\cdot )\), \(\mathcal{A}_\beta (\cdot ,\cdot ) \), we now introduce approximate bilinear maps \( \mathcal{A}_{\epsilon h}, \mathcal{A}_{\beta h}: H^1(\Omega )\times H^1(\Omega )\rightarrow \mathbb {R} \) defined as

$$\begin{aligned} \mathcal{A}_{\epsilon h}(w,v)=\sum _{K\in \mathcal {T}_h}^{}\int _{K}^{}\epsilon _K(x)\nabla w.\nabla vdx\;\;\forall w, v\in H^1(\Omega ),\\ \mathcal{A}_{\beta h}(w,v)=\sum _{K\in \mathcal {T}_h}^{}\int _{K}^{}\beta _K(x)\nabla w.\nabla vdx\;\;\forall w, v\in H^1(\Omega ), \end{aligned}$$

with \(\epsilon _K(x)=\epsilon _i\) and \(\beta _K(x)=\beta _i\) if \(K\subset \Omega _{i,h},\;i=1, 2\).

Next, we approximate the bilinear maps \(\mathcal{B}_\sigma (\cdot ,\cdot )\), \(\mathcal{B}_\delta (\cdot ,\cdot )\) by \(\mathcal{B}_{\sigma h}(\cdot ,\cdot )\) and \(\mathcal{B}_{\delta h}(\cdot ,\cdot )\) respectively, defined as

$$\begin{aligned} \mathcal{B}_{\sigma h}(w,v)=\sum _{K\in \mathcal {T}_h}^{}\int _{K}^{}\sigma _K(x) w vdx\;\;\forall w,v\in L^2(\Omega ),\\ \mathcal{B}_{\delta h}(w,v)=\sum _{K\in \mathcal {T}_h}^{}\int _{K}^{}\delta _K(x) w vdx\;\;\forall w,v\in L^2(\Omega ), \end{aligned}$$

with \(\sigma _K(x)=\sigma _i\) and \(\delta _K(x)=\delta _i\) if \(K\subset \Omega _{i,h},\;i=1, 2.\)
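The elementwise choice of coefficients in \(\mathcal{A}_{\epsilon h}\) and \(\mathcal{A}_{\beta h}\) can be sketched in a one-dimensional analogue, which is an assumption made purely for illustration since the text works in \(\mathbb {R}^2\). Take \(\Omega =(0,1)\) with \(\Omega _1=(0,\gamma )\) and \(\Omega _2=(\gamma ,1)\), piecewise linear elements, and a coefficient chosen per element according to the side on which the element midpoint lies. The function name `assemble_stiffness_1d` and the sample values are hypothetical.

```python
import numpy as np

def assemble_stiffness_1d(nodes, coeff_1, coeff_2, gamma):
    """P1 stiffness matrix with one constant coefficient per element.

    The coefficient is coeff_1 if the element midpoint is left of gamma,
    and coeff_2 otherwise. Homogeneous Dirichlet conditions are imposed
    by dropping the two boundary nodes.
    """
    n = len(nodes)
    A = np.zeros((n, n))
    for e in range(n - 1):
        x_left, x_right = nodes[e], nodes[e + 1]
        h_e = x_right - x_left
        midpoint = 0.5 * (x_left + x_right)
        coeff = coeff_1 if midpoint < gamma else coeff_2
        # Local stiffness matrix of a linear element of length h_e.
        local = (coeff / h_e) * np.array([[1.0, -1.0], [-1.0, 1.0]])
        A[e:e + 2, e:e + 2] += local
    return A[1:-1, 1:-1]

nodes = np.linspace(0.0, 1.0, 5)  # mesh size 0.25, interface on a node
A_h = assemble_stiffness_1d(nodes, coeff_1=1.0, coeff_2=10.0, gamma=0.5)
```

In one dimension the interface can sit exactly on a mesh node, so the approximate and exact forms coincide; the difference estimated in the lemmas below arises only in higher dimensions, where straight element edges cannot follow a curved \(\Gamma \).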

For simplicity of exposition, we write \(\mathcal{A}\) for \(\mathcal{A}_\epsilon \) (or \( \mathcal{A}_\beta \)) and \(\mathcal{A}_h\) for \(\mathcal{A}_{\epsilon h}\) (or \(\mathcal{A}_{\beta h}\)), respectively. Similarly, we write \(\mathcal{B}\) for \(\mathcal{B}_\sigma \) (or \(\mathcal{B}_\delta \)) and \(\mathcal{B}_h\) for \(\mathcal{B}_{\sigma h}\) (or \(\mathcal{B}_{\delta h}\)), respectively. For the difference between the bilinear form \(\mathcal{A}\) (\(\mathcal{B}\)) and its approximation \(\mathcal{A}_h\) (\(\mathcal{B}_h\)), we have the following results. For a proof of Lemma 3.1, we refer to [17].

Lemma 3.1

For \( u,v\in H^1(\Omega ) \), we define \( \mathcal{A}_h^\Delta (u,v)=\mathcal{A}(u,v)-\mathcal{A}_h(u,v) \), then we have

$$\begin{aligned} |\mathcal{A}_h^\Delta (u,v)|\le C|u|_{1, S_\lambda }|v|_{1, S_\lambda }\;\;\forall u,v\in H^1(\Omega ). \end{aligned}$$

Lemma 3.2

For \(z\in H^1(\Omega )\), we have

$$\begin{aligned} |\mathcal{B}(z,v_h)-\mathcal{B}_h(z,v_h)|\le C h^2\Vert z\Vert _{1}\Vert v_h\Vert _{1}\;\;\forall v_h\in V_h. \end{aligned}$$
(3.1)

Further, for \(z\in \mathcal {Y}\) and \(v_h\in V_h\) there holds

$$\begin{aligned} |\mathcal{B}(z,v_h)-\mathcal{B}_h(z,v_h)|\le C (h^2+\lambda )\Vert z\Vert _\mathcal {Y}\Vert v_h\Vert . \end{aligned}$$
(3.2)

Proof

We define

$$\begin{aligned} \tilde{K}=\left\{ \begin{array}{ll} K\cap \Omega _1 &{} \text{ if }\;K\in \mathcal {T}_*^2,\\ K\cap \Omega _2 &{} \text{ if }\;K\in \mathcal {T}_*^1.\end{array}\right. \end{aligned}$$
(3.3)

Clearly, \(\tilde{K}\subset S_{\lambda }\cap \Omega _i,\;i=1,2.\) Then we get

$$\begin{aligned} |\mathcal{B}(z,v_h)-\mathcal{B}_h(z,v_h)|\le C\sum _{K\in \mathcal {T}_*}^{}|(z,v_h)_{\tilde{K}}|\le C\sum _{K\in \mathcal {T}_*}^{}\Vert z\Vert _{\tilde{K}}\Vert v_h\Vert _{\tilde{K}}. \end{aligned}$$
(3.4)

Here, \(\Vert \cdot \Vert _{\tilde{K}}\) denotes the \(L^2\) norm over \(\tilde{K}.\) Now, using Hölder’s inequality, we obtain

$$\begin{aligned} \Vert z\Vert _{\tilde{K}}\le Ch^{\frac{3(p-2)}{2p}}\Vert z\Vert _{L^p(\tilde{K})}\;\;\forall \;p>2. \end{aligned}$$
(3.5)
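
For the reader's convenience, here is a sketch of how (3.5) follows from Hölder's inequality. It relies on the measure bound \(\text{meas}(\tilde{K})\le Ch^3\), which is standard for interface elements of a \(C^2\) interface in two dimensions but is an assumption here, since it is not restated in this section.

$$\begin{aligned} \Vert z\Vert _{\tilde{K}}^2=\int _{\tilde{K}}1\cdot |z|^2dx\le \text{meas}(\tilde{K})^{\frac{p-2}{p}}\Big (\int _{\tilde{K}}|z|^pdx\Big )^{\frac{2}{p}}\le Ch^{\frac{3(p-2)}{p}}\Vert z\Vert _{L^p(\tilde{K})}^2. \end{aligned}$$

Taking square roots gives the exponent \(\frac{3(p-2)}{2p}\), which equals 1 for \(p=6\) and tends to \(\frac{3}{2}\) as \(p\rightarrow \infty \).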

We now recall the Sobolev embedding inequality in two dimensions (cf. [29])

$$\begin{aligned} \Vert w\Vert _{L^p(K)}\le Cp^{\frac{1}{2}}\Vert w\Vert _{1,K}\;\;\forall w\in H^1(K),\;\;p>2. \end{aligned}$$
(3.6)

Now, setting \(p=6\) in (3.5) and then using the Sobolev embedding inequality (3.6), we obtain

$$\begin{aligned} \Vert z\Vert _{\tilde{K}}\le Ch\Vert z\Vert _{1, K}. \end{aligned}$$
(3.7)

Proceeding in a similar way, we obtain

$$\begin{aligned} \Vert v_h\Vert _{\tilde{K}}\le Ch\Vert v_h\Vert _{1, K}. \end{aligned}$$
(3.8)

Using estimates (3.7)–(3.8) in (3.4), we obtain the first inequality.

For the second inequality, let \(p\rightarrow \infty \) in (3.5) to have

$$\begin{aligned} \Vert z\Vert _{\tilde{K}}\le Ch^\frac{3}{2}\Vert z\Vert _{L^\infty (\tilde{K})}\le Ch^\frac{3}{2}\sum _{i=1}^{2}\Vert z\Vert _{L^\infty (\Omega _i)}\le Ch^\frac{3}{2}\sum _{i=1}^{2}\Vert z\Vert _{2, \Omega _i}. \end{aligned}$$
(3.9)

In the last inequality, we have used the standard Sobolev embedding inequality.

From Lemma 2.1 in [17] we then infer

$$\begin{aligned} \Vert v _h\Vert ^2_{S_{\lambda }\cap \Omega _i}\le C\lambda \Vert v_h\Vert _{\Omega _i}\Vert v_h\Vert _{1,\Omega _i},\;i=1,2. \end{aligned}$$
(3.10)

Finally, the Poincaré inequality and the inverse estimate \(\Vert \nabla v_h\Vert \le Ch^{-1}\Vert v_h\Vert \), together with (3.9)–(3.10) and (3.4), lead to

$$\begin{aligned} |\mathcal{B}(z,v_h)-\mathcal{B}_h(z,v_h)|\le & {} Ch^\frac{3}{2}\Vert z\Vert _{\mathcal {Y}}\bigg (\sum _{K\in \mathcal {T}_*}\Vert v_h\Vert _{\tilde{K}}^2\bigg )^{\frac{1}{2}}\nonumber \\\le & {} Ch^\frac{3}{2}\Vert z\Vert _{\mathcal {Y}}\bigg (\sum _{i=1}^2\Vert v_h\Vert _{S_{\lambda }\cap \Omega _i}^2\bigg )^{\frac{1}{2}}\nonumber \\\le & {} Ch^\frac{3}{2}\Vert z\Vert _{\mathcal {Y}}\bigg (\sum _{i=1}^2\lambda \Vert v_h\Vert _{\Omega _i}\Vert v_h\Vert _{1,\Omega _i}\bigg )^{\frac{1}{2}}\nonumber \\\le & {} C h^\frac{3}{2}\Vert z\Vert _{\mathcal {Y}}\sqrt{\lambda }\Vert v_h\Vert ^\frac{1}{2}\Vert v_h\Vert _{1, \Omega }^\frac{1}{2}\nonumber \\\le & {} C h\sqrt{\lambda }\Vert z\Vert _{\mathcal {Y}}\Vert v_h\Vert \le C(h^2+\lambda )\Vert z\Vert _{\mathcal {Y}}\Vert v_h\Vert . \end{aligned}$$

This completes the proof. \(\square \)
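
The last two steps of the chain of inequalities in the proof can be spelled out as follows. This is a sketch that assumes \(h\le 1\), so that the inverse estimate also controls the full \(H^1\) norm, \(\Vert v_h\Vert _{1,\Omega }\le Ch^{-1}\Vert v_h\Vert \).

$$\begin{aligned} h^\frac{3}{2}\sqrt{\lambda }\Vert v_h\Vert ^\frac{1}{2}\Vert v_h\Vert _{1, \Omega }^\frac{1}{2}\le Ch^\frac{3}{2}\sqrt{\lambda }\Vert v_h\Vert ^\frac{1}{2}\big (h^{-1}\Vert v_h\Vert \big )^\frac{1}{2}=Ch\sqrt{\lambda }\Vert v_h\Vert ,\qquad h\sqrt{\lambda }\le \frac{1}{2}(h^2+\lambda ). \end{aligned}$$

The second relation is Young's inequality \(ab\le \frac{1}{2}(a^2+b^2)\) with \(a=h\) and \(b=\sqrt{\lambda }\).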

We now state the following approximation result near the interface. For a proof, we refer to [17].

Lemma 3.3

There exists a positive constant \(\mu \) independent of h such that

$$\begin{aligned} \Vert v_h\Vert _{H^1(S_\lambda )}\le C \sqrt{\frac{\lambda }{h}}\Vert v_h\Vert _{H^1(S_{\mu h})}\;\forall v_h\in V_h. \end{aligned}$$

Now, we introduce our elliptic projection operators \(\mathcal{Q}_{\epsilon h}\), \(\mathcal{Q}_{\beta h}:\mathcal{Y}\rightarrow V_h\) defined by

$$\begin{aligned} \mathcal{A}_{\epsilon h}(\mathcal{Q}_{\epsilon h}v,v_h)=\mathcal{A}_{\epsilon }^1(v,v_h)+\mathcal{A}_{\epsilon }^2(v,v_h)\;\forall v_h\in V_h \end{aligned}$$
(3.11)

and

$$\begin{aligned} \mathcal{A}_{\beta h}(\mathcal{Q}_{\beta h}v,v_h)=\mathcal{A}_{\beta }^1(v,v_h)+\mathcal{A}_{\beta }^2(v,v_h)\;\forall v_h\in V_h, \end{aligned}$$
(3.12)

respectively. To simplify the notation, we will write \(\mathcal{Q}_h\) in place of \(\mathcal{Q}_{\epsilon h}\) or \(\mathcal{Q}_{\beta h}\) when no risk of confusion arises.

Regarding the approximation properties of the operator \(\mathcal{Q}_h\) defined by (3.11)–(3.12), we have the following result (cf. [17]).

Lemma 3.4

Let \(\mathcal{Q}_h\) be defined by (3.11) or (3.12). Then, for any \(v\in \mathcal{Y}\), there is a positive constant C independent of the mesh parameter h such that

$$\begin{aligned} \Vert \mathcal{Q}_hv-v\Vert +h\Vert \mathcal{Q}_hv-v\Vert _{1}\le C\Big (h+\sqrt{\lambda }+\frac{\lambda }{h}\Big )^2\Vert v\Vert _\mathcal{Y}. \end{aligned}$$

Let \(\mathcal{L}_h: L^2(\Omega )\rightarrow V_h\) be the standard \(L^2\) projection defined by

$$\begin{aligned} (\mathcal{L}_hv,\phi )=(v,\phi )\;\forall \phi \in V_h,\;v\in L^2(\Omega ). \end{aligned}$$
(3.13)

The previous result, together with the definition of the \(L^2\) projection, leads to the following error estimate.

Lemma 3.5

Let \(\mathcal{L}_h\) be defined by (3.13), then for any \(v\in \mathcal{Y}\) there is a positive constant C independent of the mesh parameter h such that

$$\begin{aligned} \Vert \mathcal{L}_hv-v\Vert +h\Vert \mathcal{L}_hv-v\Vert _1\le C\Big (h+\sqrt{\lambda }+\frac{\lambda }{h}\Big )^2\Vert v\Vert _\mathcal{Y}. \end{aligned}$$

Remark 3.1

The elliptic projection operator \(\mathcal{Q}_{\epsilon h}\) defined by (3.11) is also well defined on the space \( \hat{\mathcal{X}}:=\{\xi \in \mathcal{X}: [\xi ]=0 \;\text{ along }\;\Gamma \; \& \;\xi =0\;\text{ on }\;\partial \Omega \}\) and satisfies the following stability estimate

$$\begin{aligned} \Vert \mathcal{Q}_{\epsilon h}v\Vert _1\le C\Vert v\Vert _\mathcal{X}\;\;\forall v\in \hat{\mathcal{X}}. \end{aligned}$$
(3.14)

Further, the following approximation result holds

$$\begin{aligned} \Vert v-\mathcal{Q}_{\epsilon h}v\Vert +h\sum _{l=1}^2\Vert v-\mathcal{Q}_{\epsilon h}v\Vert _{1, \Omega _l}\le Ch^2\{\Vert v\Vert _{2, \Omega _1}+\Vert v\Vert _{2, \Omega _2}\}, \end{aligned}$$

for all \(v\in \hat{\mathcal{X}}\cap H^2(\Omega _1)\cap H^2(\Omega _2)\). Similar remarks hold for the elliptic projection \(\mathcal{Q}_{\beta h}\) and \(L^2\) projection \(\mathcal{L}_h\). For details, we refer to [8].

For given \(v\in \hat{\mathcal{X}}\), there exists \(w\in \mathcal{Y}\) (cf. [6]) satisfying

$$\begin{aligned} \sum _{l=1}^2\mathcal{A}_\epsilon ^l(w,\phi )=(v-\mathcal{Q}_{\epsilon h}v,\phi )\;\;\forall \phi \in \hat{\mathcal{X}}. \end{aligned}$$
(3.15)

Equation (3.15) together with (3.11), Lemmas 3.1 and 3.4 leads to

$$\begin{aligned} \Vert v-\mathcal{Q}_{\epsilon h}v\Vert ^2= & {} \sum _{l=1}^2\mathcal{A}_{\epsilon }^l(w-\mathcal{Q}_{\epsilon h}w, v-\mathcal{Q}_{\epsilon h}v) + \sum _{l=1}^2\mathcal{A}_{\epsilon }^l(\mathcal{Q}_{\epsilon h}w,v-\mathcal{Q}_{\epsilon h}v)\nonumber \\\le & {} C\Vert w-\mathcal{Q}_{\epsilon h}w\Vert _{1}\sum _{l=1}^2\Vert v-\mathcal{Q}_{\epsilon h}v\Vert _{1, \Omega _l}-\mathcal{A}_{\epsilon h}^\Delta (\mathcal{Q}_{\epsilon h}v,\mathcal{Q}_{\epsilon h}w)\nonumber \\\le & {} Ch\Vert w\Vert _\mathcal{Y}\sum _{l=1}^2\Vert v-\mathcal{Q}_{\epsilon h}v\Vert _{1, \Omega _l}+Ch\Vert \mathcal{Q}_{\epsilon h}v\Vert _{1}\Vert \mathcal{Q}_{\epsilon h}w\Vert _{1}\nonumber \\\le & {} Ch\Vert v-\mathcal{Q}_{\epsilon h}v\Vert \Vert v\Vert _\mathcal{X}. \end{aligned}$$
(3.16)

In the last inequality, we have used the fact that \(\Vert w\Vert _\mathcal{Y}\le C\Vert v-\mathcal{Q}_{\epsilon h}v\Vert \) and stability estimate (3.14). As a consequence of estimate (3.16), we obtain

$$\begin{aligned} \Vert v-\mathcal{L}_hv\Vert \le \Vert v-\mathcal{Q}_{\epsilon h}v\Vert \le Ch\Vert v\Vert _\mathcal{X}\;\forall v\in \hat{\mathcal{X}}. \end{aligned}$$
(3.17)

Here, we have used the fact that \(\mathcal{L}_hv\) is the best approximation of \(v\in L^2(\Omega )\) with respect to \(L^2\) norm.

Now, the inverse inequality and estimates (3.16)–(3.17) lead to the following stability estimate for the \(L^2\) projection

$$\begin{aligned} \Vert \mathcal{L}_hv\Vert _{1}\le & {} \Vert \mathcal{L}_hv-\mathcal{Q}_{\epsilon h}v\Vert _{1}+\Vert \mathcal{Q}_{\epsilon h}v\Vert _{1}\nonumber \\\le & {} Ch^{-1}\Vert \mathcal{L}_hv-\mathcal{Q}_hv\Vert +C\Vert v\Vert _\mathcal{X}\nonumber \\\le & {} C\Vert v\Vert _\mathcal{X}\;\forall v\in \hat{\mathcal{X}}. \end{aligned}$$
(3.18)

Remark 3.2

In Lemma 2.2, we have proved that the solution to the interface problem is sufficiently smooth in each individual subdomain \(\Omega _1\) and \(\Omega _2\) for smooth given data. Assuming \(u\in C^2(J; \mathcal{X})\) with \([u]=0\) along \(\Gamma \) and \(u=0\) on \(\partial \Omega \), we obtain

$$ \begin{aligned}{}[u^{\prime \prime }(t)]=0\;\text{ on }\; \Gamma \; \& \;u^{\prime \prime }(t)=0\;\text{ on }\;\partial \Omega \;\;\hbox {for} \;t\in [0, T]. \end{aligned}$$

This together with (3.18) yields

$$\begin{aligned} \Vert \mathcal{L}_hu^{\prime \prime }(0)\Vert _{1}\le C\Vert u^{\prime \prime }(0)\Vert _\mathcal{X}\le C(\Vert u_0\Vert _{3}+\Vert v_0\Vert _{3}+\Vert f\Vert _{H^1(H^1)}). \end{aligned}$$
(3.19)

Now, we are in a position to define our new non-standard projection operator which is crucial for our error analysis. For \(v\in H^1(J;\mathcal{Y})\), find \({\xi }_v\in H^1(J;V_h)\) such that for a.e. \(t\in [0,T]\)

$$\begin{aligned} \mathcal{A}_{\epsilon h}(\xi ^\prime _v(t),v_h)+\mathcal{A}_{\beta h}(\xi _v(t),v_h)=\mathcal{A}_{\epsilon }(v^\prime (t),v_h)+\mathcal{A}_{\beta }(v(t),v_h)\;\forall v_h\in V_h, \end{aligned}$$
(3.20)

with \(\xi _v(0)=\mathcal{Q}_{\beta h}v(0)\in V_h\).

One can follow the proofs of Theorems 3.1 and 3.2 in [9] to derive the following optimal pointwise-in-time error estimate for the newly introduced projection operator.

Lemma 3.6

For any \(v\in H^1(J;\mathcal{Y})\) and a.e. \(t\in J\), there is a positive constant C independent of the mesh parameter h such that

$$\begin{aligned} \Vert v(t)-\xi _v(t)\Vert +h\Vert v(t)-\xi _v(t)\Vert _1\le C(h^2+\lambda )\Big (\Vert v\Vert _{H^1(\mathcal {Y})}+\Vert v(0)\Vert _\mathcal{Y}\Big ). \end{aligned}$$

4 Semidiscrete Finite Element Approximation

In this section, we discuss the semidiscrete finite element method for problem (1.2)–(1.4) and derive an optimal-order error estimate in the \(L^2\) norm.

The continuous time Galerkin finite element approximation to (2.3) is stated as follows: Find \(u_h\in C^2(J;V_h)\) such that

$$\begin{aligned}&(u^{\prime \prime }_h,v_h)+\mathcal{B}_{\sigma h}(u^\prime _h,v_h)+\mathcal{B}_{\delta h}(u_h,v_h)+\mathcal{A}_{\epsilon h}(u_h^\prime ,v_h)+\mathcal{A}_{\beta h}(u_h,v_h)\nonumber \\&\quad =(f,v_h)\;\forall v_h\in V_h,\;t\in (0, T], \end{aligned}$$
(4.1)

with \(u_h(0)=\mathcal{Q}_hu_0\) and \(u_h^\prime (0)=\mathcal{Q}_hv_0\).

The following result deals with the existence and regularity of \(u_h\). The basic technique is borrowed from [25].

Theorem 4.1

For each \(h\in (0, h_0)\), there exists a unique function \(u_h\in C^2(J; V_h)\) satisfying (4.1).

Proof

Let \(V_h\subset H_0^1(\Omega )\) be the finite element space defined on \(\mathcal {T}_h\) with basis functions \(\{\phi _i\}_{i=1}^{N_h} \). We consider Galerkin approximations in space

$$\begin{aligned} u_h(x,t)=\sum _{i=1}^{N_h}c_i(t)\phi _i(x) \end{aligned}$$

where \(c_i:(0,T]\rightarrow \mathbb {R}\) are coefficient functions for \(i=1,\ldots ,N_h\).

We denote by \(\mathbf{c}_{h, 0}=[c_{1,0},...,c_{N_h,0}]^T\) and \(\mathbf{c}_{h,1}=[c_{1,1},\ldots ,c_{N_h,1}]^T\) the components of the given initial approximations \(u_h(0)\) and \(u_h^{\prime }(0)\), respectively. Then our semidiscrete problem is to find \(\mathbf{c}_{h}(t)=[c_{1}(t),\ldots ,c_{N_h}(t)]^T\), for \(t\in (0, T]\), such that

$$\begin{aligned} {\left\{ \begin{array}{ll} M_h\mathbf{c}_h^{\prime \prime }(t)+K_h\mathbf{c}_h^\prime (t)+L_h\mathbf{c}_h(t)+C_h\mathbf{c}_h^\prime (t)+D_h\mathbf{c}_h(t)=F_h(t),\\ \mathbf{c}_h(0)=\mathbf{c}_{h,0}\;\;\text{ and }\;\;\mathbf{c}_h^\prime (0)=\mathbf{c}_{h,1}. \end{array}\right. } \end{aligned}$$
(4.2)

Coefficient matrices are given by

$$\begin{aligned} M_h= & {} [M_{i,j}], \quad M_{i,j}=(\phi _i,\phi _j), \\ K_h= & {} [K_{i,j}], \quad K_{i,j}=\mathcal{B}_{\sigma h}(\phi _i,\phi _j), \\ L_h= & {} [L_{i,j}], \quad L_{i,j}=\mathcal{B}_{\delta h}(\phi _i,\phi _j), \\ C_h= & {} [C_{i,j}], \quad C_{i,j}=\mathcal{A}_{\epsilon h}(\phi _i, \phi _j), \\ D_h= & {} [D_{i,j}], \quad D_{i,j}=\mathcal{A}_{\beta h}(\phi _i, \phi _j) \end{aligned}$$

and the source term is given by \( F_h=[F_1,...,F_{N_h}]^T \), \( F_j=(f,\phi _j) \), with \( 1\le i,j\le N_h \). Note that the matrices and the right-hand-side vectors are all well-defined since

$$\begin{aligned}&|(\phi _i,\phi _j)|\le \Vert \phi _i\Vert \Vert \phi _j\Vert , \\&|\mathcal{B}( \phi _i,\phi _j)|\le C_1\Vert \phi _i\Vert \Vert \phi _j\Vert ,\\&|\mathcal{A}( \phi _i,\phi _j)|\le C_2\Vert \phi _i\Vert _1\Vert \phi _j\Vert _1, \\&|(f,\phi _j)|\le \Vert f\Vert \Vert \phi _j\Vert \le \Vert f\Vert _{L^{\infty }(L^2)}\Vert \phi _j\Vert , \end{aligned}$$

for all \(t\in J\). Furthermore, for any \(z\in \mathbb {R}^{N_h}\setminus \{0\}\), we have

$$\begin{aligned} z^TM_hz=\int _{\Omega }^{}\Big |\sum _{i=1}^{N_h}z_i\phi _i\Big |^2dx=\Big \Vert \sum _{i=1}^{N_h}z_i\phi _i\Big \Vert ^2>0, \end{aligned}$$

since the basis functions \(\phi _i\) are linearly independent. Hence, the matrix \(M_h\) is symmetric positive definite, and therefore invertible, and the matrix equation in (4.2) can be rewritten as

$$\begin{aligned} \mathbf{c}_h^{\prime \prime }+M_h^{-1}K_h\mathbf{c}_h^\prime +M_h^{-1}L_h\mathbf{c}_h+M_h^{-1}C_h\mathbf{c}_h^\prime +M_h^{-1}D_h\mathbf{c}_h=M_h^{-1}F_h. \end{aligned}$$
(4.3)

Now the existence and uniqueness of a solution \(u_h\in C^2(J;V_h)\) follow from standard ODE theory. This completes the proof.
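
As an illustration of the reduction to (4.3), the following Python sketch assembles one-dimensional analogues of the matrices in (4.2) and forms the first-order system to which standard ODE theory applies. Everything specific here is an assumption made for illustration and is not part of the analysis: the domain \((0,1)\), piecewise linear elements, an interface placed at the mesh node \(x=0.5\) (so no interface approximation is involved), the coefficient values, and the helper name `assemble_1d`.

```python
import numpy as np

def assemble_1d(n_el, coef_left, coef_right):
    """Hypothetical helper: P1 mass-type and stiffness-type matrices on (0, 1)
    for a coefficient that is piecewise constant across x = 0.5 (a mesh node).
    Homogeneous Dirichlet conditions are imposed by dropping boundary nodes."""
    h = 1.0 / n_el
    n = n_el + 1
    mass = np.zeros((n, n))
    stiff = np.zeros((n, n))
    m_loc = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])     # local mass matrix
    k_loc = 1.0 / h * np.array([[1.0, -1.0], [-1.0, 1.0]])   # local stiffness matrix
    for e in range(n_el):
        midpoint = (e + 0.5) * h
        c = coef_left if midpoint < 0.5 else coef_right
        idx = np.array([e, e + 1])
        mass[np.ix_(idx, idx)] += c * m_loc
        stiff[np.ix_(idx, idx)] += c * k_loc
    return mass[1:-1, 1:-1], stiff[1:-1, 1:-1]

n_el = 16                                 # even, so x = 0.5 is a node
M, _ = assemble_1d(n_el, 1.0, 1.0)        # M_h: plain L2 inner product
K, _ = assemble_1d(n_el, 1.0, 2.0)        # K_h: illustrative sigma_1, sigma_2
L, _ = assemble_1d(n_el, 0.5, 1.5)        # L_h: illustrative delta_1, delta_2
_, C = assemble_1d(n_el, 1.0, 10.0)       # C_h: illustrative epsilon_1, epsilon_2
_, D = assemble_1d(n_el, 2.0, 5.0)        # D_h: illustrative beta_1, beta_2

# Cholesky succeeds only for symmetric positive definite matrices,
# so this confirms numerically that M_h is invertible.
np.linalg.cholesky(M)

# First-order form of (4.3) for y = [c, c']: y' = A y + g(t).
n = M.shape[0]
M_inv = np.linalg.inv(M)
A = np.block([[np.zeros((n, n)), np.eye(n)],
              [-M_inv @ (L + D), -M_inv @ (K + C)]])
```

Since `A` is a constant matrix, the first-order system is linear with constant coefficients, which is the setting in which the standard existence and uniqueness theory for ODEs is applied.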

Remark 4.1

Assuming \(f\in C^1(J; W)\) and setting

$$\begin{aligned} \mathbf{c}_h^{\prime \prime }(0)=M_h^{-1}F_h(0)-M_h^{-1}K_h\mathbf{c}_{h, 1}-M_h^{-1}L_h\mathbf{c}_{h, 0}-M_h^{-1}C_h\mathbf{c}_{h, 1}-M_h^{-1}D_h\mathbf{c}_{h, 0}, \end{aligned}$$

we further observe that \(u_h\in C^3(J; V_h).\) The next lemma assumes \(f\in H^3(J; H^2(\Omega ))\), which guarantees the existence of \(u_h\in C^4(J; V_h)\) satisfying (4.1).

Regarding the stability of \(u_h\) at the initial stage, we have the following result. For a proof, we refer to the Appendix.

Lemma 4.1

Let \(u_h\) satisfy (4.1). Then, for \(i=2, 3, 4\), we have

$$\begin{aligned} \Vert D^i_tu_h(0)\Vert \le C\big ( \Vert u_0\Vert _{2i-2}+\Vert v_0\Vert _{2i-2}+\Vert f\Vert _{H^{i-1}(H^2)}\big ),\\ \Vert D^{i-1}_tu_h(0)\Vert _1\le C\big ( \Vert u_0\Vert _{2i-3}+\Vert v_0\Vert _{2i-3}+\Vert f\Vert _{H^{i-2}(H^1)}\big ), \end{aligned}$$

where \(D^i_t=\frac{\partial ^i}{\partial t^i}\).

Differentiating (4.1) twice with respect to t and substituting \(v_h=u_h^{\prime \prime \prime }\), we obtain

$$\begin{aligned}&\frac{1}{2}\frac{d}{dt}\Big \{\Vert u_h^{\prime \prime \prime }\Vert ^2+\mathcal{B}_{\delta h}(u_h^{\prime \prime },u_h^{\prime \prime })+\mathcal{A}_{\beta h}(u_h^{\prime \prime },u_h^{\prime \prime })\Big \}+\mathcal{B}_{\sigma h}(u_h^{\prime \prime \prime },u_h^{\prime \prime \prime })+\mathcal{A}_{\epsilon h}(u_h^{\prime \prime \prime },u_h^{\prime \prime \prime })\nonumber \\&\quad =(f^{\prime \prime },u_h^{\prime \prime \prime }). \end{aligned}$$
(4.4)

Integrating from 0 to t and using standard arguments, we obtain

$$\begin{aligned}&\Vert u_h^{\prime \prime }(t)\Vert ^2+\Vert u_h^{\prime \prime \prime }(t)\Vert ^2+\Vert u_h^{\prime \prime }(t)\Vert _1^2+\int _{0}^{t}\Vert u_h^{\prime \prime \prime }\Vert ^2dt+\int _{0}^{t}\Vert u_h^{\prime \prime \prime }\Vert _1^2dt\nonumber \\&\quad \le C\Big (\Vert u_h^{\prime \prime }(0)\Vert ^2+\Vert u_h^{\prime \prime \prime }(0)\Vert ^2+\Vert u_h^{\prime \prime }(0)\Vert _1^2+\int _{0}^{t}\Vert f^{\prime \prime }\Vert ^2dt \Big ). \end{aligned}$$

Using Lemma 4.1 in the above equation, we get

$$\begin{aligned} \Vert u_h^{\prime \prime \prime }\Vert ^2+\Vert u_h^{\prime \prime }\Vert _1^2+\int _{0}^{t}\Vert u_h^{\prime \prime \prime }\Vert _1^2dt\le C\Big (\Vert u_0\Vert _4^2+\Vert v_0\Vert _4^2+\Vert f\Vert ^2_{H^2(H^2)} \Big ). \end{aligned}$$
(4.5)

Similarly, we obtain

$$\begin{aligned} \Vert u_h^{\prime \prime \prime \prime }\Vert ^2+\Vert u_h^{\prime \prime \prime }\Vert _1^2+\int _{0}^{t}\Vert u_h^{\prime \prime \prime \prime }\Vert _1^2dt\le C\Big (\Vert u_0\Vert _6^2+\Vert v_0\Vert _6^2+\Vert f\Vert ^2_{H^3(H^2)} \Big ). \end{aligned}$$
(4.6)

Now, we prove the convergence result for the semidiscrete scheme in \(L^\infty (L^2)\) norm.

Theorem 4.2

Let u and \(u_h\) be the solutions of problems (1.2)–(1.4) and (4.1), respectively. Then, for \(u_0,\; v_0 \in \mathcal{Y}\) and \(f\in L^2(J;W)\), we have

$$\begin{aligned} \Vert u-u_h\Vert _{L^\infty (J;L^2(\Omega ))}\le C(u)\bigg (h+\sqrt{\lambda }+\frac{\lambda }{h}\bigg )^2, \end{aligned}$$
(4.7)

where \(C(u):=C\bigg \{\Vert u_0\Vert _\mathcal{Y}^2+\Vert v_0\Vert _\mathcal{Y}^2+\Vert u\Vert _{H^1(\mathcal{Y})}\bigg \}^{\frac{1}{2}}.\)

Proof

Define the error as \(e(t):=u(t)-u_h(t)\). Subtracting (4.1) from (2.3) and rearranging, we obtain

$$\begin{aligned}&(e^{\prime \prime },v_h)+ \mathcal{B}_{\sigma h}(e^\prime ,v_h)+\mathcal{B}_{\delta h}(e,v_h)+\mathcal{A}_{\epsilon h}(e^\prime ,v_h)+\mathcal{A}_{\beta h}(e,v_h)~~~~~\nonumber \\&\quad = -\mathcal{B}_{\sigma h}^\Delta (u^\prime ,v_h)-\mathcal{B}_{\delta h}^\Delta (u,v_h)-\mathcal{A}_{\epsilon h}^\Delta (u^\prime ,v_h)-\mathcal{A}_{\beta h}^\Delta (u,v_h)\;\forall v_h\in V_h.~~~~~ \end{aligned}$$
(4.8)

Now, we split e(t) in the standard way into \(\rho \) and \(\theta \) as

$$\begin{aligned} e=\rho +\theta ,\;\rho :=u-\xi _u,\;\theta :=\xi _u-u_h, \end{aligned}$$
(4.9)

where \(\xi _u\) is the projection of u defined by (3.20).

Then Eq. (4.8) reduces to

$$\begin{aligned}&(\theta ^{\prime \prime },v_h)+\mathcal{B}_{\sigma h}(\theta ^\prime ,v_h)+\mathcal{B}_{\delta h}(\theta ,v_h)+ \mathcal{A}_{\epsilon h}(\theta ^\prime ,v_h)+\mathcal{A}_{\beta h}(\theta ,v_h)~~~~\nonumber \\&\quad =-(\rho ^{\prime \prime },v_h)-\mathcal{B}_{\sigma h}(\rho ^\prime ,v_h)-\mathcal{B}_{\delta h}(\rho ,v_h)-\mathcal{A}_{\epsilon h}(\rho ^\prime ,v_h)-\mathcal{A}_{\beta h}(\rho ,v_h) \nonumber \\&\;\;\;\;-\mathcal{B}_{\sigma h}^\Delta (u^\prime ,v_h)-\mathcal{B}_{\delta h}^\Delta (u,v_h)-\mathcal{A}_{\epsilon h}^\Delta (u^\prime ,v_h)-\mathcal{A}_{\beta h}^\Delta (u,v_h)\;\forall v_h\in V_h.~~~~~ \end{aligned}$$
(4.10)

Using the definition of \(\xi _u\), we observe that

$$\begin{aligned}&\mathcal{A}_{\epsilon h}(\rho ^\prime ,v_h)+\mathcal{A}_{\beta h}(\rho ,v_h)\\&\quad =\mathcal{A}_{\epsilon h}(u^\prime ,v_h)+\mathcal{A}_{\beta h}(u,v_h)-\big \{\mathcal{A}_{\epsilon h}(\xi ^\prime _u,v_h)+\mathcal{A}_{\beta h}(\xi _u,v_h)\big \} \\&\quad =\mathcal{A}_{\epsilon h}(u^\prime ,v_h)+\mathcal{A}_{\beta h}(u,v_h)-\big \{\mathcal{A}_{\epsilon }(u^\prime ,v_h)+\mathcal{A}_{\beta }(u,v_h)\big \}. \end{aligned}$$

The above identity together with (4.10) leads to

$$\begin{aligned}&(\theta ^{\prime \prime },v_h)+\mathcal{B}_{\delta h}(\theta ,v_h)+\mathcal{A}_{\epsilon h}(\theta ^\prime ,v_h)+\mathcal{A}_{\beta h}(\theta ,v_h)\nonumber \\&\quad =-(\rho ^{\prime \prime },v_h)-\mathcal{B}_{\sigma h}(e^\prime ,v_h)-\mathcal{B}_{\delta h}(\rho ,v_h)\nonumber \\&\;\;\;\;-\mathcal{B}_{\sigma h}^\Delta (u^\prime ,v_h)-\mathcal{B}_{\delta h}^\Delta (u,v_h)\;\forall v_h\in V_h, \end{aligned}$$
(4.11)

which can be rewritten as

$$\begin{aligned}&\frac{d}{dt}(\theta ^\prime ,v_h)-(\theta ^\prime ,v_h^\prime )+\mathcal{B}_{\delta h}(\theta ,v_h)+\frac{d}{dt}\mathcal{A}_{\epsilon h}(\theta ,v_h)-\mathcal{A}_{\epsilon h}(\theta ,v_h^\prime )+\mathcal{A}_{\beta h}(\theta ,v_h)\nonumber \\&\quad =-\frac{d}{dt}(\rho ^\prime ,v_h)+(\rho ^\prime ,v_h^\prime )-\frac{d}{dt}\mathcal{B}_{\sigma h}(e,v_h)+\mathcal{B}_{\sigma h}(e,v_h^\prime )-\mathcal{B}_{\delta h}(\rho ,v_h)\nonumber \\&\;\;\;\;-\mathcal{B}_{\sigma h}^\Delta (u^\prime ,v_h)-\mathcal{B}_{\delta h}^\Delta (u,v_h)\;\forall v_h\in V_h. \end{aligned}$$
(4.12)

Following Baker [4], we define \( \hat{v}: [0,T]\times \Omega \rightarrow \mathbb {R} \) as

$$\begin{aligned} \hat{v}(.,t)=\int _{t}^{\zeta }\theta (.,s)ds\;,\;\;\;0\le t\le T, \end{aligned}$$

for some fixed \(\zeta \in [0,T]\). Then, clearly \( \hat{v}\in V_h \) as \( \theta =\xi _u-u_h\in V_h \). Also, observe that

$$\begin{aligned} \hat{v}(.,\zeta )=0\;\;\text{ and }\;\;\frac{d}{dt}\hat{v}(.,t)=-\theta (.,t)\;,\;\;0\le t\le T. \end{aligned}$$
(4.13)

Setting \(v_h=\hat{v}\) in (4.12) and making some rearrangements, we obtain

$$\begin{aligned}&\frac{d}{dt}(\theta ^\prime ,\hat{v})+\frac{1}{2}\frac{d}{dt}(\theta ,\theta )+\mathcal{B}_{\sigma h}(\theta ,\theta )-\frac{1}{2}\frac{d}{dt}\mathcal{B}_{\delta h}(\hat{v},\hat{v})\\&\qquad +\frac{d}{dt}\mathcal{A}_{\epsilon h}(\theta ,\hat{v})+\mathcal{A}_{\epsilon h}(\theta ,\theta )-\frac{1}{2}\frac{d}{dt}\mathcal{A}_{\beta h}(\hat{v},\hat{v})\\&\quad =-\frac{d}{dt}(\rho ^\prime ,\hat{v})-(\rho ^\prime ,\theta )-\frac{d}{dt}\mathcal{B}_{\sigma h}(e,\hat{v})-\mathcal{B}_{\sigma h}(\rho ,\theta )-\mathcal{B}_{\delta h}(\rho ,\hat{v})\\&\qquad \;\;\;\;-\mathcal{B}_{\sigma h}^\Delta (u^\prime ,\hat{v})-\mathcal{B}_{\delta h}^\Delta (u,\hat{v}). \end{aligned}$$

Integrating from 0 to \( \zeta \) and using \( \hat{v}(\zeta )=0 \), we get

$$\begin{aligned}&-(\theta ^\prime (0),\hat{v}(0))+\frac{1}{2}\Vert \theta (\zeta )\Vert ^2-\frac{1}{2}\Vert \theta (0)\Vert ^2+\int _{0}^{\zeta }\mathcal{B}_{\sigma h}(\theta ,\theta )ds+\frac{1}{2}\mathcal{B}_{\delta h}(\hat{v}(0),\hat{v}(0))\nonumber \\&\qquad -\mathcal{A}_{\epsilon h}(\theta (0),\hat{v}(0))+\int _{0}^{\zeta }\mathcal{A}_{\epsilon h}(\theta ,\theta )ds+\frac{1}{2}\mathcal{A}_{\beta h}(\hat{v}(0),\hat{v}(0))\nonumber \\&\quad =(\rho ^\prime (0),\hat{v}(0))-\int _{0}^{\zeta }(\rho ^\prime ,\theta )ds+\mathcal{B}_{\sigma h}(e(0),\hat{v}(0))-\int _{0}^{\zeta }\mathcal{B}_{\sigma h}(\rho ,\theta )ds\nonumber \\&\quad \;\;\;\;-\int _{0}^{\zeta }\mathcal{B}_{\delta h}(\rho ,\hat{v})ds-\int _{0}^{\zeta }\mathcal{B}_{\sigma h}^\Delta (u^\prime ,\hat{v})ds-\int _{0}^{\zeta }\mathcal{B}_{\delta h}^\Delta (u,\hat{v})ds. \end{aligned}$$
(4.14)

Observe that \(\theta (0)=\xi _u(0)-u_h(0)=\mathcal{Q}_hu(0)-\mathcal{Q}_hu_0=0\), hence (4.14) becomes

$$\begin{aligned}&\frac{1}{2}\Vert \theta (\zeta )\Vert ^2+\int _{0}^{\zeta }\Vert \theta \Vert ^2ds+\int _{0}^{\zeta }\Vert \theta \Vert _1^2ds+\frac{1}{2}\Vert \hat{v}(0)\Vert _1^2\nonumber \\&\quad \le (e^\prime (0),\hat{v}(0))-\int _{0}^{\zeta }(\rho ^\prime ,\theta )ds+\mathcal{B}_{\sigma h}(e(0),\hat{v}(0))-\int _{0}^{\zeta }\mathcal{B}_{\sigma h}(\rho ,\theta )ds\nonumber \\&\qquad \;\;\;\;-\int _{0}^{\zeta }\mathcal{B}_{\delta h}(\rho ,\hat{v})ds-\int _{0}^{\zeta }\mathcal{B}_{\sigma h}^\Delta (u^\prime ,\hat{v})ds-\int _{0}^{\zeta }\mathcal{B}_{\delta h}^\Delta (u,\hat{v})ds. \end{aligned}$$
(4.15)

Then the Cauchy-Schwarz inequality, Lemma 3.2 and the continuity of \(\mathcal{B}_{h}\) lead to

$$\begin{aligned}&\frac{1}{2}\Vert \theta (\zeta )\Vert ^2+\int _{0}^{\zeta }\Vert \theta \Vert ^2ds+\int _{0}^{\zeta }\Vert \theta \Vert _1^2ds+\frac{1}{2}\Vert \hat{v}(0)\Vert _1^2\nonumber \\&\quad \le C\Bigg ( \Vert e^\prime (0)\Vert \Vert \hat{v}(0)\Vert +\int _{0}^{\zeta }\Vert \rho ^\prime \Vert \Vert \theta \Vert ds+\Vert e(0)\Vert \Vert \hat{v}(0)\Vert +\int _{0}^{\zeta }\Vert \rho \Vert \Vert \theta \Vert ds\nonumber \\&\;\;\;\;\;\;\;\;\;+\int _{0}^{\zeta }\Vert \rho \Vert \Vert \hat{v}\Vert ds+\bigg (h+\sqrt{\lambda }+\frac{\lambda }{h}\bigg )^2\int _{0}^{\zeta }\Vert u^\prime \Vert _\mathcal{Y}\Vert \hat{v}\Vert ds\nonumber \\&\;\;\;\;\;\;\;\;\;+\bigg (h+\sqrt{\lambda }+\frac{\lambda }{h}\bigg )^2\int _{0}^{\zeta }\Vert u\Vert _\mathcal{Y}\Vert \hat{v}\Vert ds\Bigg ). \end{aligned}$$
(4.16)

Since \(\theta \) is continuous in the time variable, we select \(\zeta \) such that \(\Vert \theta (\zeta )\Vert =\max _{0\le t\le T}\Vert \theta (t)\Vert \). Then we observe that \(\Vert \hat{v}(t)\Vert \le C(T)\Vert \theta (\zeta )\Vert , \;t\in [0, T]\), which in combination with (4.16) leads to

$$\begin{aligned} \Vert \theta (\zeta )\Vert\le & {} C\Bigg ( \Vert e^\prime (0)\Vert +\Vert e(0)\Vert +\int _{0}^{\zeta }(\Vert \rho ^\prime \Vert +\Vert \rho \Vert )ds\\&\;\;\;\;+\bigg (h+\sqrt{\lambda }+\frac{\lambda }{h}\bigg )^2\int _{0}^{\zeta }(\Vert u^\prime \Vert _\mathcal{Y}+\Vert u\Vert _\mathcal{Y})ds\Bigg ). \end{aligned}$$

This together with Lemma 3.6 leads to Theorem 4.2. \(\square \)

Remark 4.2

Theorem 4.2 is an extension of Theorem 5.6 in [25] to a general linear hyperbolic equation with an interface. It is worth noting that Theorem 5.6 in [25] is concerned with the convergence of the finite element solution to the exact solution of the linearized Westervelt equation with variable coefficients and without an interface.

5 Fully Discrete Scheme

This section is dedicated to the derivation of the \(L^2\) norm error estimate. The basic technique used here is borrowed from Baker [4] with a modification to include the damping term.

First we divide the time interval \(I=[0,T]\) into N equally spaced subintervals \(I_n=(t_{n-1},t_n]\), \(n=1,2,\ldots ,N\) with \(t_0=0\), and \(t_N=T\) and \(\tau =t_n-t_{n-1}\), the time step. For a sequence \(\{p^n\}_{n=0}^N \subset L^2(\Omega )\), we define

$$\begin{aligned} \partial _\tau p^n=\frac{p^{n+1}-p^n}{\tau }\;\;\;\;\text{ and }\;\;\;\;p^{n+\frac{1}{2}}=\frac{1}{2}(p^{n+1}+p^n),\;\;n=0,1,\ldots ,N-1. \end{aligned}$$

Also, for a continuous mapping \(\phi :[0,T]\rightarrow L^2(\Omega )\), we define \(\phi ^n=\phi (.,t_n)\), \( 0\le n\le N \). Then the fully discrete finite element approximation to the problem (1.2)–(1.4) is defined as follows: Find \(U^n\in V_h\) such that

$$\begin{aligned} \partial _\tau U^n= p^{n+\frac{1}{2}}\;\;\text{ for }\;\;n=0,\;1,\ldots ,\;N-1 \end{aligned}$$
(5.1)

and

$$\begin{aligned}&(\partial _\tau p^n,\psi )+\mathcal{B}_{\sigma h}(p^{n+\frac{1}{2}},\psi )+\mathcal{B}_{\delta h}( U^{n+\frac{1}{2}},\psi )+\mathcal{A}_{\epsilon h}(p^{n+\frac{1}{2}},\psi )+ \mathcal{A}_{\beta h}(U^{n+\frac{1}{2}},\psi )\nonumber \\&\quad =(f^{n+\frac{1}{2}},\psi )\;\;\forall \psi \in V_h, \end{aligned}$$
(5.2)

with \(U^0=\mathcal{Q}_hu_0\) and \(p^0=\mathcal{Q}_hv_0\).

The following lemma establishes the existence and uniqueness of the fully discrete approximation \(U^n\) to \(u^n\) in terms of the auxiliary variable \(p^n\) and, in fact, provides a computational algorithm for finding \(U^n\).

Lemma 5.1

There exists a unique sequence \(\{U^n\}_{n=0}^N\subset V_h\) and a corresponding unique sequence \(\{p^n\}_{n=0}^N\subset V_h\) satisfying (5.1)–(5.2).

Proof

From (5.1), we have

$$\begin{aligned} U^{n+1}=\frac{\tau }{2}(p^{n+1}+p^n)+U^n. \end{aligned}$$
(5.3)

Using (5.3) in (5.2), we get

$$\begin{aligned} \mathcal{A}_\tau (p^{n+1},\psi )=\mathcal{F}^n(\psi )\;\;\forall \psi \in V_h, \end{aligned}$$
(5.4)

where \(\mathcal{A}_\tau \) is the bilinear form given by

$$\begin{aligned} \mathcal{A}_\tau (w,v)= & {} (w,v)+\frac{\tau }{2}\mathcal{B}_{\sigma h}(w,v)+\frac{\tau ^2}{4}\mathcal{B}_{\delta h}(w,v)\\&+\frac{\tau }{2}\mathcal{A}_{\epsilon h}(w,v)+\frac{\tau ^2}{4}\mathcal{A}_{\beta h}(w,v)\;\;\forall w, v\in V_h \end{aligned}$$

and \(\mathcal{F}^n\) is the linear functional given by

$$\begin{aligned} \mathcal{F}^n(\psi )= & {} (p^n,\psi )-\frac{\tau }{2}\mathcal{B}_{\sigma h}(p^{n},\psi )-\tau \mathcal{B}_{\delta h}(U^n,\psi )-\frac{\tau ^2}{4}\mathcal{B}_{\delta h}(p^{n},\psi )\\&-\frac{\tau }{2}\mathcal{A}_{\epsilon h}(p^{n},\psi )-\frac{\tau ^2}{4}\mathcal{A}_{\beta h}(p^{n},\psi )-\tau \mathcal{A}_{\beta h}(U^{n},\psi )+\tau (f^{n+\frac{1}{2}},\psi )\;\forall \psi \in V_h. \end{aligned}$$

Due to the positivity of bilinear forms \(\mathcal{B}_{h}\) and \(\mathcal{A}_{h}\), there exists uniquely defined \(p^{n+1}\in V_h\) satisfying equation (5.4) and subsequently \(U^{n+1}\) exists uniquely for \(n=0,1,\ldots ,N-1 \). \(\square \)
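
In matrix form, the algorithm contained in the proof above amounts to one linear solve per time step. The Python sketch below is one possible realization under stated assumptions, not a definitive implementation: the name `cn_step` is hypothetical, the matrices are grouped as `G` \(=K_h+C_h\) and `S` \(=L_h+D_h\) purely for brevity, and the scalar damped oscillator used to exercise the step (with known exact solution) is chosen only for illustration.

```python
import numpy as np

def cn_step(M, G, S, p, U, f_mid, tau):
    """One step of (5.3)-(5.4) in matrix form.
    M: mass matrix, G = K_h + C_h, S = L_h + D_h (grouping is an assumption
    made here for brevity), f_mid: load vector built from f^{n+1/2}."""
    lhs = M + 0.5 * tau * G + 0.25 * tau**2 * S             # matrix of A_tau
    rhs = (M @ p - 0.5 * tau * (G @ p) - tau * (S @ U)
           - 0.25 * tau**2 * (S @ p) + tau * f_mid)         # vector of F^n
    p_new = np.linalg.solve(lhs, rhs)                       # solve (5.4)
    U_new = U + 0.5 * tau * (p_new + p)                     # update (5.3)
    return p_new, U_new

def final_error(n_steps, T=1.0, a=0.2, s=4.0):
    """Illustrative scalar test: u'' + 2a u' + s u = 0, u(0) = 1, u'(0) = 0."""
    M = np.array([[1.0]])
    G = np.array([[2.0 * a]])
    S = np.array([[s]])
    U = np.array([1.0])
    p = np.array([0.0])
    tau = T / n_steps
    for _ in range(n_steps):
        p, U = cn_step(M, G, S, p, U, np.zeros(1), tau)
    omega = np.sqrt(s - a**2)
    exact = np.exp(-a * T) * (np.cos(omega * T) + a / omega * np.sin(omega * T))
    return abs(U[0] - exact)

err_coarse = final_error(50)
err_fine = final_error(100)
ratio = err_coarse / err_fine   # expected to be close to 4 for a second-order scheme
```

Halving \(\tau \) should reduce the error by a factor of roughly four, consistent with the second-order accuracy in time that the analysis of this section aims at.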

Later on, we will need the following results. The proofs involve the use of Taylor’s series and standard arguments, and therefore, details are omitted.

Lemma 5.2

For any \(v\in H^3(J;L^2(\Omega ))\), we have

$$\begin{aligned} \Vert \partial _\tau v^n-v_t^{n+\frac{1}{2}}\Vert ^2\le C\tau ^3\int _{t_n}^{t_{n+1}}\Vert v^{\prime \prime \prime }\Vert ^2dt. \end{aligned}$$
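
Since the integral in Lemma 5.2 is taken over an interval of length \(\tau \), the bound says that \(\partial _\tau v^n-v_t^{n+\frac{1}{2}}\) is of order \(\tau ^2\) for smooth v. The following Python sketch checks this scaling numerically for a scalar function. The choice \(v(t)=\sin t\), the evaluation point, and the helper name `defect` are assumptions made for illustration only.

```python
import math

def defect(t, tau):
    """|d_tau v^n - v_t^{n+1/2}| for the illustrative choice v(t) = sin(t),
    where v_t^{n+1/2} is the average of v' at t and t + tau."""
    forward_diff = (math.sin(t + tau) - math.sin(t)) / tau
    averaged_derivative = 0.5 * (math.cos(t) + math.cos(t + tau))
    return abs(forward_diff - averaged_derivative)

t_n = 0.3
d_coarse = defect(t_n, 0.1)
d_fine = defect(t_n, 0.05)
ratio = d_coarse / d_fine   # close to 4 when the defect behaves like tau^2
```

A Taylor expansion about \(t_n\) shows that the leading term of the defect is \(\frac{\tau ^2}{12}|v^{\prime \prime \prime }(t_n)|\), which is why the third derivative appears on the right-hand side of the lemma.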

In order to estimate the error between \(U^n\) and \(u^n\), it suffices to estimate \(\omega ^n:=u_h^n-U^n\), for \(1\le n\le N\), where \(u_h^n=u_h(\cdot , t_n)\). Once we have an estimate for \(\omega ^n\), the error estimate for \(e^n:=U^n-u^n\) follows from the triangle inequality, Theorem 4.2 and Lemma 5.3 below.

Lemma 5.3

Let u and \(U^n\) be the solutions of the interface problem (1.2)–(1.4) and the finite element approximation (5.1)–(5.2), respectively. Then, we have

$$\begin{aligned} \max _{1\le n\le N}\Vert \omega ^n\Vert ^2\le C \tau ^4\Big (\int _{0}^{T}\Vert u_{h}^{\prime \prime \prime \prime }\Vert ^2dt+\int _{0}^{T}\Vert u_{h}^{\prime \prime \prime }\Vert _1^2dt\Big ). \end{aligned}$$

Proof

Substituting \(t=t_n\) and \(t=t_{n+1}\) in (4.1) and averaging the two resulting equations, we obtain

$$\begin{aligned}&(\partial _\tau u_{ht}^n,\psi )+\mathcal{B}_{\sigma h}(u_{ht}^{n+\frac{1}{2}},\psi )+\mathcal{B}_{\delta h}(u_h^{n+\frac{1}{2}},\psi )+\mathcal{A}_{\epsilon h}(u_{ht}^{n+\frac{1}{2}},\psi )+\mathcal{A}_{\beta h}(u_h^{n+\frac{1}{2}},\psi )\nonumber \\&\quad =(f^{n+\frac{1}{2}},\psi )+(\rho ^n,\psi )\;\;\forall \psi \in V_h,~~~ \end{aligned}$$
(5.5)

where \(\rho ^n:=\partial _\tau u_{ht}^n-u_{htt}^{n+\frac{1}{2}}\).

Now, subtracting (5.2) from (5.5), we have

$$\begin{aligned}&(\partial _\tau q^n,\psi )+\mathcal{B}_{\sigma h}(q^{n+\frac{1}{2}},\psi )+\mathcal{B}_{\delta h}(\omega ^{n+\frac{1}{2}},\psi )+\mathcal{A}_{\epsilon h}(q^{n+\frac{1}{2}},\psi )+\mathcal{A}_{\beta h}(\omega ^{n+\frac{1}{2}},\psi )\nonumber \\&\quad =(\rho ^n,\psi )\;\;\forall \psi \in V_h, \end{aligned}$$
(5.6)

with \(q^n:=u_{ht}^n-p^n\).

From (5.1), it is easy to observe that

$$\begin{aligned} \partial _\tau \omega ^n=q^{n+\frac{1}{2}}+\partial _\tau u_h^n-u_{ht}^{n+\frac{1}{2}}=q^{n+\frac{1}{2}}+\alpha ^n,\; \alpha ^n:=\partial _\tau u_h^n-u_{ht}^{n+\frac{1}{2}}, \end{aligned}$$
(5.7)

so that

$$ \begin{aligned} \omega ^n=\tau \sum _{k=0}^{n-1}\partial _\tau \omega ^k=\tau \sum _{k=0}^{n-1}q^{k+\frac{1}{2}}+\tau \sum _{k=0}^{n-1}\alpha ^k\;\; \& \;\; q^n=\tau \sum _{k=0}^{n-1}\partial _\tau q^k. \end{aligned}$$

Here, we have used the fact that \(\omega ^0=u_h^0-U^0=\mathcal{Q}_hu_0-\mathcal{Q}_hu_0=0\) and \(q^0=u_{ht}^0-p^0=\mathcal{Q}_hv_0-\mathcal{Q}_hv_0=0.\)

Hence, using the above relations it follows that

$$\begin{aligned} \partial _\tau \omega ^n= & {} \frac{\tau }{2}\Bigg (\sum _{k=0}^{n}\partial _\tau q^k+\sum _{k=0}^{n-1}\partial _\tau q^k\Bigg )+\alpha ^n, \end{aligned}$$
(5.8)
$$\begin{aligned} \omega ^{n+\frac{1}{2}}= & {} \frac{\tau }{2}\Bigg (\sum _{k=0}^{n}q^{k+\frac{1}{2}}+\sum _{k=0}^{n-1}q^{k+\frac{1}{2}}\Bigg )+\frac{\tau }{2}\Bigg (\sum _{k=0}^{n}\alpha ^k+\sum _{k=0}^{n-1}\alpha ^k\Bigg ). \end{aligned}$$
(5.9)

Now, we define a sequence \(\{s^n\}_{n=0}^N\) such that \(s^0=0\) and

$$\begin{aligned} s^n=\tau \sum _{k=0}^{n-1} \omega ^{k+\frac{1}{2}},\;n=1,\ldots ,N-1, \end{aligned}$$

so that

$$\begin{aligned} s^{n+\frac{1}{2}}=\frac{\tau }{2}\Bigg (\sum _{k=0}^{n}\omega ^{k+\frac{1}{2}}+\sum _{k=0}^{n-1}\omega ^{k+\frac{1}{2}}\Bigg ). \end{aligned}$$
(5.10)

Hence, for any \(\psi \in V_h\), using the identities (5.8)–(5.10) we obtain

$$\begin{aligned}&(\partial _\tau \omega ^n,\psi )+\mathcal{B}_{\sigma h}(\omega ^{n+\frac{1}{2}},\psi )+\mathcal{B}_{\delta h}(s^{n+\frac{1}{2}},\psi )+\mathcal{A}_{\epsilon h}(\omega ^{n+\frac{1}{2}},\psi )+\mathcal{A}_{\beta h}(s^{n+\frac{1}{2}},\psi )\\&\quad =\frac{\tau }{2}\sum _{k=0}^{n}\Big \{(\partial _\tau q^k,\psi )+\mathcal{B}_{\sigma h}(q^{k+\frac{1}{2}},\psi )+\mathcal{B}_{\delta h}(\omega ^{k+\frac{1}{2}},\psi )+\mathcal{A}_{\epsilon h}(q^{k+\frac{1}{2}},\psi )\\&\;\;\;\;+\mathcal{A}_{\beta h}(\omega ^{k+\frac{1}{2}},\psi )\Big \}+\frac{\tau }{2}\sum _{k=0}^{n-1}\Big \{(\partial _\tau q^k,\psi )+\mathcal{B}_{\sigma h}(q^{k+\frac{1}{2}},\psi )+\mathcal{B}_{\delta h}(\omega ^{k+\frac{1}{2}},\psi )\\&\;\;\;\;+\mathcal{A}_{\epsilon h}(q^{k+\frac{1}{2}},\psi )+\mathcal{A}_{\beta h}(\omega ^{k+\frac{1}{2}},\psi )\Big \}+(\alpha ^n,\psi )+\frac{\tau }{2}\mathcal{B}_{\sigma h}\Big (\sum _{k=0}^{n}\alpha ^k+\sum _{k=0}^{n-1}\alpha ^k,\psi \Big )\\&\;\;\;\;+\frac{\tau }{2}\mathcal{A}_{\epsilon h}\Big (\sum _{k=0}^{n}\alpha ^k+\sum _{k=0}^{n-1}\alpha ^k,\psi \Big ). \end{aligned}$$

Using (5.6), for \(1\le n\le N-1\), we derive

$$\begin{aligned}&(\partial _\tau \omega ^n,\psi )+\mathcal{B}_{\sigma h}(\omega ^{n+\frac{1}{2}},\psi )+\mathcal{B}_{\delta h}(s^{n+\frac{1}{2}},\psi )+\mathcal{A}_{\epsilon h}(\omega ^{n+\frac{1}{2}},\psi )+\mathcal{A}_{\beta h}(s^{n+\frac{1}{2}},\psi )\nonumber \\&\quad =(T_1^n,\psi )+\mathcal{B}_{\sigma h}(T_2^n,\psi )+\mathcal{A}_{\epsilon h}(T_2^n,\psi )\;\forall \psi \in V_h,~~~ \end{aligned}$$
(5.11)

where

$$ \begin{aligned} T_1^n:=\frac{\tau }{2}\rho ^n+\tau \sum _{k=0}^{n-1}\rho ^k+\alpha ^n\;\; \& \;\;T_2^n:=\frac{\tau }{2}\alpha ^n+\tau \sum _{k=0}^{n-1}\alpha ^k. \end{aligned}$$

Substituting \( \psi =\omega ^{n+\frac{1}{2}}=\partial _\tau s^n \) in (5.11) and making some rearrangements, we arrive at

$$\begin{aligned}&(\omega ^{n+1},\omega ^{n+1})+2\tau \mathcal{B}_{\sigma h}(\omega ^{n+\frac{1}{2}},\omega ^{n+\frac{1}{2}})+\mathcal{B}_{\delta h}(s^{n+1},s^{n+1})+2\tau \mathcal{A}_{\epsilon h}(\omega ^{n+\frac{1}{2}},\omega ^{n+\frac{1}{2}})\\&\quad +\mathcal{A}_{\beta h}(s^{n+1},s^{n+1})=(\omega ^n,\omega ^n)+\mathcal{B}_{\delta h}(s^n,s^n)+\mathcal{A}_{\beta h}(s^n,s^n)+2\tau (T_1^n,\omega ^{n+\frac{1}{2}})\\&\;\;\;\;\;\;\;\;\;\;\;\;+2\tau \mathcal{B}_{\sigma h}(T^n_2,\omega ^{n+\frac{1}{2}})+2\tau \mathcal{A}_{\epsilon h}(T^n_2,\omega ^{n+\frac{1}{2}}). \end{aligned}$$

Next, using the Cauchy–Schwarz inequality together with the coercivity and continuity of the bilinear maps \(\mathcal{B}\) and \(\mathcal{A},\) we obtain

$$\begin{aligned}&(\omega ^{n+1},\omega ^{n+1})+2\tau \Vert \omega ^{n+\frac{1}{2}}\Vert ^2+\mathcal{B}_{\delta h}(s^{n+1},s^{n+1})+2\tau \Vert \omega ^{n+\frac{1}{2}}\Vert _1^2+\mathcal{A}_{\beta h}(s^{n+1},s^{n+1})\\&\quad \le (\omega ^n,\omega ^n)+\mathcal{B}_{\delta h}(s^n,s^n)+\mathcal{A}_{\beta h}(s^n,s^n)+2\tau \Vert T^n_1\Vert \Vert \omega ^{n+\frac{1}{2}}\Vert +2\tau \Vert T^n_2\Vert \Vert \omega ^{n+\frac{1}{2}}\Vert \\&\quad \;\;\;+2\tau \Vert T^n_2\Vert _1\Vert \omega ^{n+\frac{1}{2}}\Vert _1. \end{aligned}$$

Finally, applying Young’s inequality \( ab\le \kappa a^2+\frac{1}{\kappa }b^2 \) for \(a, b>0 \) and choosing \( \kappa >0 \) appropriately, the above relation leads to

$$\begin{aligned}&(\omega ^{n+1},\omega ^{n+1})+\mathcal{B}_{\delta h}(s^{n+1},s^{n+1})+\mathcal{A}_{\beta h}(s^{n+1},s^{n+1})\nonumber \\&\quad \le (\omega ^n,\omega ^n)+\mathcal{B}_{\delta h}(s^n,s^n)+\mathcal{A}_{\beta h}(s^n,s^n)\nonumber \\&\quad \;\;\;\;+2\tau \Big (\Vert T^n_1\Vert ^2+\Vert T^n_2\Vert ^2+\Vert T^n_2\Vert ^2_1\Big ). \end{aligned}$$
(5.12)

Summing (5.12) from \( n=1 \) to \(n=l-1\) with \(2\le l\le N\), we obtain

$$\begin{aligned} \max _{2\le n\le N}\Vert \omega ^n\Vert ^2\le \Vert \omega ^1\Vert ^2+\Vert s^1\Vert ^2_1+2\tau \sum _{n=0}^{l-1}\big (\Vert T^n_1\Vert ^2+\Vert T^n_2\Vert _1^2\big ). \end{aligned}$$
(5.13)

For estimation of the terms \(\omega ^1\) and \(s^1\), we note that

$$ \begin{aligned} s^1=\tau \omega ^\frac{1}{2}=\frac{\tau }{2}\omega ^1 \;\; \& \;\; q^\frac{1}{2}=\frac{q^1}{2}=\frac{\omega ^1}{\tau }-\alpha ^0. \end{aligned}$$

Now, putting \( n=0 \) in the error Eq. (5.6) and using the above identities, we have

$$\begin{aligned}&\frac{2}{\tau ^2}(\omega ^1,\psi )+\frac{1}{\tau }\mathcal{B}_{\sigma h}(\omega ^1,\psi )+\frac{1}{2}\mathcal{B}_{\delta h}(\omega ^1,\psi )+\frac{1}{\tau }\mathcal{A}_{\epsilon h}(\omega ^1,\psi )+\frac{1}{\tau }\mathcal{A}_{\beta h}(s^1,\psi )\nonumber \\&\quad =(\rho ^0,\psi )+\frac{2}{\tau }(\alpha ^0,\psi )+\mathcal{B}_{\sigma h}(\alpha ^0,\psi )+\mathcal{A}_{\epsilon h}(\alpha ^0,\psi )\;\;\forall \psi \in V_h. \end{aligned}$$
(5.14)

Substituting \( \psi =\omega ^1=\frac{2}{\tau }s^1 \) in (5.14) and using the coercivity of the operators \( \mathcal{B} \) and \( \mathcal{A} \), we obtain

$$\begin{aligned} \Vert \omega ^1\Vert ^2+\Vert s^1\Vert _1^2\le \frac{\tau ^2}{2}(\rho ^0,\omega ^1)+\tau (\alpha ^0,\omega ^1)+\frac{\tau ^2}{2}\mathcal{B}_{\sigma h}(\alpha ^0,\omega ^1)+\tau \mathcal{A}_{\epsilon h}(\alpha ^0,s^1). \end{aligned}$$

Next, we use the Cauchy–Schwarz and Young inequalities to obtain

$$\begin{aligned} \Vert \omega ^1\Vert ^2+\Vert s^1\Vert _1^2\le & {} \frac{\tau ^4}{4}\Vert \rho ^0\Vert ^2+\kappa _1\Vert \omega ^1\Vert ^2+\Big (\tau ^2+\frac{\tau ^4}{4}\Big )\Vert \alpha ^0\Vert ^2\\&+\kappa _2\Vert \omega ^1\Vert ^2+\tau ^2\Vert \alpha ^0\Vert _1^2+\kappa _3\Vert s^1\Vert _1^2. \end{aligned}$$

Finally, choosing \( \kappa _i>0 \) appropriately leads us to

$$\begin{aligned} \Vert \omega ^1\Vert ^2+\Vert s^1\Vert _1^2\le C\Big (\tau ^4\Vert \rho ^0\Vert ^2+\tau ^2\Vert \alpha ^0\Vert _1^2\Big ). \end{aligned}$$
(5.15)

Combining (5.13) and (5.15), we have

$$\begin{aligned} \max _{1\le n\le N}\Vert \omega ^n\Vert ^2\le C\Big (\tau ^4\Vert \rho ^0\Vert ^2+\tau ^2\Vert \alpha ^0\Vert _1^2+2\tau \sum _{n=0}^{l-1}\big (\Vert T^n_1\Vert ^2+\Vert T^n_2\Vert _1^2\big )\Big ). \end{aligned}$$
(5.16)

Now, we estimate the terms \(T_1^n\) and \(T_2^n\). For \(T_1^n\), the triangle inequality and the Cauchy–Schwarz inequality give

$$\begin{aligned} \Vert T_1^n\Vert ^2\le & {} C\Big (\frac{\tau ^2}{4}\Vert \rho ^n\Vert ^2+\tau ^2\Vert \sum _{k=0}^{n-1}\rho ^k\Vert ^2+\Vert \alpha ^n\Vert ^2\Big )\\\le & {} C\Big (\frac{\tau ^2}{4}\Vert \rho ^n\Vert ^2+\tau ^2N\sum _{k=0}^{n-1}\Vert \rho ^k\Vert ^2+\Vert \alpha ^n\Vert ^2\Big )\\\le & {} C\Big (\frac{\tau ^2}{4}\Vert \rho ^n\Vert ^2+\tau \sum _{k=0}^{n-1}\Vert \rho ^k\Vert ^2+\Vert \alpha ^n\Vert ^2\Big ). \end{aligned}$$
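The passage from the first to the third line above can be spelled out as follows. This is a sketch assuming that the uniform partition satisfies \(N\tau =T\) (notation not restated in this section), so that the factor \(\tau ^2N=T\tau \) is absorbed into the generic constant C:

$$\begin{aligned} \Big \Vert \sum _{k=0}^{n-1}\rho ^k\Big \Vert ^2\le \Big (\sum _{k=0}^{n-1}\Vert \rho ^k\Vert \Big )^2\le n\sum _{k=0}^{n-1}\Vert \rho ^k\Vert ^2\le N\sum _{k=0}^{n-1}\Vert \rho ^k\Vert ^2,\qquad \tau ^2N=T\tau . \end{aligned}$$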

Then, using Lemma 5.2, we obtain

$$\begin{aligned} \Vert T_1^n\Vert ^2\le & {} C\Bigg (\tau ^5\int _{t_n}^{t_{n+1}}\Vert u_{h}^{\prime \prime \prime \prime }\Vert ^2dt+\tau ^4\int _{0}^{T}\Vert u_{h}^{\prime \prime \prime \prime }\Vert ^2dt\nonumber \\&\;\;\;\;\;+\tau ^3\int _{t_n}^{t_{n+1}}\Vert u_{h}^{\prime \prime \prime }\Vert ^2dt\Bigg ). \end{aligned}$$
(5.17)

The following estimate for \(T_2^n\) is obtained by the same technique as used for \(T_1^n\):

$$\begin{aligned} \Vert T_2^n\Vert _1^2\le C\Bigg (\tau ^5\int _{t_n}^{t_{n+1}}\Vert u_{h}^{\prime \prime \prime }\Vert _1^2dt+\tau ^4\int _{0}^{T}\Vert u_{h}^{\prime \prime \prime }\Vert _1^2dt\Bigg ). \end{aligned}$$
(5.18)

Finally, using (5.17)–(5.18) in (5.16), we obtain

$$\begin{aligned} \max _{1\le n\le N}\Vert \omega ^n\Vert ^2\le C\tau ^4\Bigg (\int _{0}^{T}\Vert u_{h}^{\prime \prime \prime \prime }\Vert ^2dt+\int _{0}^{T}\Vert u_{h}^{\prime \prime \prime }\Vert _1^2dt\Bigg ).\;\;\; \end{aligned}$$

\(\square \)

Now, we are in a position to state the main result of this section.

Theorem 5.1

Let u and \(U^n\) be the solutions of the interface problem (1.2)–(1.4) and the finite element approximation (5.1)–(5.2), respectively. Assume that \(u_0\in H^6(\Omega )\cap H_0^1(\Omega )\), \(v_0\in H^6(\Omega )\cap H_0^1(\Omega )\) and \(f\in H^3(J;H^2(\Omega ))\). Then we have

$$\begin{aligned} \max _{0\le n\le N}\Vert u^n-U^n\Vert \le C(u)\Bigg (\bigg (h+\sqrt{\lambda }+\frac{\lambda }{h}\bigg )^2+\tau ^2\Bigg ), \end{aligned}$$

where \(C(u)=C\bigg \{\Vert u_0\Vert ^2_{H^6(\Omega )}+\Vert v_0\Vert ^2_{H^6(\Omega )}+\Vert u\Vert ^2_{H^2(\mathcal{Y})}\bigg \}^{\frac{1}{2}}.\)

Proof

Applying the triangle inequality to

$$\begin{aligned} u^n-U^n=u^n-u_h^n+u_h^n-U^n, \end{aligned}$$

followed by estimates (4.5)–(4.6), Theorem 4.2 and Lemma 5.3 leads to the desired result. \(\square \)
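To read the bound of Theorem 5.1 as an optimal-order estimate, suppose, as an assumption made only for this illustration, that \(\lambda \le ch^2\) for some constant \(c>0\). Then \(\sqrt{\lambda }\le \sqrt{c}\,h\) and \(\lambda /h\le ch\), so that

$$\begin{aligned} h+\sqrt{\lambda }+\frac{\lambda }{h}\le (1+\sqrt{c}+c)\,h\quad \Longrightarrow \quad \max _{0\le n\le N}\Vert u^n-U^n\Vert \le C(u)\Big ((1+\sqrt{c}+c)^2h^2+\tau ^2\Big ). \end{aligned}$$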

Remark 5.1

(a) The proposed fully discrete finite element scheme can be easily extended for the numerical approximation of the solutions to the following IBVP

$$\begin{aligned} u^{\prime \prime }+\sigma u^\prime +\delta u-\nabla \cdot (\epsilon \nabla u^\prime +\beta \nabla u)=f\;\;\text{ in }\;\Omega \times (0, T], \end{aligned}$$
(5.19)

coupled with the following jump conditions

$$\begin{aligned}{}[u]=0,\;\;\Biggl [\epsilon (x){\partial u^\prime \over {\partial \mathbf{n}}}+\beta (x){\partial u \over {\partial \mathbf{n}}}\Biggr ] =g\;\;\;\text{ along }\; \Gamma \times [0,T]. \end{aligned}$$
(5.20)

For numerical validation, we refer to Examples 6.1 and 6.2.

(b) In developing numerical methods for interface problems, a higher order of convergence is always one of the major research goals, because high order methods are more accurate and cost-efficient. The present analysis provides scope for generalizing these results to higher order finite element methods by combining the theory in this work with the analysis in [17]. A higher order finite element approximation and its convergence are illustrated in Example 6.3.
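The time-stepping idea behind Remark 5.1(a) can be illustrated on a scalar analogue of (5.19). The following is a minimal sketch, assuming that the spatial terms are dropped and \(f=0\), so it is not the scheme (5.1)–(5.2) itself: the equation \(u''+\sigma u'+\delta u=0\) is rewritten as the first-order system \(u'=q\), \(q'=-\sigma q-\delta u\) and advanced by the Crank–Nicolson rule. The function name `crank_nicolson_scalar` and the test values \(\sigma =\delta =2\) (exact solution \(e^{-t}\cos t\)) are illustrative choices.

```python
import math

def crank_nicolson_scalar(sigma, delta, u0, v0, T, n_steps):
    """Approximate u(T) for u'' + sigma*u' + delta*u = 0, u(0)=u0, u'(0)=v0."""
    a = 0.5 * T / n_steps  # half of the time step tau
    u, q = u0, v0
    # One step solves (I - a*M) y_new = (I + a*M) y_old,
    # with y = (u, q) and M = [[0, 1], [-delta, -sigma]].
    det = 1.0 + a * sigma + a * a * delta
    for _ in range(n_steps):
        r1 = u + a * q
        r2 = -a * delta * u + (1.0 - a * sigma) * q
        # Explicit inverse of the 2x2 matrix [[1, -a], [a*delta, 1 + a*sigma]].
        u_new = ((1.0 + a * sigma) * r1 + a * r2) / det
        q_new = (r2 - a * delta * r1) / det
        u, q = u_new, q_new
    return u

# Test problem: sigma = delta = 2 gives u(t) = exp(-t) * cos(t).
exact = math.exp(-1.0) * math.cos(1.0)
errors = [abs(crank_nicolson_scalar(2.0, 2.0, 1.0, -1.0, 1.0, n) - exact)
          for n in (10, 20, 40)]
ratios = [errors[i] / errors[i + 1] for i in range(2)]
print(errors, ratios)
```

Halving the step should reduce the error by a factor close to 4, which is the scalar counterpart of the \(\tau ^2\) term in Theorem 5.1.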

6 Numerical Results

In this section, we present some numerical experiments to validate the theoretical findings presented in the previous section. To illustrate the flexibility of the method, different forms of interfaces along with a large scale of variation in the physical coefficients are considered. The nodes of the triangulations of \(\Omega _1\) and \(\Omega _2\) coincide on the interface \(\Gamma \) as stated in Sect. 2. All the numerical computations are done in the time interval \(J=(0,1]\).

Our main emphasis here is to understand the behavior of the true errors estimated in Theorem 5.1 on uniform meshes with uniform time steps. For each quantity of interest, we observe its experimental order of convergence (EOC). For a given finite sequence of successive runs (indexed by i), let

  • \( e(i)=\) the error in the \( L^2 \)-norm or \( H^1 \)-norm on run i, and

  • \( h(i)= \) the mesh size of run i.

Then the experimental order of convergence (EOC) is computed by

$$\begin{aligned} EOC(i+1)=\frac{\log (e(i+1)/e(i))}{\log (h(i+1)/h(i))}. \end{aligned}$$
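The EOC formula can be evaluated as in the following minimal sketch. The function name `eoc` is hypothetical, and the data are synthetic errors of the form \(e=3h^2\), not values from the tables of this paper; they are chosen so that the computed rate is 2 by construction.

```python
import math

def eoc(errors, mesh_sizes):
    """Return the list of EOC(i+1) values for successive runs i = 0, 1, ..."""
    rates = []
    for i in range(len(errors) - 1):
        rate = (math.log(errors[i + 1] / errors[i])
                / math.log(mesh_sizes[i + 1] / mesh_sizes[i]))
        rates.append(rate)
    return rates

# Synthetic data (not from the paper): errors behaving exactly like 3*h^2.
h = [0.4, 0.2, 0.1, 0.05]
e = [3.0 * hi ** 2 for hi in h]
rates = eoc(e, h)
print(rates)
```

The same function applies unchanged when the mesh is not halved exactly between runs, since the formula uses the actual ratio \(h(i+1)/h(i)\).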

Example 6.1

For our first numerical experiment, we consider a square domain \(\Omega =(-1,1)\times (-1,1)\), where interface \(\Gamma \) is a circle centered at (0, 0) with radius 0.5. We select the data in (5.19)-(5.20) such that the exact solution u is given by

$$\begin{aligned} u(x,y,t)=\left\{ \begin{array}{ll} (r_0^2-r^2)t^2 &{} \hbox {if}\quad r\le r_0,\\ (r_0^2-r^2)t\sin (\pi x)\sin (\pi y) &{} \hbox {if}\quad r>r_0, \end{array} \right. \end{aligned}$$

where \( r^2=x^2+y^2 \) and \( r_0=0.5 \).
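The exact solution above can be coded directly. The sketch below (the function name `exact_u` is hypothetical) also illustrates why the jump condition \([u]=0\) in (5.20) holds: both branches carry the factor \(r_0^2-r^2\), which vanishes on \(\Gamma \).

```python
import math

R0 = 0.5  # interface radius r_0

def exact_u(x, y, t):
    """Exact solution of Example 6.1 at the point (x, y) and time t."""
    r2 = x * x + y * y
    if r2 <= R0 * R0:
        return (R0 * R0 - r2) * t * t
    return (R0 * R0 - r2) * t * math.sin(math.pi * x) * math.sin(math.pi * y)

# Sample the interface r = r_0: the values should vanish up to rounding.
interface_values = [
    exact_u(R0 * math.cos(th), R0 * math.sin(th), 1.0)
    for th in (0.0, 0.7, 1.9, 3.1, 4.4, 5.8)
]
print(max(abs(v) for v in interface_values))
```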

Fig. 3 Exact solution (left) and triangulation (right) of \( \Omega \) with \( h=0.305091\) (Test Example 6.1)

Table 1 Parameters used in computation (see, Xu et al. [39])

In Fig. 3 we show the exact solution and the triangulation of the domain \(\Omega \) with mesh size \(h=0.305091\). In our numerical convergence test, we choose two different sets of physical coefficients borrowed from Xu et al. [39] that correspond to two different forms of the bio heat transfer model. Following [39], the physical parameters employed in the computation are listed in Table 1. Dual-phase-lag (DPL) bio heat transfer is characterized by the thermal relaxation time \(\tau _q\) and the phase lag for the temperature gradient \(\tau _T\). Vedavarz et al. [37] found that \(\tau _q\) for some biological tissues lies in the range of 1–100 s at room temperature. Following the paper by Mitra et al. [22], we take the thermal lag time (\(\tau _q\)) and phase lag time (\(\tau _T\)) as 16 s and 0.043 s, respectively. Then using Table 1, we have the first set of physical coefficients for the DPL bio heat model:

$$\begin{aligned} ( \sigma ,\delta ,\epsilon , \beta )&=\Big ( \frac{\tau _qw_b\rho _bc_b+\rho c}{\tau _q\rho c}, \frac{w_b\rho _bc_b}{\tau _q\rho c}, \frac{\tau _T\kappa }{\tau _q\rho c}, \frac{\kappa }{\tau _q\rho c}\big )\\&= {\left\{ \begin{array}{ll} \big (0.4325, 0.0231,1.1696\times 10^{-10} , 2.7199 \times 10^{-9}\big ) \,\, \hbox {if}\quad r\le r_0,\\ \big (0.6050, 0.0339, 1.2083\times 10^{-7}, 7.5520 \times 10^{-9} \big )\,\,\; \hbox {if}\quad r>r_0.\end{array}\right. } \end{aligned}$$
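The formulas for \((\sigma ,\delta ,\epsilon ,\beta )\) in the first line above translate into a few lines of code. The following is a minimal sketch; the function name `dpl_coefficients` is hypothetical, and the input values are placeholders chosen only for illustration, not the entries of Table 1.

```python
def dpl_coefficients(tau_q, tau_T, rho, c, kappa, w_b, rho_b, c_b):
    """Return (sigma, delta, epsilon, beta) from the DPL tissue parameters."""
    denom = tau_q * rho * c
    sigma = (tau_q * w_b * rho_b * c_b + rho * c) / denom
    delta = w_b * rho_b * c_b / denom
    epsilon = tau_T * kappa / denom
    beta = kappa / denom
    return sigma, delta, epsilon, beta

# Placeholder inputs (NOT the Table 1 values); only tau_q and tau_T
# are the values quoted in the text.
sigma, delta, epsilon, beta = dpl_coefficients(
    tau_q=16.0, tau_T=0.043,
    rho=1000.0, c=4000.0, kappa=0.5,
    w_b=0.001, rho_b=1000.0, c_b=4000.0,
)
print(sigma, delta, epsilon, beta)
```

Two structural relations follow from the formulas regardless of the inputs: \(\epsilon =\tau _T\beta \) and \(\sigma =\tau _q\delta +1/\tau _q\). With \(\tau _q=16\), both reported coefficient sets satisfy the second relation up to rounding.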

In the absence of phase lag time (\( \tau _T \)), Eq. (1.1) reduces to the thermal wave model of bio heat transfer [10, 11]. The second set of physical coefficients that corresponds to the thermal wave model of bio heat transfer is given by

$$\begin{aligned} ( \sigma ,\delta ,\epsilon , \beta )= {\left\{ \begin{array}{ll} \big (0.4325, 0.0231,0, 2.7199 \times 10^{-9}\big ) \,\, \hbox {if}\quad r\le r_0,\\ \big (0.6050, 0.0339,0, 7.5520 \times 10^{-9} \big )\,\, \hbox {if}\quad r>r_0.\end{array}\right. } \end{aligned}$$
Table 2 Example 6.1. EOC for \(\tau _T\ne 0\) at \(t=1\) and \(\tau =10^{-3}\)
Table 3 Example 6.1. EOC for \(\tau _T=0\) at \(t=1\) and \(\tau =10^{-3}\)

Tables 2 and 3 report the numerical solution errors and convergence rates at time \(t=1\) in both \(L^2\) and \(H^1\) norms for \(\tau _T\ne 0\) (DPL bio heat transfer) and \(\tau _T=0\) (thermal wave bio heat transfer), respectively. In both cases, we choose the uniform time step size \(\tau =10^{-3}\). Figure 4 demonstrates second order convergence in the \(L^2\) norm and first order convergence in the \(H^1\) norm. Note that the second set of physical coefficients is chosen to emphasize that our numerical scheme remains consistent for the thermal wave model of bio heat transfer, as shown in Table 3.

Fig. 4 Log-log plot of the \(L^2\) norm and \(H^1\) norm versus the mesh size at time \(t=1\) in Example 6.1

Fig. 5 Exact solution (left) and triangulation (right) of \(\Omega \) with \(h=0.286172\) (Test Example 6.2)

Example 6.2

For our second numerical example, we consider the interface to be a curve given by \( y=x^2 \) in the computational domain \( \Omega =(-1,1)\times (-1,1) \). We select the data appearing in (5.19)–(5.20) setting the exact solution as

$$\begin{aligned} u(x,y,t)=\left\{ \begin{array}{ll} 0.25\exp (t)(y-x^2)\sin (\pi x)\sin (\pi y) &{} \hbox {if}\quad y\le x^2,\\ -5t^2(y-x^2)(y-1) &{} \hbox {if}\quad y>x^2. \end{array} \right. \end{aligned}$$

With the development of high-power short impulse lasers, the use of the dual-phase-lag (DPL) model has become common in the study of heat transport in metallic films during ultrafast laser heating [28, 34]. The phase-lag time varies for different materials and it may take values in the range of \(10^{-3}s-10^3s\) for heterogeneous materials (cf. [21]). To mark the significance of our model problem, we choose the physical coefficients from the paper by Tzou et al. [32]

$$\begin{aligned} ( \sigma ,\delta ,\epsilon , \beta )&=\Big ( \frac{C_E^2}{\alpha _E}, 0,\alpha _e, C_E^2 \Big )\\ {}&= {\left\{ \begin{array}{ll} \big ( 1.2\times 10^{12},0, 1.2\times 10^{-4} , 1.44\times 10^{8} \big ) \,\, \hbox {if} \quad y\le x^2,\\ \big ( 1.2\times 10^{12},0, 1.6\times 10^{-4}, 1.96 \times 10^{8} \big )\,\,\; \hbox {if}\quad y>x^2.\end{array}\right. } \end{aligned}$$
Table 4 Example 6.2. EOC at \(t=10^{-12}\) and \(\tau =10^{-14}\)
Fig. 6 Log-log plot of the \( L^2 \) norm and \( H^1 \) norm versus the mesh size at time \( t=10^{-12} \) in Example 6.2

Here \(C_E\) represents the equivalent thermal wave speed, \(\alpha _E\) denotes the equivalent thermal diffusivity and \(\alpha _e\) is the electron thermal diffusivity of the material. In Fig. 5, we show the exact solution and the triangulation of the domain \(\Omega \) with mesh size \(h=0.285956\). The numerical solution errors and convergence rates in both \( L^2 \) and \( H^1 \) norms at final time \(t=10^{-12}\) are listed in Table 4. The final time is taken on the picosecond (ps) scale, as the thermal lagging model describes picosecond heat transport in metal films (cf. [32, 34]). It is clear from Fig. 6 that we have achieved the optimal order of convergence in both \(L^2\) and \(H^1\) norms, which confirms the theoretical prediction of Theorem 5.1.

Fig. 7 Exact solution (left) and triangulation (right) of \( \Omega \) with \( h=0.2969850 \) (Test Example 6.3)

Example 6.3

For our final numerical example, the computational domain \( \Omega =(-1,1)\times (-1,1) \) is divided into four subdomains \( \Omega _i \), \( i=1,2,3,4 \) using the interface \(\Gamma :=\{(x,y)\in \Omega : xy=0\}\). We select the data appearing in (5.19)–(5.20) setting the exact solution as

$$\begin{aligned} u(x,y,t)=\left\{ \begin{array}{ll} -0.5x^2\sin (\pi x)\sin (\pi y^2) &{} \hbox {if}\quad (x,y) \in \Omega _{1},\\ 0.5ty^2\sin (\pi x^2)\sin (\pi y) &{} \hbox {if}\quad (x,y) \in \Omega _{2},\\ 0.5\sin (t)x^2\sin (\pi x)\sin (\pi y^2) &{} \hbox {if}\quad (x,y) \in \Omega _{3},\\ -0.5y^2\sin (\pi x^2)\sin (\pi y) &{} \hbox {if}\quad (x,y) \in \Omega _{4}. \end{array} \right. \end{aligned}$$

In Fig. 7, we show the exact solution and the triangulation of the domain \(\Omega \) with mesh size \(h=0.2969850\). Equation (1.2) also represents the linearization of Westervelt’s equation, the classical model for nonlinear ultrasound propagation through thermoviscous fluids [25]. Following [3, 25], in each subdomain we use different material parameters for the physical coefficients, as given in Table 5.

Table 5 Parameters used in computation for Example 6.3 (cf. [3, 25])
Table 6 Example 6.3. EOC at \(t=1\) and \(\tau =10^{-2}\) for \(\mathbb {P}_1\) elements
Table 7 Example 6.3. EOC at \(t=1\) and \(\tau =10^{-2}\) for \(\mathbb {P}_2\) elements
Table 8 Example 6.3. EOC at \(t=1\) and \(\tau =10^{-2}\) for \(\mathbb {P}_3\) elements

Tables 6, 7 and 8 report the numerical solution errors and convergence rates at time \( t=1 \) in both \(L^2\) and \(H^1\) norms for \( \mathbb {P}_1 \), \( \mathbb {P}_2 \) and \( \mathbb {P}_3 \) finite elements, respectively. In all cases, we choose the uniform time step size \(\tau =10^{-2}\). Note that the finite element spaces \( \mathbb {P}_2\) and \( \mathbb {P}_3 \) are chosen to emphasize that our numerical scheme remains consistent for higher order finite element spaces under the assumptions \( \lambda =\text{ O }(h^3) \) and \( \lambda =\text{ O }(h^4) \), respectively. It is clear from Fig. 8 that we have achieved the optimal order of convergence in both \(L^2\) and \(H^1\) norms, which corroborates our theoretical findings.

Fig. 8 Log-log plot of the \( L^2 \) norm and \( H^1 \) norm versus the mesh size at time \( t=1 \) in Example 6.3

7 Conclusion

Time-dependent interface problems are frequently encountered in scientific computing and many applied sciences. The typical mathematical models are heat or wave type equations with discontinuous coefficients, which arise when the physical processes involve two or more materials or media with non-identical properties. In this article, we have presented a finite element analysis for general linear hyperbolic equations with interfaces. The spatial discretization uses piecewise linear finite elements, and in time we have applied the Crank–Nicolson scheme after rewriting the governing equation as a first-order system in time. We have established second order convergence in time and optimal order convergence in space with respect to the \(L^{\infty }(L^2)\)-norm. The present analysis provides scope for generalizing these results to higher order finite element methods under the assumption that \(\lambda =\text{ O }(h^{2p})\) and that the solution belongs to \(L^{\infty }(L^2(\Omega )\cap H^{p+1}(\Omega _1\cup \Omega _2)),\) where \(p\) is the order of the approximating polynomial spaces (see [17] for elliptic type problems with interfaces). Further, we believe that the present work can be easily extended to the following linearized Westervelt equation with variable coefficients (cf. [25])

$$\begin{aligned} \gamma (x,t) u^{\prime \prime }+\sigma (x,t) u^\prime -\nabla \cdot (\epsilon \nabla u^\prime +\beta \nabla u)=f(x,t)\;\;\text{ in }\;\Omega \times (0, T], \end{aligned}$$
(7.1)

where \(\Omega \subset \mathbb {R}^3\) is a bounded and convex domain with \(C^2\) smooth interface \(\Gamma \). It is worth noting that only a semidiscrete error analysis has been discussed in Nikolić et al. [25] for non-interface problems.

Future work will focus on extending this theory to Westervelt’s quasi-linear acoustic wave equation

$$\begin{aligned} c^{-2}u^{\prime \prime }-\nabla \cdot \big (\nabla u(x,t)+\beta (x)\nabla {u^{\prime }}\big )=\gamma (u^2)^{\prime \prime }\;\;\text{ in }\;(0,T]\times \Omega , \end{aligned}$$
(7.2)

with interfaces. Equation (7.2) with interfaces is motivated by lithotripsy, where a silicone acoustic lens focuses the ultrasound traveling through a nonlinearly acoustic fluid onto a kidney stone (cf. [24]). In [24], the authors have investigated interface coupling problems involving the Westervelt equation for different types of boundary conditions. Currently, we are working on extending the work of Nikolić et al. [25] to interface problems.