1 Introduction

The problem of determining a source term in parabolic equations from some observations plays an important role in practice [4, 9, 10]. Because of its importance, many researchers have devoted their attention to it [1,2,3, 5, 7, 8, 12, 14, 17, 18, 22, 24]. To state the problem precisely, let \(\varOmega \) be a bounded domain in \(\mathbb R^n\) with boundary \(\partial \varOmega \). Denote the cylinder \( Q := \varOmega \times (0,T]\), where \(T>0\), and \(S := \partial \varOmega \times (0,T]\). Let

$$\begin{aligned}&a_{ij},\ b \in L^{\infty }(Q),\quad i,j \in \{1,2,\ldots ,n\},\end{aligned}$$
(1)
$$\begin{aligned}&a_{ij} = a_{ji}, \quad i,j \in \{1,2,\ldots ,n\}, \end{aligned}$$
(2)
$$\begin{aligned}&\lambda \Vert \xi \Vert _{\mathbb {R}^{n}}^{2} \le \sum _{i,j=1}^{n}a_{ij}(x,t)\xi _{i}\xi _{j} \le \varLambda \Vert \xi \Vert _{\mathbb {R}^{n}}^{2}, \quad \forall \xi \in \mathbb {R}^{n},\end{aligned}$$
(3)
$$\begin{aligned}&0 \le b(x,t) \le \mu _{1} \quad \text {a.e. in}~Q,\end{aligned}$$
(4)
$$\begin{aligned}&v \in L^{2}(\varOmega ), \quad F \in L^2(Q), \end{aligned}$$
(5)
$$\begin{aligned}&\lambda ,\ \varLambda ~\text {positive constants},\quad \mu _{1} \ge 0. \end{aligned}$$
(6)

Consider the initial boundary value problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial u}{\partial t}-\sum \limits _{i,j=1}^n\frac{\partial }{\partial x_i}\left( a_{ij}(x,t)\frac{\partial u}{\partial x_j}\right) + b(x,t)u =F(x,t), \quad (x,t)\in Q,\\ u(x,t) =0, \quad (x,t)\in S,\\ u(x,0) =v(x), \quad x\in \varOmega . \end{array}\right. } \end{aligned}$$
(7)

Let F have one of the following forms:

$$\begin{aligned} F(x,t)&=f(x,t)\varphi (x,t)+g(x,t),\end{aligned}$$
(8)
$$\begin{aligned} F(x,t)&=f(x)\varphi (x,t)+g(x,t),\end{aligned}$$
(9)
$$\begin{aligned} F(x,t)&=f(t)\varphi (x,t)+g(x,t) \end{aligned}$$
(10)

where \(\varphi (x,t)\in L^2(Q)\) and \(g(x,t)\in L^2(Q)\) are given.

We consider the problem of determining f from N integral observations of the solution u

$$\begin{aligned} l_iu = \int _\varOmega \omega _i(x) u(x,t) dx = z_i(t), \quad t \in (0,T), \ i=1,\dots ,N \end{aligned}$$
(11)

where the weight functions \(\omega _i(x) \in L^\infty (\varOmega )\) are nonnegative almost everywhere and satisfy \(\int _\varOmega \omega _i(x)dx > 0\). Suppose that \(z_i,\ i=1,2,\dots ,N\), are approximately given by \(z_i^\delta \) satisfying

$$\begin{aligned} \Vert z_i-z_i^\delta \Vert _{L^2(0,T)}\le \delta . \end{aligned}$$
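In the numerical tests of Section 4, such noisy data are generated synthetically. A minimal sketch of this step (in Python, with the hypothetical helper name add_noise, assuming each \(z_i\) is sampled on a uniform grid of (0,T)):

```python
import numpy as np

def add_noise(z, delta, T, rng=np.random.default_rng(0)):
    """Return z_delta with ||z - z_delta||_{L^2(0,T)} = delta, where z holds
    samples of an exact observation on a uniform time grid."""
    e = rng.standard_normal(z.shape)
    dt = T / len(z)
    # rescale so the discrete L^2(0,T) norm of the perturbation equals delta
    e *= delta / np.sqrt(dt * np.sum(e**2))
    return z + e
```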

These inverse problems may have many solutions, especially in the case where f depends on both x and t.

Indeed, suppose that the coefficients of (7) are sufficiently smooth. If \(\varphi (x,t)\not =0\) and u(x,t) is given for all \((x,t)\in Q\), the inverse problem has the unique solution

$$\begin{aligned} f(x,t)=\frac{\frac{\partial u}{\partial t}-\sum \limits _{i,j=1}^n\frac{\partial }{\partial x_i}\big (a_{ij}(x,t)\frac{\partial u}{\partial x_j}\big )+ b(x,t)u -g(x,t)}{\varphi (x,t)}. \end{aligned}$$

We now show that if there is a u satisfying (11), then there are infinitely many \(u\in C^\infty (Q)\) with \(u|_S=0\) satisfying (11). Indeed, consider the functions \(v\in C^\infty (\varOmega )\) satisfying the equation

$$\begin{aligned} \langle \omega _i,v\rangle _{L^2(\varOmega )}=\int _\varOmega \omega _i(x) v(x) dx = 0, \ i=1,2,\dots , N. \end{aligned}$$
(12)

Denote \(\mathcal P=\text {span}\{\omega _1, \omega _2,\dots ,\omega _N\}\). Then \(\mathcal P\) is a subspace of \(L^2(\varOmega )\) with \(\dim \mathcal P\le N\), so \(\mathcal Q=\mathcal P^\perp \) is an infinite-dimensional space. Moreover, every v admits the decomposition \(v=v^1+v^2\), where \(v^1\in \mathcal {P}\), \(v^2\in \mathcal Q\) and \(\int _\varOmega v^1(x) v^2(x) dx = 0\). It follows that there are infinitely many functions \(v\in C^\infty (\varOmega )\) satisfying equation (12). Consequently, there are infinitely many functions \(u(\cdot ,t)\in C^\infty (\varOmega )\) satisfying the equation

$$\begin{aligned} \int _\varOmega \omega _i(x)u(x,t) dx = 0, \ i=1,2,\dots , N. \end{aligned}$$

Adding any such function to a u satisfying (11) yields another function satisfying (11). We conclude that the inverse problem of finding f from (11) may have infinitely many solutions. Therefore, we have to introduce an appropriate notion of solution.

This paper is organized as follows. In Section 2 we describe the variational method with the splitting finite difference scheme for solving the inverse problem. In Section 3 we present the discretized variational problem and the conjugate gradient method. Finally, in Section 4 we test the proposed algorithms on some concrete examples.

2 Variational Problem

To introduce the concept of weak solution, we use the standard Sobolev spaces \(H^1(\varOmega )\), \(H^1_0(\varOmega )\), \(H^{1,0}(Q)\) and \(H^{1,1}(Q)\) [11, 21, 23]. Further, for a Banach space B, we define

$$ L^2(0,T;B) = \{u: u(t) \in B ~\text { a.e.~} t \in (0,T)~ \text {and~} \Vert u\Vert _{L^2(0,T;B)} < \infty \}, $$

with the norm

$$ \Vert u\Vert ^2_{L^2(0,T;B)} = \int _0^T\Vert u(t)\Vert _B^2dt. $$

In the sequel, we shall use the space W(0, T) defined as

$$ W(0,T) = \{u: u \in L^2(0,T;H^1_0(\varOmega )), u_t \in L^2(0,T;(H^1_0(\varOmega ))')\}, $$

equipped with the norm

$$ \Vert u\Vert ^2_{W(0,T)} = \Vert u\Vert ^2_{L^2(0,T;H^1_0(\varOmega ))} + \Vert u_t\Vert ^2_{L^2(0,T;(H^1_0(\varOmega ))')}. $$

We note here that \((H^1_0(\varOmega ))' = H^{-1}(\varOmega )\).

The solution of the problem (7) is understood in the weak sense as follows: A weak solution in W(0, T) of the problem (7) is a function \(u(x,t)\in W(0,T)\) satisfying the identity

$$\begin{aligned} \begin{aligned}&\int _0^T(u_t,\eta )_{H^{-1}(\varOmega ),H^1_0(\varOmega )}dt+\int _0^T\int _{\varOmega }\left( \sum _{i,j=1}^na_{ij}(x,t)\frac{\partial u}{\partial x_j}\frac{\partial \eta }{\partial x_i}+b(x,t)u\eta \right) \,dx\,dt \\ =&\int _0^T\int _{\varOmega } (f\varphi \eta +g\eta )\,dx\,dt, \quad \forall \eta \in L^2(0,T;H^1_0(\varOmega )) \end{aligned} \end{aligned}$$
(13)

and

$$\begin{aligned} u(x,0)=v (x), \quad x\in \varOmega . \end{aligned}$$
(14)

Based on the standard hypotheses (1)–(6), the existence and uniqueness of a solution, as well as an a priori estimate for the problem (7), can be established. More precisely, following [23, Chapter IV] and [21, pp. 141–152], there exists a unique solution in W(0, T) of the problem (7). Furthermore, there is a positive constant \(c_d\) independent of \(a_{ij},b,f,\varphi ,g\) and v such that

$$\begin{aligned} \Vert u\Vert _{W(0,T)} \le c_d \big (\Vert f\varphi \Vert _{L^2(Q)} + \Vert g\Vert _{L^2(Q)} + \Vert v\Vert _{L^2(\varOmega )}\big ). \end{aligned}$$

We denote the solution u(x,t) of the problem (7) by u(x,t;f) or u(f) to emphasize its dependence on f. To identify f from (11), we minimize the misfit functional

$$\begin{aligned} J_0(f) = \frac{1}{2} \sum _{i=1}^N\Vert l_iu(f) - z_i\Vert _{L^2(0,T)}^2 \end{aligned}$$
(15)

with respect to f. However, this minimization problem is unstable and may have many minimizers. Therefore, we minimize the Tikhonov functional instead of (15). In fact, we minimize

$$\begin{aligned} J_\gamma (f) =\frac{1}{2} \sum _{i=1}^N\Vert l_iu(f) - z_i\Vert _{L^2(0,T)}^2 + \frac{\gamma }{2} \Vert f- f^*\Vert _{L^2(Q)}^2, \quad f^*\in L^2(Q), \end{aligned}$$
(16)

for the case that F has the form (8);

$$\begin{aligned} J_\gamma (f) = \frac{1}{2}\sum _{i=1}^N\Vert l_iu(f) - z_i\Vert _{L^2(0,T)}^2 + \frac{\gamma }{2} \Vert f- f^*\Vert _{L^2(\varOmega )}^2, \quad f^*\in L^2(\varOmega ), \end{aligned}$$
(17)

for the case that F has the form (9); and

$$\begin{aligned} J_\gamma (f) = \frac{1}{2}\sum _{i=1}^N\Vert l_iu(f) - z_i\Vert _{L^2(0,T)}^2 + \frac{\gamma }{2} \Vert f- f^*\Vert _{L^2(0,T)}^2, \quad f^*\in L^2(0,T), \end{aligned}$$
(18)

for the case that F has the form (10). Here, \(\gamma > 0\) is the Tikhonov regularization parameter and \(f^*\) is an a priori estimate of f. By standard methods, we can prove that \(J_\gamma \) is Fréchet differentiable and derive a formula for its gradient. As \(l_iu(f)\) is affine in f, the functional \(J_\gamma \) is strictly convex. Hence, it attains a unique minimizer, which we call the \(f^*\)-least squares solution to the inverse problem (7), (11). As the inverse problem may have many solutions, we will see that the choice of \(f^*\) is crucial in selecting which of these solutions is reconstructed.

Indeed, introducing the adjoint problem

$$\begin{aligned} {\left\{ \begin{array}{ll} -\frac{\partial p}{\partial t} - \sum \limits _{i,j=1}^n \frac{\partial }{\partial x_j}\left( a_{ij}(x,t) \frac{\partial p}{\partial x_i} \right) + b(x,t) p =\sum \limits _{i=1}^N\omega _i(x) \left( l_iu(t) - z_i(t) \right) , \quad (x,t) \in Q,\\ p(x,t) = 0,\quad (x,t) \in S,\\ p(x,T) = 0, \quad x\in \varOmega , \end{array}\right. } \end{aligned}$$
(19)

we can prove the following results [17, 19].

Theorem 1

The functional \(J_\gamma \) defined in (16) is Fréchet differentiable and its gradient \(\nabla J_\gamma \) at f has the form

$$\begin{aligned} \nabla J_\gamma (f) = \varphi (x,t)p(x,t) + \gamma (f(x,t)- f^*(x,t)), \end{aligned}$$

where p(x,t) is the solution to the adjoint problem (19).

Remark 1

When \(J_\gamma \) has the form in (17) or (18), we have

i)
$$\begin{aligned} \nabla J_\gamma (f) = \int _0^T \varphi (x,t) p(x,t)\,dt+ \gamma (f(x)- f^*(x)) \quad \text {for the functional (17)}; \end{aligned}$$
ii)
$$\begin{aligned} \nabla J_\gamma (f) = \int _\varOmega \varphi (x,t) p(x,t)\,dx+ \gamma (f(t)- f^*(t)) \quad \text {for the functional (18)}. \end{aligned}$$

2.1 Conjugate Gradient Method

To find the minimizer of (16), we use the conjugate gradient method (CG). It proceeds as follows: Assume that at the \(k\)th iteration we have \(f^k\). Then the next iterate is

$$\begin{aligned} f^{k+1}=f^k +\alpha ^k d^k, \end{aligned}$$

with

$$\begin{aligned} d^k = {\left\{ \begin{array}{ll} - \nabla J_\gamma (f^k) &{} \text{ if } k=0, \\ - \nabla J_\gamma (f^k) +\beta ^kd^{k-1} &{} \text{ if } k>0, \end{array}\right. } \end{aligned}$$
$$\begin{aligned} \beta ^k = \frac{\Vert \nabla J_\gamma (f^k) \Vert _{L^2(Q)}^2}{\Vert \nabla J _\gamma (f^{k-1}) \Vert _{L^2(Q)}^2}, \end{aligned}$$

and

$$\begin{aligned} \alpha ^k =\text{ argmin}_{\alpha \ge 0}J_\gamma (f^k + \alpha d^k). \end{aligned}$$

To evaluate \(\alpha ^k\) we denote by \(\bar{u}(v, g)\) the solution to the problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial u}{\partial t} - \sum \limits _{i,j=1}^n \frac{\partial }{\partial x_i}\left( a_{ij} (x,t) \frac{\partial u}{\partial x_j} \right) + b(x,t) u =g(x,t), \quad (x,t) \in Q,\\ u (x,t) = 0, \quad (x,t) \in S,\\ u (x,0) = v(x), \quad x\in \varOmega \end{array}\right. } \end{aligned}$$

with \(\tilde{u}[f]\) being the solution to the linear problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial u}{\partial t} - \sum \limits _{i,j=1}^n \frac{\partial }{\partial x_i}\left( a_{ij} (x,t) \frac{\partial u}{\partial x_j} \right) + b(x,t) u =f(x,t)\varphi (x,t), \quad (x,t) \in Q,\\ u (x,t)= 0,\quad (x,t) \in S,\\ u (x,0) = 0,\quad x\in \varOmega . \end{array}\right. } \end{aligned}$$

In this case, the observation operators have the form

$$\begin{aligned} l_iu(f)= l_i\tilde{u}[f] + l_i\bar{u}(v,g) := A_if + l_i\bar{u}(v,g),\ i=1, \dots ,N \end{aligned}$$
(20)

with \(A_i\) being bounded linear operators from \(L^2(Q)\) into \(L^2(0,T)\). We have

$$\begin{aligned} J_\gamma (f^k+\alpha d^k)&= \sum _{i=1}^N\frac{1}{2} \Vert l_i u(f^k + \alpha d^k) -z_i \Vert _{L^2(0,T)}^2 + \frac{\gamma }{2} \Vert f^k +\alpha d^k -f^* \Vert _{L^2(Q)}^2\\&= \sum _{i=1}^N\frac{1}{2} \Vert \alpha A_i d^k + A_i f^k +l_i\bar{u}(v,g)- z_i \Vert _{L^2(0,T)}^2 + \frac{\gamma }{2} \Vert \alpha d^k +f^k-f^* \Vert _{L^2(Q)}^2\\&=\sum _{i=1}^N\frac{1}{2} \Vert \alpha A_i d^k +l_i u(f^k)- z_i \Vert _{L^2(0,T)}^2 + \frac{\gamma }{2} \Vert \alpha d^k +f^k-f^* \Vert _{L^2(Q)}^2. \end{aligned}$$

Differentiating \(J_\gamma (f^k+\alpha d^k)\) with respect to \(\alpha \) and setting \(\frac{\partial J_\gamma (f^k+\alpha d^k)}{\partial \alpha }=0\), we can solve for the optimal step size explicitly.
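Spelling out the calculation: by (20), \(l_iu(f^k+\alpha d^k)=\alpha A_id^k+l_iu(f^k)\), so the stationarity condition reads

$$\begin{aligned} 0=\sum _{i=1}^N\left\langle A_id^k,\,\alpha A_id^k+l_iu(f^k)-z_i\right\rangle _{L^2(0,T)}+\gamma \left\langle d^k,\,\alpha d^k+f^k-f^*\right\rangle _{L^2(Q)}, \end{aligned}$$

and, since \(A_i^*\) is the adjoint of \(A_i\), Theorem 1 gives \(\left\langle d^k,\nabla J_\gamma (f^k)\right\rangle _{L^2(Q)}=\sum _{i=1}^N\left\langle A_id^k,\,l_iu(f^k)-z_i\right\rangle _{L^2(0,T)}+\gamma \left\langle d^k,\,f^k-f^*\right\rangle _{L^2(Q)}\). Solving the stationarity condition for \(\alpha \) yields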

$$\begin{aligned} \alpha ^k = - \frac{\left\langle d^k, \nabla J_\gamma (f^k)\right\rangle _{L^2(Q)}}{\sum \limits _{i=1}^N\Vert A_i d^k \Vert _{L^2(0,T)}^2 + \gamma \Vert d^k \Vert _{L^2(Q)}^2}. \end{aligned}$$

Since \(d^k = - \nabla J_\gamma (f^k)+\beta ^k d^{k-1} , \ r^k = - \nabla J_\gamma (f^k)\) and \(\left\langle r^k, d^{k-1}\right\rangle _{L^2(Q)} = 0\), we have

$$\begin{aligned} \alpha ^k = \frac{\Vert r^k \Vert _{L^2(Q)}^2}{ \sum \limits _{i=1}^N\Vert A_id^k \Vert _{L^2(0,T)}^2 + \gamma \Vert d^k \Vert _{L^2(Q)}^2}, \ k=0, 1, 2, \dots . \end{aligned}$$

Thus, the CG method takes the following form:

Step 1: Set \(k=0\) and choose an initial guess \(f^0.\)

Step 2: Calculate \(r^0 = - \nabla J_\gamma (f^0)\) and set \( d^0 = r^0\).

Step 3: Evaluate

$$\begin{aligned} \alpha ^0 =\frac{\Vert r^0 \Vert _{L^2(Q)}^2}{ \sum \limits _{i=1}^N \Vert A_i d^0 \Vert _{L^2(0,T)}^2 + \gamma \Vert d^0\Vert _{L^2(Q)}^2}. \end{aligned}$$

Set \(f^1=f^0 + \alpha ^0 d^0.\)

Step 4: For \(k = 1,2, \dots \), calculate

$$\begin{aligned} r^k = - \nabla J_\gamma (f^k), \qquad d^k = r^k + \beta ^k d^{k-1} \end{aligned}$$

with

$$\begin{aligned} \beta ^k = \dfrac{\Vert r^k \Vert _{L^2(Q)}^2}{\Vert r^{k-1}\Vert _{L^2(Q)}^2}. \end{aligned}$$

Step 5: Calculate

$$\begin{aligned} \alpha ^k&= \frac{\Vert r^k \Vert _{L^2(Q)}^2}{ \sum \limits _{i=1}^N\Vert A_i d^k \Vert _{L^2(0,T)}^2 + \gamma \Vert d^k \Vert _{L^2(Q)}^2} . \end{aligned}$$

Update

$$\begin{aligned} f^{k+1} = f^k + \alpha ^k d^k. \end{aligned}$$
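For reference, here is a compact sketch of this iteration in Python. The operator applications are placeholders, not from any library: A_ops[i](f) should apply \(A_i\) (one direct solve with v=0, g=0), A_adj[i](r) should apply \(A_i^*\) (one adjoint solve), and zs[i] should hold the shifted data \(z_i-l_i\bar{u}(v,g)\).

```python
import numpy as np

def cg_tikhonov(A_ops, A_adj, zs, f_star, gamma, f0, n_iter=50):
    """Conjugate gradient iteration for J_gamma as described above.
    Gradient: grad J = sum_i A_i^*(A_i f - zs_i) + gamma*(f - f_star)."""
    def grad(f):
        return sum(A_adj[i](A_ops[i](f) - zs[i]) for i in range(len(zs))) \
               + gamma * (f - f_star)
    f = f0.copy()
    g = grad(f)
    d, r2 = -g, np.vdot(g, g).real
    for _ in range(n_iter):
        Ad = [A_ops[i](d) for i in range(len(zs))]
        # alpha^k = ||r^k||^2 / (sum_i ||A_i d^k||^2 + gamma*||d^k||^2)
        alpha = r2 / (sum(np.vdot(a, a).real for a in Ad)
                      + gamma * np.vdot(d, d).real)
        f = f + alpha * d
        g = grad(f)
        r2_new = np.vdot(g, g).real
        d = -g + (r2_new / r2) * d   # beta^k = ||r^k||^2 / ||r^{k-1}||^2
        r2 = r2_new
    return f
```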

2.2 Singular Values

Set

$$\begin{aligned} \mathcal A=(A_1,A_2,\dots ,A_N),\qquad z=(z_1, z_2,\dots , z_N), \end{aligned}$$

where \(A_i\) is defined in (20). The problem of determining f in (7) (with f of the form (8), (9) or (10)) from (11) can be written in the form \(\mathcal A f=z\), where

$$\begin{aligned} \mathcal A: L^2(Q)\ \bigl (\text {or}~L^2(\varOmega ),~L^2(0,T)\bigr ) \rightarrow \left( L^2(0,T)\right) ^N. \end{aligned}$$

To characterize the degree of ill-posedness of the inverse source problem, we estimate the singular values of \(\mathcal A\), i.e., the square roots of the eigenvalues of \(\mathcal A^*\mathcal A\). In doing so, we proceed as follows.

We present the case where f depends on both the time and space variables, i.e., \(\mathcal A: L^2(Q)\rightarrow (L^2(0,T))^N\). For the operator \(A_i\), we have \(A_i^*\tilde{g}=\varphi (x,t)\tilde{p}(x,t)\), where \(\tilde{g}\in L^2(0,T)\) and \(\tilde{p}(x,t)\) is the solution to the adjoint problem

$$\begin{aligned}{\left\{ \begin{array}{ll} -\frac{\partial \tilde{p}}{\partial t} - \sum \limits _{i,j=1}^n \frac{\partial }{\partial x_j}\left( a_{ij}(x,t) \frac{\partial \tilde{p}}{\partial x_i} \right) + b(x,t) \tilde{p} =\omega _i(x) \tilde{g}, \quad (x,t) \in Q,\\ \tilde{p}(x,t) = 0,\quad (x,t) \in S,\\ \tilde{p}(x,T) = 0, \quad x\in \varOmega . \end{array}\right. } \end{aligned}$$

From (20), we have

$$\begin{aligned} J_0(f) = \frac{1}{2} \sum _{i=1}^N\Vert l_i u(f) - z_i\Vert _{L^2(0,T)}^2 = \frac{1}{2} \sum _{i=1}^N\Vert A_if - (z_i-l_i\bar{u}(v,g))\Vert _{L^2(0,T)}^2. \end{aligned}$$

Hence,

$$\begin{aligned} J'_0(f)=\sum _{i=1}^NA_i^*\left( A_if-\big (z_i-l_i\bar{u}(v,g)\big )\right) . \end{aligned}$$

If we take \(z_i\) such that \(z_i=l_i\bar{u}(v,g)\), then due to Theorem 1, we have \(J'_0(f)=\sum _{i=1}^NA_i^*A_if=\varphi (x,t){p^*}(x,t)\), where \(p^*\) is the solution of the adjoint problem

$$\begin{aligned} {\left\{ \begin{array}{ll} -\frac{\partial {p^*}}{\partial t} - \sum \limits _{i,j=1}^n \frac{\partial }{\partial x_j}\left( a_{ij}(x,t) \frac{\partial {p^*}}{\partial x_i} \right) + b(x,t) {p^*} =\sum \limits _{i=1}^N\omega _i(x) l_i\tilde{u}[f] , \quad (x,t) \in Q,\\ {p^*}(x,t) = 0,\quad (x,t) \in S,\\ {p^*}(x,T) = 0, \quad x\in \varOmega . \end{array}\right. } \end{aligned}$$

Thus, if \(f(x,t)\in L^2(Q)\) is given, we can calculate the value \(J'_0(f)=\sum _{i=1}^NA_i^*A_if=\varphi (x,t)p^*(x,t)\). Although we do not know the explicit form of \(\mathcal A^*\mathcal A\), we can use the Lanczos algorithm [20] to estimate its eigenvalues once the problem is discretized. The algorithm reads as follows:

Initialization: Let \(\beta _0=0\), \(q_0=0\), choose an arbitrary nonzero vector b and calculate \(q_1=\frac{b}{\Vert b\Vert }\).

Put \(Q=q_1\) and \(k=0.\)

Iteration: For \(k=1,2,3,\dots \)

$$\begin{aligned}&p=\mathcal {A}^* \mathcal {A} q_k,\\&\alpha _k=q_k^Tp,\\&p=p-\beta _{k-1}q_{k-1}-\alpha _kq_k,\\&\beta _k=\Vert p\Vert ,\\&q_{k+1}=\frac{p}{\beta _k}. \end{aligned}$$
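A minimal matrix-free sketch of this iteration in Python (our own helper names; apply_AtA must return \(\mathcal A^*\mathcal A q\), i.e., one direct solve followed by one adjoint solve; no reorthogonalization or breakdown check is included):

```python
import numpy as np

def lanczos_eigenvalues(apply_AtA, n, n_steps=30, rng=np.random.default_rng(1)):
    """Estimate eigenvalues of A^*A from n_steps Lanczos steps: the
    eigenvalues of the tridiagonal matrix T_k approximate them."""
    Q = np.zeros((n, n_steps + 1))
    alpha, beta = np.zeros(n_steps), np.zeros(n_steps)
    b = rng.standard_normal(n)
    Q[:, 0] = b / np.linalg.norm(b)
    for k in range(n_steps):
        p = apply_AtA(Q[:, k])
        alpha[k] = Q[:, k] @ p
        p = p - alpha[k] * Q[:, k]
        if k > 0:
            p = p - beta[k - 1] * Q[:, k - 1]
        beta[k] = np.linalg.norm(p)
        Q[:, k + 1] = p / beta[k]
    # eigenvalues of the tridiagonal Lanczos matrix T
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return np.linalg.eigvalsh(T)
```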

We will present some numerical examples showing the efficiency of this algorithm in Section 4.

3 Variational Method for Discretized Problem

In this section, we impose some additional conditions on the domain and the coefficients. We start with Problem (13)–(14). First, we suppose that \(\varOmega \) is the open parallelepiped \((0,L_1)\times (0,L_2) \times \cdots \times (0,L_n)\) in \( \mathbb R^n \). Second, in (7), we suppose that \(a_{ij} = 0\) if \(i\ne j\), and for simplicity from now on we denote \(a_{ii}\) by \(a_i\). Following [15, 16, 25] (see also [6, 19]), we subdivide the domain \(\varOmega \) into small cells by the rectangular uniform grid specified by

$$\begin{aligned} 0 = x_i^0< x_i^1 = h_i< \cdots < x_i^{N_i} = L_i, \ i = 1, \dots ,n \end{aligned}$$

with \(h_i = L_i/N_i\) being the grid size in the \(x_i\)-direction, \(i = 1,\dots , n\). To simplify the notation, we denote by \(x^k := (x_1^{k_1}, \dots , x_n^{k_n}) \), where \(k := (k_1, \dots , k_n)\), \(0\le k_i \le N_i\). We also denote by \(h := (h_1,\dots , h_n)\) the vector of spatial grid sizes and \(\varDelta h := h_1\cdots h_n\). Let \(e_i\) be the unit vector in the \(x_i\)-direction, \(i = 1,\dots , n\), i.e., \(e_1 = (1,0,\dots ,0)\) and so on. Denote by

$$\begin{aligned} \omega (k) = \{x\in \varOmega \,:\, (k_i-0.5)h_i\le x_i\le (k_i+0.5)h_i,\ \forall i = 1,\dots , n\}. \end{aligned}$$

In the following, \(\varOmega _h\) denotes the set of the indices of all interior grid points and \(\bar{\varOmega }_h\) denotes the set of the indices of all grid points belonging to \(\bar{\varOmega }\), i.e.,

$$\begin{aligned} \begin{aligned} \varOmega _h = \{k = (k_1,\dots ,k_n)\,:\, 1\le k_i \le N_i-1, \ \forall i = 1,\dots , n\}. \end{aligned} \end{aligned}$$

We also make use of the following sets

$$\begin{aligned} \varOmega _h^i = \{k = (k_1,\dots ,k_n)\,:\, 0\le k_i \le N_i-1, 1\le k_j \le N_j-1, \forall j \ne i\} \end{aligned}$$

for \(i = 1,\dots , n\). For a function u(x,t) defined in Q, we denote by \(u^k(t)\) its approximate value at \((x^k,t)\). We define the following forward finite difference quotient with respect to \(x_i\)

$$\begin{aligned} u_{x_i}^k := \frac{u^{k+e_i}-u^k}{h_i}. \end{aligned}$$

Now, taking into account the homogeneous boundary condition, we approximate the integrals in (13) as follows

$$\begin{aligned} \int _{Q}\frac{\partial u}{\partial t} \eta \,dx\,dt&\approx \varDelta h\int _{0}^T\sum _{k\in \varOmega _h} \frac{d u^k(t)}{d t}\eta ^k(t)\,dt, \end{aligned}$$
(21)
$$\begin{aligned} \int _{Q}a_{i}(x,t)\frac{\partial u}{\partial x_i} \frac{\partial \eta }{\partial {x_i}}\,dx\,dt&\approx \varDelta h\int _{0}^T \sum _{k\in {\varOmega }_{h}^i}a_i^{k+\frac{e_i}{2}}(t)u_{x_i}^k(t)\eta _{x_i}^k(t)\,dt,\end{aligned}$$
(22)
$$\begin{aligned} \int _{Q}b(x,t)u\eta \,dx\,dt&\approx \varDelta h\int _{0}^T \sum _{k\in \varOmega _h}b^k(t)u^k(t)\eta ^k(t)\,dt,\end{aligned}$$
(23)
$$\begin{aligned} \int _{Q}f(x,t)\varphi (x,t)\eta \,dx\,dt&\approx \varDelta h\int _{0}^T \sum _{k\in \varOmega _h} f^k(t)\varphi ^k(t)\eta ^k(t)\,dt,\end{aligned}$$
(24)
$$\begin{aligned} \int _{Q}g(x,t)\eta \,dx\,dt&\approx \varDelta h\int _{0}^T \sum _{k\in \varOmega _h} g^k(t)\eta ^k(t)\,dt. \end{aligned}$$
(25)

Here \(b^k(t)\), \(f^k(t)\), \(\varphi ^k(t)\), \(g^k(t)\) and \(a_i^{k+\frac{e_i}{2}}(t)\) are approximations to the functions b(x,t), f(x,t), \(\varphi (x,t)\), g(x,t) and \(a_i(x,t)\) at the grid point \(x^k\). More precisely, if these functions are continuous at \(x^k\), we take their values there as approximations, with \(a_i^{k+\frac{e_i}{2}}(t)=a_i(x^{k+\frac{e_i}{2}},t)\). Otherwise, we take

$$\begin{aligned} b^k(t)=\frac{1}{|\omega (k)|}\int _{\omega (k)}b(x,t)dx,\quad \quad f^k(t)=\frac{1}{|\omega (k)|}\int _{\omega (k)}f(x,t)dx,\\ \varphi ^k(t)=\frac{1}{|\omega (k)|}\int _{\omega (k)}\varphi (x,t)dx,\quad \quad g^k(t)=\frac{1}{|\omega (k)|}\int _{\omega (k)}g(x,t)dx, \end{aligned}$$

and

$$a_i^{k+\frac{e_i}{2}}(t)=\frac{1}{|\omega (k)|}\int _{\omega (k)}a_i(x,t)dx.$$
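A small sketch of this cell-averaging step (Python; the helper name and the three-point sampling rule are our choices, assuming func is vectorized over x):

```python
import numpy as np

def cell_average(func, x_grid, h, t):
    """Approximate b^k(t) = |omega(k)|^{-1} * int_{omega(k)} b(x,t) dx on the
    cells omega(k) = (x^k - h/2, x^k + h/2) by averaging three samples."""
    offsets = np.array([-0.5, 0.0, 0.5]) * h  # crude quadrature; refine as needed
    return np.mean([func(x_grid + s, t) for s in offsets], axis=0)
```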

With the approximations (21)–(25), we have the following discrete analogue of (13)

$$\begin{aligned} \int _{0}^T\Bigg [\sum _{k\in \varOmega _h}\left( \frac{d u^k}{d t} + b^k u^k - f^k\varphi ^k - g^k\right) \eta ^k +\sum \limits _{i=1}^n\sum _{k\in \varOmega _{h}^i}a_i^{k+\frac{e_i}{2}}u_{x_i}^k\eta _{x_i}^k \Bigg ]\,dt=0. \end{aligned}$$
(26)

We note that, using the discrete analogue of integration by parts with boundary condition \(u^0=\eta ^0=0\) and \(u^{N_i}=\eta ^{N_i}=0\), we obtain

$$\begin{aligned} \sum _{k\in \varOmega _{h}^i}a_i^{k+\frac{e_i}{2}} u_{x_i}^k\eta _{x_i}^k&=\sum _{k\in \varOmega _{h}^i}a_i^{k+\frac{e_i}{2}}\frac{u^{k+e_i}-u^k}{h_i}\frac{\eta ^{k+e_i}-\eta ^k}{h_i}\nonumber \\&=\sum _{k\in \varOmega _{h}^i}a_i^{k+\frac{e_i}{2}}\frac{u^{k+e_i}-u^k}{h_i^2}\eta ^{k+e_i}-\sum _{k\in \varOmega _{h}^i}a_i^{k+\frac{e_i}{2}}\frac{u^{k+e_i}-u^k}{h_i^2}\eta ^k\\&=\sum _{k\in \varOmega _{h}}\left( a_i^{k-\frac{e_i}{2}}\frac{u^{k}-u^{k-e_i}}{h_i^2}-a_i^{k+\frac{e_i}{2}}\frac{u^{k+e_i}-u^k}{h_i^2}\right) \eta ^k.\nonumber \end{aligned}$$

Hence, substituting this equality into (26), we obtain the following system which approximates the original problem (7)

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{d \bar{u}}{d t} +(\varLambda _1+\cdots +\varLambda _n)\bar{u} - \bar{F} = 0,\\ \bar{u}(0) = \bar{v}, \end{array}\right. } \end{aligned}$$
(27)

with \(\bar{u} = \{u^k, k\in \varOmega _h\}\) being the grid function. The function \(\bar{v}\) is the grid function approximating the initial condition v and

$$\begin{aligned} (\varLambda _i \bar{u})^k=\frac{b^k u^k}{n}+ {\left\{ \begin{array}{ll} \frac{a_{i}^{k-\frac{e_i}{2}}}{h_i^2}\bigl (u^k-u^{k-e_i}\bigl ) - \frac{a_{i}^{k+\frac{e_i}{2}}}{h_i^2}\bigl (u^{k+e_i}-u^k \bigl ), 2\le k_i\le N_i-2,\\ \frac{a_{i}^{k-\frac{e_i}{2}}}{h_i^2} u^k - \frac{a_{i}^{k+\frac{e_i}{2}}}{h_i^2}\bigl (u^{k+e_i}-u^k \bigl ), k_i=1,\\ \frac{a_{i}^{k-\frac{e_i}{2}}}{h_i^2}\bigl (u^k-u^{k-e_i}\bigl )+\frac{a_{i}^{k+\frac{e_i}{2}}}{h_i^2} u^k, k_i=N_i-1 \end{array}\right. } \end{aligned}$$

for \(k\in \varOmega _h\) and

$$\begin{aligned} \bar{F} =\{f^k\varphi ^k+g^k, k\in \varOmega _h\}. \end{aligned}$$
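To make the construction concrete, here is a minimal sketch (Python, 1-D case, so the factor \(b^k/n\) reduces to \(b^k\); the helper name and argument layout are ours) of assembling \(\varLambda \) as a sparse matrix:

```python
import scipy.sparse as sp

def assemble_lambda(a_half, b, h):
    """1-D matrix Lambda with (Lambda u)^k = b^k u^k
    + [a^{k-1/2}(u^k - u^{k-1}) - a^{k+1/2}(u^{k+1} - u^k)]/h^2
    and homogeneous Dirichlet conditions; a_half[j] holds the coefficient
    at the j-th cell face, so len(a_half) == len(b) + 1."""
    lo, hi = a_half[:-1], a_half[1:]   # a^{k-1/2}, a^{k+1/2}
    main = b + (lo + hi) / h**2
    off = -a_half[1:-1] / h**2         # symmetric coupling to neighbors
    return sp.diags([off, main, off], [-1, 0, 1])
```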

We note that the coefficient matrices \(\varLambda _i\) are positive semi-definite (see, e.g., [19]). The boundedness of the solution of (27) is shown in the following theorem.

Theorem 2

Let \(\bar{u}\) be a solution of the Cauchy problem (27) and let \(\bar{f}^k := f^k\varphi ^k+g^k\) denote the components of the right-hand side \(\bar{F}\). There exists a constant c independent of h and the coefficients of the equation such that

$$\begin{aligned} \max _{t \in [0,T]} \sum \limits _{k\in \bar{\varOmega }_h} |\bar{u}^k(t)|^2 + \int _{0}^{T} \sum _{i=1}^n\sum _{k\in {\varOmega }_{h}^i} |\bar{u}_{x_i}^k|^2 dt \le c \left( \int _0^{T}\sum _{k\in \bar{\varOmega }_h} |\bar{f}^k|^2dt + \sum \limits _{k\in \bar{\varOmega }_h} |\bar{v}^k|^2\right) . \end{aligned}$$
(28)

Proof

For arbitrary \(t^*\in (0,T]\), set

$$\begin{aligned} \bar{\eta }^k(t)= {\left\{ \begin{array}{ll} \bar{u}^k(t)\quad &{}\text {if } t\in [0,t^*],\\ 0\quad &{}\text {if } t\notin [0,t^*]. \end{array}\right. } \end{aligned}$$

Since

$$\begin{aligned} \int _0^{t^*}dt\sum _{k\in \overline{\varOmega }_h}\bar{u}_t^k(t)\bar{u}^k(t)=\frac{1}{2}\sum \limits _{k\in \bar{\varOmega }_h} |\bar{u}^k(t^*)|^2 -\frac{1}{2}\sum \limits _{k\in \bar{\varOmega }_h} |\bar{u}^k(0)|^2, \end{aligned}$$

and \(\bar{u}^k(0) = \bar{v}^k\), it follows from (26) that

$$\begin{aligned} \begin{aligned}&\frac{1}{2}\sum \limits _{k\in \bar{\varOmega }_h} |\bar{u}^k(t^*)|^2+\int _{0}^{t^*}\left[ \sum _{k\in \bar{\varOmega }_h} \bar{b}^k|u^k|^2+\sum _{i=1}^n\sum _{k\in {\varOmega }_{h}^i}\bar{a}_{i}^{k}|\bar{u}_{x_i}^k|^2\right] dt\\ =&\int _0^{t^*}\sum _{k\in \bar{\varOmega }_h} \bar{f}^k \bar{u}^kdt+\frac{1}{2}\sum \limits _{k\in \bar{\varOmega }_h} |\bar{v}^k|^2. \end{aligned} \end{aligned}$$
(29)

Multiplying both sides of the equality (29) by 2, applying Cauchy's inequality to the first term on the right-hand side, and noting that \(b^k\ge 0\), we obtain

$$\begin{aligned} \begin{aligned}&\sum \limits _{k\in \bar{\varOmega }_h} |\bar{u}^k(t^*)|^2 +2\int _{0}^{t^*}\sum _{i=1}^n\sum _{k\in {\varOmega }_{h}^i}\bar{a}_{i}^{k} |\bar{u}_{x_i}^k|^2dt\\ \le&\int _0^{t^*}\sum _{k\in \bar{\varOmega }_h} |\bar{f}^k|^2dt +\int _0^{t^*}\sum _{k\in \bar{\varOmega }_h}|\bar{u}^k|^2dt + \sum \limits _{k\in \bar{\varOmega }_h} |\bar{v}^k|^2. \end{aligned} \end{aligned}$$
(30)

Put

$$ y(t) = \sum \limits _{k\in \bar{\varOmega }_h} |\bar{u}^k(t)|^2. $$

From (30) we have

$$ y(t^*) \le \int _0^{t^*} y(t) dt + \int _0^{t^*}\sum _{k\in \bar{\varOmega }_h} |\bar{f}^k|^2dt + \sum \limits _{k\in \bar{\varOmega }_h} |\bar{v}^k|^2. $$

Applying Gronwall’s inequality, we obtain

$$\begin{aligned} y(t^*) \le \left( \int _0^{t^*}\sum _{k\in \bar{\varOmega }_h} |\bar{f}^k|^2dt + \sum \limits _{k\in \bar{\varOmega }_h} |\bar{v}^k|^2\right) e^{t^*}. \end{aligned}$$
(31)

Hence, we have

$$\begin{aligned} \max _{t \in [0,T]} \sum \limits _{k\in \bar{\varOmega }_h} |\bar{u}^k(t)|^2 \le c \left( \int _0^{T}\sum _{k\in \bar{\varOmega }_h} |\bar{f}^k|^2dt + \sum \limits _{k\in \bar{\varOmega }_h} |\bar{v}^k|^2\right) . \end{aligned}$$

From the conditions (1), (2) and (3) on the coefficients \(a_i\) and the inequalities (30) and (31), we have

$$\begin{aligned} \int _{0}^{T}\left( \sum \limits _{k\in \bar{\varOmega }_h} |\bar{u}^k(t)|^2 + \sum _{i=1}^n\sum _{k\in {\varOmega }_{h}^i} |\bar{u}_{x_i}^k|^2\right) dt \le c \left( \int _0^{T}\sum _{k\in \bar{\varOmega }_h} |\bar{f}^k|^2dt + \sum \limits _{k\in \bar{\varOmega }_h} |\bar{v}^k|^2\right) . \end{aligned}$$

Combining the two inequalities, we obtain the inequality (28).\(\square \)

3.1 Time Discretization

To obtain the finite difference scheme for (27), we divide the time interval [0, T] into M sub-intervals by the points \(t_i, i = 0,\dots , M, t_0=0, t_1=\varDelta t,\dots , t_{M} = M\varDelta t=T.\) To simplify the notation, we set \(u^{k,m} := u^k(t_m)\). We also denote \(F^{k,m} :=F^k(t_m)\) and \(\varLambda _i^m=\varLambda _i(t_m)\), \(m=0,\dots , M\). In the following, we drop the spatial index. The finite difference scheme is written as follows

$$\begin{aligned} {\left\{ \begin{array}{ll} u^{m+1}=u^m+\varDelta t[F^m-(\varLambda ^m_1+\cdots +\varLambda ^m_n)u^m],\\ u^0=\bar{v}. \end{array}\right. } \end{aligned}$$
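As a sketch, one step of this explicit scheme in Python (lambdas holds the matrices \(\varLambda _i^m\)); note that the explicit scheme is only conditionally stable, which is one motivation for the implicit splitting scheme of the next subsection:

```python
def euler_step(u, F, lambdas, dt):
    """u^{m+1} = u^m + dt*(F^m - (Lambda_1^m + ... + Lambda_n^m) u^m)."""
    return u + dt * (F - sum(L @ u for L in lambdas))
```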

3.2 Splitting Method

In order to obtain a splitting scheme for the Cauchy problem (27), we discretize the time interval in the same way as for the finite difference method. We denote \(u^{m+\delta } :=\bar{u}(t_m+\delta \varDelta t)\) and \(\varLambda _i^m :=\varLambda _i(t_{m}+\varDelta t/2).\) We introduce the following implicit two-cycle component-by-component splitting scheme [15]

$$\begin{aligned} \begin{aligned}&\frac{u^{m+\frac{i}{2n}}-u^{m+\frac{i-1}{2n}}}{\varDelta t} +\varLambda _i^m \frac{u^{m+\frac{i}{2n}}+u^{m+\frac{i-1}{2n}}}{4}=0,\quad i = 1, 2,\dots , n-1,\\&\frac{u^{m+\frac{1}{2}}-u^{m+\frac{n-1}{2n}}}{\varDelta t} +\varLambda _n^m \frac{u^{m+\frac{1}{2}}+u^{m+\frac{n-1}{2n}}}{4}= \frac{F^m}{2}+\frac{\varDelta t}{8}\varLambda ^m_nF^m,\\&\frac{u^{m+\frac{n+1}{2n}}-u^{m+\frac{1}{2}}}{\varDelta t}+ \varLambda _n^m \frac{u^{m+\frac{n+1}{2n}}+u^{m+\frac{1}{2}}}{4}= \frac{F^m}{2}-\frac{\varDelta t}{8}\varLambda ^m_nF^m,\\&\frac{u^{m+1-\frac{i-1}{2n}}-u^{m+1-\frac{i}{2n}}}{\varDelta t}+ \varLambda _i^m \frac{u^{m+1-\frac{i-1}{2n}}+u^{m+1-\frac{i}{2n}}}{4}=0, \quad i = n-1, n-2, \dots , 1,\\&u^0=\bar{v}. \end{aligned} \end{aligned}$$
(32)

Equivalently,

$$\begin{aligned} \begin{aligned}&\left( E_i+\frac{\varDelta t}{4}\varLambda ^m_i\right) u^{m+\frac{i}{2n}}= \left( E_i-\frac{\varDelta t}{4}\varLambda ^m_i\right) u^{m+\frac{i-1}{2n}}, \quad i = 1, 2,\dots , n-1,\\&\left( E_n+\frac{\varDelta t}{4}\varLambda ^m_n\right) \left( u^{m+\frac{1}{2}}- \frac{\varDelta t}{2}F^m\right) =\left( E_n-\frac{\varDelta t}{4}\varLambda ^m_n\right) u^{m+\frac{n-1}{2n}},\\&\left( E_n+\frac{\varDelta t}{4}\varLambda ^m_n\right) u^{m+\frac{n+1}{2n}}= \left( E_n-\frac{\varDelta t}{4}\varLambda ^m_n\right) \left( u^{m+\frac{1}{2}}+\frac{\varDelta t}{2}F^m\right) ,\\&\left( E_i+\frac{\varDelta t}{4}\varLambda ^m_i\right) u^{m+1-\frac{i-1}{2n}}= \left( E_i-\frac{\varDelta t}{4}\varLambda ^m_i\right) u^{m+1-\frac{i}{2n}},\quad i = n-1, n-2, \dots , 1,\\&u^0=\bar{v}, \end{aligned} \end{aligned}$$
(33)

where \(E_i\) is the identity matrix corresponding to \(\varLambda _i, i=1,\dots , n\). The splitting scheme (33) can be rewritten in the following compact form

$$\begin{aligned} {\left\{ \begin{array}{ll} u^{m+1}=B^m u^{m}+\varDelta tC^m(f^m\varphi ^m+g^m),\quad m =0,\dots ,M-1,\\ u^0=\bar{v}, \end{array}\right. } \end{aligned}$$
(34)

with

$$\begin{aligned} \begin{aligned} B^m=B_1^m\cdots B_n^m B_n^m\cdots B_1^m,\qquad C^m=C_1^m \cdots C_n^m, \end{aligned} \end{aligned}$$

where \(B_i^m := (E_i+\frac{\varDelta t}{4}\varLambda _i^m)^{-1}(E_i-\frac{\varDelta t}{4}\varLambda _i^m)\) and \(C_i^m := B_i^m\), \(i=1,\dots , n.\)
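A minimal sketch of one step of the scheme (33)/(34) (Python with SciPy sparse solves; the function name is ours, and for efficiency one would factorize each \(E_i+\frac{\varDelta t}{4}\varLambda _i^m\) once per step instead of calling spsolve repeatedly):

```python
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def splitting_step(u, F, lambdas, dt):
    """One two-cycle splitting step for the flattened grid vector u:
    forward sweeps with B_i = (E + dt/4*L_i)^{-1}(E - dt/4*L_i), the source
    entering around the two middle (n-th) sweeps, then backward sweeps."""
    E = sp.identity(u.shape[0], format='csc')
    B = lambda L, w: spla.spsolve((E + dt/4*L).tocsc(), (E - dt/4*L) @ w)
    for L in lambdas[:-1]:              # i = 1, ..., n-1
        u = B(L, u)
    u = B(lambdas[-1], u) + dt/2 * F    # u^{m+1/2}
    u = B(lambdas[-1], u + dt/2 * F)    # u^{m+(n+1)/(2n)}
    for L in lambdas[-2::-1]:           # i = n-1, ..., 1
        u = B(L, u)
    return u
```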

3.3 Discretized Variational Problem

To complete the variational method for multi-dimensional cases, we use the splitting method for the forward problem and take the discretized functional

$$\begin{aligned} J_0^{h,\varDelta t}(\bar{f}):=\frac{\varDelta t}{2}\sum _{i=1}^N\sum _{m=1}^M\left[ \varDelta h\sum _{k\in \varOmega _{h}}\omega _i^ku^{k,m}(\bar{f})-z_i^m\right] ^2, \end{aligned}$$
(35)

where \(u^{k,m}(\bar{f})\) indicates the dependence of the solution on the right-hand side term \(\bar{f}\) and m is the index of the grid points on the time axis. The notation \(\omega _i^k=\omega _i(x^k)\) indicates the approximation of the function \(\omega _i(x)\) in \(\varOmega _h\) at the points \(x^k\). Normally, we take \(\omega _i^k\) as the average of \(\omega _i\) over the cell \(\omega (k)\) in which \(x^k\) is located.

To minimize (35) by the conjugate gradient method, we first calculate the gradient of the objective function \(J_0^{h,\varDelta t}(\bar{f})\); it is given by the following theorem.

Theorem 3

The gradient \(\nabla J_0^{h,\varDelta t}(\bar{f})\) of the objective function \(J_0^{h,\varDelta t}\) at \(\bar{f}\) is given by

$$\begin{aligned} \nabla J_0^{h,\varDelta t}(\bar{f}) = \varDelta t\sum _{m=0}^{M-1}(C^m)^*\varphi ^m\eta ^m, \end{aligned}$$
(36)

where \(\eta \) satisfies the adjoint problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \eta ^m=(B^{m+1})^*\eta ^{m+1}+\psi ^{m+1},\quad m=M-1,M-2, \dots , 0,\\ \eta ^{M}=0, \end{array}\right. } \end{aligned}$$
(37)

with

$$\psi ^{k,m}=\varDelta h\sum _{i=1}^N\omega _i^k\left( \varDelta h\sum \limits _{k'\in \varOmega _{h}}\omega _i^{k'}u^{k',m}-z_i^m\right) ,\ k\in \varOmega _h,\ m=0,\dots ,M.$$

Here the matrix \((B^{m})^*\) is given by

$$\begin{aligned} \begin{aligned} (B^m)^* =&\left( E_1-\frac{\varDelta t}{4}\varLambda _1^m\right) \left( E_1+\frac{\varDelta t}{4}\varLambda _1^m\right) ^{-1}\dots \left( E_n-\frac{\varDelta t}{4}\varLambda _n^m\right) \left( E_n+\frac{\varDelta t}{4}\varLambda _n^m\right) ^{-1}\\&\times \left( E_n-\frac{\varDelta t}{4}\varLambda _n^m\right) \left( E_n+\frac{\varDelta t}{4}\varLambda _n^m\right) ^{-1}\dots \left( E_1-\frac{\varDelta t}{4}\varLambda _1^m\right) \left( E_1+\frac{\varDelta t}{4}\varLambda _1^m\right) ^{-1}. \end{aligned} \end{aligned}$$

Proof

For an infinitesimally small variation \(\delta \bar{f}\) of \(\bar{f}\), we have from (35) that

$$\begin{aligned} J_0^{h,\varDelta t}(\bar{f}+\delta \bar{f})-J_0^{h,\varDelta t}(\bar{f})&= \frac{\varDelta t}{2}\sum _{i=1}^N\sum _{m=1}^{M}\left[ \varDelta h\sum _{k\in \varOmega _{h}}\omega _i^ku^{k,m}(\bar{f}+\delta \bar{f})-z_i^m\right] ^2\\&\quad -\frac{\varDelta t}{2}\sum _{i=1}^N\sum _{m=1}^{M}\left[ \varDelta h\sum _{k\in \varOmega _{h}}\omega _i^ku^{k,m}(\bar{f})-z_i^m\right] ^2\\&= \frac{\varDelta t}{2}\sum _{i=1}^N\sum _{m=1}^{M}\Bigl (\varDelta h\sum _{k\in \varOmega _{h}}\omega _i^kw^{k,m}\Bigr )^2\\&\quad +\varDelta t\sum _{i=1}^N\sum _{m=1}^{M}\varDelta h\sum _{k\in \varOmega _{h}}\omega _i^kw^{k,m}\left[ \varDelta h\sum _{k'\in \varOmega _{h}}\omega _i^{k'}u^{k',m}(\bar{f})-z_i^m\right] \\&= \frac{\varDelta t}{2}\sum _{i=1}^N\sum _{m=1}^{M}\Bigl (\varDelta h\sum _{k\in \varOmega _{h}}\omega _i^kw^{k,m}\Bigr )^2+\varDelta t\sum _{i=1}^N\sum _{m=1}^{M}\sum _{k\in \varOmega _{h}}w^{k,m}\psi _i^{k,m}\\&= \frac{\varDelta t}{2}\sum _{i=1}^N\sum _{m=1}^{M}\Bigl (\varDelta h\sum _{k\in \varOmega _{h}}\omega _i^kw^{k,m}\Bigr )^2+\varDelta t\sum _{i=1}^N\sum _{m=1}^{M}\langle w^m,\psi _i^m\rangle , \end{aligned}$$
(38)

where \(w^{k,m} := u^{k,m}(\bar{f}+\delta \bar{f})-u^{k,m}(\bar{f})\) and \(\psi _i^{k,m}={\varDelta h}\,\omega _i^k\bigl (\varDelta h\sum _{k'\in \varOmega _{h}}\omega _i^{k'}u^{k',m}-z_i^m\bigr ),\ k\in \varOmega _h.\)

It follows from (34) that w is the solution to the problem

$$\begin{aligned} {\left\{ \begin{array}{ll} w^{m+1}=B^mw^m+\varDelta tC^m\delta \bar{f}\varphi ^m,\quad m=0,\dots ,M-1,\\ w^0=0. \end{array}\right. } \end{aligned}$$
(39)

Taking the inner product of both sides of the mth equation of (39) with an arbitrary vector \(\eta ^m\in \mathbb R^{N_1\times \dots \times N_n}\), summing the results over \(m=0,\dots , M-1\), we obtain

$$\begin{aligned} \begin{aligned} \sum _{m=0}^{M-1}\langle w^{m+1},\eta ^m\rangle&=\sum _{m=0}^{M-1}\langle B^mw^m,\eta ^m\rangle +\sum _{m=0}^{M-1}\langle \varDelta tC^m\delta \bar{f}\varphi ^m,\eta ^m\rangle \\&=\sum _{m=0}^{M-1}\langle w^m,\bigl (B^m\bigl )^*\eta ^m\rangle +\sum _{m=0}^{M-1}\langle \varDelta tC^m\delta \bar{f}\varphi ^m,\eta ^m\rangle . \end{aligned} \end{aligned}$$
(40)

Here \(\bigl (B^m\bigl )^*\) is the adjoint matrix of \(B^m\).

Taking the inner product of both sides of the first equation of (37) with an arbitrary vector \(w^{m+1}\), summing the results over \(m=0,\dots , M-1\), we obtain

$$\begin{aligned} \begin{aligned} \sum _{m=0}^{M-1}\langle w^{m+1},\eta ^m\rangle&=\sum _{m=0}^{M-1}\langle w^{m+1},(B^{m+1})^*\eta ^{m+1}\rangle +\sum _{m=0}^{M-1}\langle w^{m+1},\psi ^{m+1}\rangle \\&=\sum _{m=1}^{M}\langle w^{m},(B^{m})^*\eta ^{m}\rangle +\sum _{m=1}^{M}\langle w^{m},\psi ^{m}\rangle . \end{aligned} \end{aligned}$$
(41)

Noting that \(w^0=\eta ^M=0\), we obtain from (40) and (41) that

$$\begin{aligned} \sum _{m=1}^{M}\langle w^m,\psi ^m\rangle = \sum _{m=0}^{M-1}\langle \varDelta tC^m\delta \bar{f}\varphi ^m,\eta ^m\rangle . \end{aligned}$$
(42)

On the other hand, it can be proved by induction that \(\sum _{i=1}^N\sum _{m=1}^{M}\bigl (\varDelta h\sum _{k\in \varOmega _h}\omega _i^kw^{k,m}\bigr )^2=o(\Vert \delta \bar{f}\Vert )\). Hence, from (38) and (42), we obtain

$$\begin{aligned}J_0^{h,\varDelta t}(\bar{f}+\delta \bar{f})-J_0^{h,\varDelta t}(\bar{f})=\sum _{m=0}^{M-1}(\delta \bar{f},\varDelta t(C^m)^*\varphi ^m\eta ^m)+o(\Vert \delta \bar{f}\Vert ). \end{aligned}$$

Consequently, the gradient of the objective function \(J_0^{h,\varDelta t}\) can be written as

$$\begin{aligned} \frac{\partial J_0^{h,\varDelta t}(\bar{f})}{\partial \bar{f}}=\varDelta t\sum _{m=0}^{M-1}(C^m)^*\varphi ^m\eta ^m. \end{aligned}$$

Note that, since the coefficient matrices \(\varLambda _i^m, i=1,\dots ,n, m=0,\dots ,M-1\) are symmetric, we have

$$\begin{aligned} \begin{aligned} (B^m)^*=&\left( E_1-\frac{\varDelta t}{4}\varLambda _1^m\right) \left( E_1+\frac{\varDelta t}{4}\varLambda _1^m\right) ^{-1}\dots \left( E_n-\frac{\varDelta t}{4}\varLambda _n^m\right) \left( E_n+\frac{\varDelta t}{4}\varLambda _n^m\right) ^{-1}\\ {}&\times \left( E_n-\frac{\varDelta t}{4}\varLambda _n^m\right) \left( E_n+\frac{\varDelta t}{4}\varLambda _n^m\right) ^{-1}\dots \left( E_1-\frac{\varDelta t}{4}\varLambda _1^m\right) \left( E_1+\frac{\varDelta t}{4}\varLambda _1^m\right) ^{-1} \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} (C^m)^*&=\left( E_n-\frac{\varDelta t}{4}\varLambda _n^m\right) \left( E_n+\frac{\varDelta t}{4}\varLambda _n^m\right) ^{-1}\dots \left( E_1-\frac{\varDelta t}{4}\varLambda _1^m\right) \left( E_1+\frac{\varDelta t}{4}\varLambda _1^m\right) ^{-1}. \end{aligned} \end{aligned}$$

The proof is complete.\(\square \)
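In practice, the adjoint-based gradient (36) is easy to validate against finite differences. A generic sketch (Python; J and grad_J stand for the user's implementations of the discrete functional and of formula (36)):

```python
import numpy as np

def gradient_check(J, grad_J, f, eps=1e-6, rng=np.random.default_rng(2)):
    """Compare <grad_J(f), df> with a central finite difference of J along
    a random direction df; the returned relative error should be small."""
    df = rng.standard_normal(f.shape)
    lhs = np.vdot(grad_J(f), df).real
    rhs = (J(f + eps * df) - J(f - eps * df)) / (2 * eps)
    return abs(lhs - rhs) / max(abs(rhs), 1e-14)
```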

The conjugate gradient method for the discretized functional (35) proceeds in the following steps:

Step 1. Choose an initial approximation \(f^0\), calculate the residual \(\hat{r}^0=\sum _{i=1}^N[l_iu(f^0)-z_i]\) by solving the splitting scheme (32) with f replaced by the initial approximation \(f^0\), and set \(k=0\).

Step 2. Calculate the gradient \(r^0=-\nabla J_{\gamma }(f^0)\) given in (36) by solving the adjoint problem (37). Then we set \(d^0=r^0.\)

Step 3. Calculate

$$\begin{aligned} \alpha ^0=\frac{\Vert r^0\Vert ^2}{\sum \limits _{i=1}^N\Vert l_id^0\Vert ^2 + \gamma \Vert d^0\Vert ^2}, \end{aligned}$$

where \(l_id^0\) can be calculated from the splitting scheme (32) with f being replaced by \(d^0\) and \(g(x,t)=0, \ v=0\). Then, we set

$$\begin{aligned} f^1=f^0+\alpha ^0d^0. \end{aligned}$$

Step 4. For \(k=1,2,\dots \), calculate \(r^k=-\nabla J_\gamma (f^k), d^k=r^k+\beta ^{k}d^{k-1},\) where

$$\begin{aligned} \beta ^{k}=\frac{\Vert r^k\Vert ^2}{\Vert r^{k-1}\Vert ^2}. \end{aligned}$$

Step 5. Calculate \(\alpha ^k\)

$$\begin{aligned} \alpha ^k=\frac{\Vert r^k\Vert ^2}{\sum \limits _{i=1}^N\Vert l_id^k\Vert ^2 + \gamma \Vert d^k\Vert ^2}, \end{aligned}$$

where \(l_id^k\) can be calculated from the splitting scheme (32) with f being replaced by \(d^k\) and \(g(x,t)=0, \ v=0\). Then, set

$$f^{k+1}=f^k+\alpha ^k d^k.$$

4 Numerical Examples

To illustrate the performance of the proposed algorithm, we present in this section some numerical tests. These algorithms were implemented in Matlab and run on a personal laptop with an 11th Gen Intel(R) Core(TM) i5 CPU at 2.40 GHz (4 cores, 8 logical processors).

4.1 One-Dimensional Problems

In this subsection, we present some numerical examples to estimate singular values and determine f. Let \(\varOmega =(0,1)\) and \(T=1\). Consider the one-dimensional system

$$\begin{aligned} {\left\{ \begin{array}{ll} u_t-(au_{x})_{x}=f\varphi (x,t)+g(x,t),\ x\in (0,1), 0\le t\le 1,\\ u(0,t)=u(1,t)=0,\ 0\le t\le 1,\\ u(x,0)=v(x), \ x\in (0,1), \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} a=2xt+x^2t+1,\quad v=\sin (2\pi x) \quad \text {and}\quad \varphi (x,t)=(x^2+1)(t^2+1). \end{aligned}$$

For the discretization, we take the grid size to be 0.02 in both x and t. We take three observations at \(x^{10}=0.2\), \(x^{25}=0.5\) and \(x^{35}=0.7\). The weight functions \(\omega _i(x), i =1, 2, 3\), are chosen as follows

$$\begin{aligned} \omega _1(x) = {\left\{ \begin{array}{ll} \frac{1}{2\varepsilon }&{} \text{ if } x \in (x^{10} - \varepsilon , x^{10}+ \varepsilon )\\ 0 &{} \text{ otherwise } \end{array}\right. } \quad \text {with~} \varepsilon =0.01, \end{aligned}$$
$$\begin{aligned} \omega _2(x) = {\left\{ \begin{array}{ll} \frac{1}{2\varepsilon }&{} \text{ if } x \in (x^{25} - \varepsilon , x^{25}+ \varepsilon )\\ 0 &{} \text{ otherwise } \end{array}\right. } \quad \text {with~} \varepsilon =0.01, \end{aligned}$$
$$\begin{aligned} \omega _3(x) = {\left\{ \begin{array}{ll} \frac{1}{2\varepsilon }&{} \text{ if } x \in (x^{35} - \varepsilon , x^{35}+ \varepsilon )\\ 0 &{} \text{ otherwise } \end{array}\right. } \quad \text {with~} \varepsilon =0.01. \end{aligned}$$
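On the grid, each \(\omega _i\) and the corresponding observation \(l_iu\) can be realized as follows (a Python sketch with our own helper names; note that with \(\varepsilon =0.01\) smaller than the grid size 0.02, only the grid point at the center falls in the support, so \(l_iu(t)\approx u(x^{k_i},t)\), i.e., effectively a point observation):

```python
import numpy as np

def make_weight(x, center, eps=0.01):
    """omega(x) = 1/(2*eps) on (center-eps, center+eps) and 0 otherwise."""
    return np.where(np.abs(x - center) < eps, 1.0 / (2 * eps), 0.0)

def observe(omega, u, dx):
    """l u(t_m) ~ dx * sum_k omega(x^k) u^{k,m}; u has shape (n_x, n_t)."""
    return dx * (omega @ u)

x = np.linspace(0.0, 1.0, 51)  # grid size 0.02, as in the text
omegas = [make_weight(x, c) for c in (0.2, 0.5, 0.7)]
```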

Approximate singular values of \(\mathcal A\) for the cases where f depends only on the time variable t and only on the space variable x are shown in Fig. 1. From this figure, we see that the singular values for the case where f depends only on x are much smaller than those for the case where f depends only on t. Therefore, the problem of reconstructing \(f=f(x)\) is much more ill-posed than that of reconstructing \(f=f(t)\).

Fig. 1 Approximate singular values: (a) f depends only on x; (b) f depends only on t

Now we present numerical results for reconstructing f(x,t). We test three types of f(x,t) in the following examples: smooth, non-smooth and discontinuous.

Example 1

$$f(x,t)=\sin (\pi x)\sin (\pi t).$$

Example 2

$$ f(x,t)= {\left\{ \begin{array}{ll} 2t \text{ if } \ t\le 1/2 \text {\ and}\ t\le x\text {\ and}\ x\le 1-t,\\ 2(1-t) \text{ if } \ t\ge 1/2 \text {\ and}\ t\ge x\text {\ and}\ x\ge 1-t,\\ 2x \text{ if } \ x\le 1/2 \text {\ and}\ x\le t\text {\ and}\ t\le 1-x,\\ 2(1-x) \text{ otherwise. } \end{array}\right. } $$

Example 3

$$\begin{aligned} f(x,t)= {\left\{ \begin{array}{ll} 1,\quad \ 0.25\le x,t\le 0.75,\\ 0\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

In all three examples above, the a priori estimate is \(f^*=0.02(\text {rand}(N_x,M)-0.5)+f\), the noise level is \(\delta =0.02\), \(\gamma =10^{-2}\), and the initial iterate of the conjugate gradient method is \(f^0=0\). Numerical solutions are presented in Figs. 2, 3 and 4.

Fig. 2 Example 1. The exact solution in comparison with the numerical solution: (a) Exact function f(x,t); (b) Reconstruction of f; (c) Comparison of the exact and approximate solutions at \(x =0.24\); (d) Comparison of the exact and approximate solutions at \(x =0.5\)

Fig. 3 Example 2. The exact solution in comparison with the numerical solution: (a) Exact function f(x,t); (b) Reconstruction of f; (c) Comparison of the exact and approximate solutions at \(x =0.24\); (d) Comparison of the exact and approximate solutions at \(x =0.5\)

Fig. 4 Example 3. The exact solution in comparison with the numerical solution: (a) Exact function f(x,t); (b) Reconstruction of f; (c) Comparison of the exact and approximate solutions at \(x =0.24\); (d) Comparison of the exact and approximate solutions at \(x =0.5\)

4.2 Two-Dimensional Problems

We consider the domain \(\varOmega = (0,1)\times (0,1)\), \(T=1\), and denote the space variable by \(x=(x_1,x_2)\). We take four observations distributed over the four subdomains \((0,0.5)\times (0,0.5)\), \((0.5,1)\times (0,0.5)\), \((0.5,1)\times (0.5,1)\) and \((0,0.5)\times (0.5,1)\).

Consider the system

$$\begin{aligned} {\left\{ \begin{array}{ll} u_t-(a_1u_{x_1})_{x_1}-(a_2u_{x_2})_{x_2}+a(x,t)u=f\varphi (x,t)+g(x,t),\ (x,t)\in Q,\\ u(0,x_2,t)=u(1,x_2,t)=u(x_1,0,t)=u(x_1,1,t)=0,\ 0<t\le T,\\ u(x,0)=v(x),\ x\in \varOmega . \end{array}\right. } \end{aligned}$$

The grid sizes are chosen as 0.02 in x and in t. The weight functions \(\omega _i(x), \ i=1, 2, 3, 4\), are chosen as follows

$$\begin{aligned} \omega _1(x) = {\left\{ \begin{array}{ll} \frac{1}{2\varepsilon }&{} \text{ if } x \in (0.24 - \varepsilon , 0.24+ \varepsilon )\times (0.24- \varepsilon , 0.24+ \varepsilon )\\ 0 &{} \text{ otherwise } \end{array}\right. } \quad \text {with~} \varepsilon =0.01, \end{aligned}$$
$$\begin{aligned} \omega _2(x) = {\left\{ \begin{array}{ll} \frac{1}{2\varepsilon }&{} \text{ if } x \in (0.74 - \varepsilon , 0.74+ \varepsilon )\times (0.24- \varepsilon , 0.24+ \varepsilon )\\ 0 &{} \text{ otherwise } \end{array}\right. } \quad \text {with~} \varepsilon =0.01, \end{aligned}$$
$$\begin{aligned} \omega _3(x) = {\left\{ \begin{array}{ll} \frac{1}{2\varepsilon }&{} \text{ if } x \in (0.24 - \varepsilon , 0.24+ \varepsilon )\times (0.74- \varepsilon , 0.74+ \varepsilon )\\ 0 &{} \text{ otherwise } \end{array}\right. } \quad \text {with~} \varepsilon =0.01, \end{aligned}$$
$$\begin{aligned} \omega _4(x) = {\left\{ \begin{array}{ll} \frac{1}{2\varepsilon }&{} \text{ if } x \in (0.74 - \varepsilon , 0.74+ \varepsilon )\times (0.74- \varepsilon , 0.74+ \varepsilon )\\ 0 &{} \text{ otherwise } \end{array}\right. } \quad \text {with~} \varepsilon =0.01. \end{aligned}$$

We test our algorithm for three cases: (1) \(f=f(t)\), (2) \(f=f(x)\) and (3) \(f=f(x,t)\).

Example 4

We choose the a priori estimate \(f^*=0\), the regularization parameter \(\gamma =10^{-2}\), \(f^0=0\), the noise level \(\delta =0.02\) and

$$\begin{aligned} a_1(x,t)&=a_2(x,t)=0.2\big (1-0.5\cos (3\pi x_1)\cos (3\pi x_2)\cos (3\pi t)\big ),\\ a&=x_1^2+x_2^2+2x_1t+1,\ v=\sin (\pi x_1)\sin (\pi x_2),\\ \varphi (x,t)&=(x_1^2+3)(x_2^2+3)(t^2+3). \end{aligned}$$

We suppose that f depends only on the time variable and has one of the following forms:

1)
$$\begin{aligned} f(t)=\sin (2\pi t). \end{aligned}$$
2)
$$f(t)={\left\{ \begin{array}{ll} 2t &{} \text {if}\ t<0.5,\\ 2(1-t) &{} \text {otherwise}. \end{array}\right. }$$
3)
$$f(t)={\left\{ \begin{array}{ll} 1 &{} \text {if}\ 0.25\le t\le 0.75,\\ 0 &{} \text {otherwise}. \end{array}\right. }$$

The numerical results of Example 4 are shown in Fig. 5.

Fig. 5 Example 4: the exact solution in comparison with the numerical solution: (a) f is of the form 1); (b) f is of the form 2); (c) f is of the form 3)

Example 5

We choose the a priori estimate \(f^*=0.02(\text {rand}(N_1,N_2)-0.5)+f\), the regularization parameter \(\gamma =10^{-2}\), \(f^0=0\), the noise level \(\delta =0.02\) and

$$\begin{aligned} a_1(x,t)&=a_2(x,t)=1,\quad a=x_1^2+x_2^2+2x_1t+1,\\ v&=\sin (\pi x_1)\sin (\pi x_2),\ \varphi (x,t)=(x_1^2+1)(x_2^2+2)(t^2+2). \end{aligned}$$

We suppose that f depends only on the space variable and has one of the following forms:

1)
$$\begin{aligned} f(x_1,x_2)=\sin (\pi x_1)\sin (\pi x_2). \end{aligned}$$
2)
$$f(x_1,x_2)={\left\{ \begin{array}{ll} 2x_2 &{} \text {if}\ x_2\le 0.5\ \text {and}\ x_2\le x_1\le 1-x_2,\\ 2(1-x_2)&{} \text {if}\ x_2\ge 0.5\ \text {and}\ x_2\ge x_1\ge 1-x_2,\\ 2x_1&{} \text {if}\ x_1\le 0.5\ \text {and}\ x_1\le x_2\le 1-x_1,\\ 2(1-x_1)&{} \text {otherwise}. \end{array}\right. }$$
3)
$$f(x_1,x_2)={\left\{ \begin{array}{ll} 1&{} \text {if}\ 0.25\le x_1\le 0.75 \ \text {and}\ 0.25\le x_2\le 0.75,\\ 0&{} \text {otherwise}. \end{array}\right. }$$

The numerical results of Example 5 are shown in Figs. 6, 7 and 8.

Fig. 6 Example 5, form 1): the exact solution in comparison with the numerical solution: (a) Exact function f; (b) Reconstruction of f; (c) Point-wise error; (d) Comparison at \(x_1 =1/2\)

Fig. 7 Example 5, form 2): the exact solution in comparison with the numerical solution: (a) Exact function f; (b) Reconstruction of f; (c) Point-wise error; (d) Comparison at \(x_1 =1/2\)

Fig. 8 Example 5, form 3): the exact solution in comparison with the numerical solution: (a) Exact function f; (b) Reconstruction of f; (c) Point-wise error; (d) Comparison at \(x_1 =1/2\)

Example 6

We choose the a priori estimate \(f^*=0.02(\text {rand}(N_1,N_2,M)-0.5)+f\), the regularization parameter \(\gamma =10^{-2}\), \(f^0=0\), the noise level \(\delta =0.02\) and

$$\begin{aligned} a_1(x,t)&=a_2(x,t)=0.5,\quad a=x_1^2+x_2^2+2x_1t+1,\\ v&=\sin (\pi x_1)\sin (\pi x_2),\ \varphi (x,t)=(x_1^2+2)(x_2^2+2)(t^2+2). \end{aligned}$$

We suppose that f depends on both the space and time variables as follows

$$\begin{aligned} f(x_1,x_2,t)=\sin (\pi x_1)\sin (\pi x_2)t. \end{aligned}$$

The results of Example 6 are shown in Fig. 9.

Fig. 9 Example 6. The exact solution in comparison with the numerical solution at \(t=1/2\): (a) Exact function f; (b) Reconstruction of f; (c) Point-wise error; (d) Comparison at \(x_1 =1/2\) and \(t=1/2\)

Fig. 10 Exact solution and its approximation with \(f^*=0, f^*=2, f^*=5\)

We now discuss the role of \(f^*\). We will see that its choice is important when the inverse problem has many solutions.

First, we assume that f depends only on the time variable; this guarantees the uniqueness of the solution to the inverse problem. We take several different values of \(f^*\); in this case, the choice of \(f^*\) does not much affect the numerical solution. The setting of this test is as in Example 4 (f depends only on the time variable), with regularization parameter \(\gamma =10^{-2}\), \(f^0=0\) and noise level \(\delta =0.02\). The numerical results with \(f^*=0\), \(f^*=2\) and \(f^*=5\), presented in Fig. 10 and Table 1, are not much different from each other.

Table 1 \(L^2\)-error with the predictions \(f^*=0, f^*=2, f^*=5\)
Fig. 11 The exact solution in comparison with the numerical solution with \(f^*=f^*_1\): (a) Exact solution; (b) Reconstruction of f; (c) Point-wise error

Fig. 12 The exact solution in comparison with the numerical solution with \(f^*=f^*_2\): (a) Exact solution; (b) Reconstruction of f; (c) Point-wise error

Fig. 13 The exact solution in comparison with the numerical solution with \(f^*=f^*_3\): (a) Exact solution; (b) Reconstruction of f; (c) Point-wise error

Table 2 \(L^2\)-error with the predictions \(f^*_1, f^*_2, f^*_3\)

When the solution is not unique, the choice of \(f^*\) is crucial. As mentioned above, there may be infinitely many solutions to the inverse problem, so the prediction \(f^*\) plays a significant role in selecting among them. We use the setting of Example 6 (f depends on both the time and space variables), with regularization parameter \(\gamma =10^{-2}\), \(f^0=0\) and noise level \(\delta =0.02\). By varying \(f^*\) near f, we see that the conjugate gradient method reconstructs the approximation closest to \(f^*\).

Fig. 14 The exact solution in comparison with its approximation with 9 observations: (a) \(f=\sin (2\pi t)\); (b) \(f={\left\{ \begin{array}{ll} 2t&{} \text {if}\ t<0.5,\\ 2(1-t)&{} \text {otherwise} \end{array}\right. }\); (c) \(f={\left\{ \begin{array}{ll} 1&{} \text {if}\ 0.25\le t\le 0.75,\\ 0&{} \text {otherwise}. \end{array}\right. }\)

Table 3 \(L^2\)-error with 3 observations and 9 observations

In this test, we choose \(f^*\) as

$$f^*_1=0.02\Big (\text {rand}(N_1,N_2,M)-0.5\Big )+f,$$
$$f^*_2=0.1\Big (\text {rand}(N_1,N_2,M)-0.5\Big )+f,$$
$$f^*_3=0.5\Big (\text {rand}(N_1,N_2,M)-0.5\Big )+f.$$

The numerical results are presented in Figs. 11, 12, 13 and Table 2. We can see that if \(f^*\) is not close to the exact f, the algorithm does not reconstruct the chosen f, but possibly another solution of the inverse problem.

In the last example, we test the case of more observations. The a priori estimate is \(f^*=0\), the noise level \(\delta =0.02\), the regularization parameter \(\gamma =10^{-2}\), \(f^0=0\), and \(a_1(x,t), a_2(x,t), a(x,t)\) and the initial condition v are chosen as in Example 4. The grid sizes are chosen as 0.02 in x and in t. We choose 9 observations in the subdomains \((0,0.34)\times (0,0.34)\), \((0,0.34)\times (0.34,0.68)\), \((0,0.34)\times (0.68,1)\), \((0.34,0.68)\times (0,0.34)\), \((0.34,0.68)\times (0.34,0.68)\), \((0.34,0.68)\times (0.68,1)\), \((0.68,1)\times (0,0.34)\), \((0.68,1)\times (0.34,0.68)\) and \((0.68,1)\times (0.68,1)\). The results for reconstructing f are shown in Fig. 14. The comparison of the errors with 3 observations and with 9 observations is presented in Table 3. We can see that the numerical results with 9 observations are better than those with 3 observations.