1 Introduction

Over the past decade, time-fractional differential equations have received considerable attention due to their potential applications in modeling physical phenomena that cannot be described by classical diffusion models (Uchaikin 2013). One such phenomenon is anomalous diffusion, also known as subdiffusion (Metzler and Klafter 2000). This type of diffusion has been observed in various transport processes, including those in porous media (Fomin et al. 2011), protein diffusion within cells, movement of material along fractals (Hatano and Hatano 1998), and turbulent fluids and plasmas (Kilbas et al. 2006). For more information, refer to Podlubny (1991) and the references therein.

Recently, several inverse problems related to fractional diffusion equations have been studied. These problems mostly take the form of backward problems (Abdel et al. 2022; Djennadi et al. 2021a, 2021b). In these problems, the goal is to determine or recover the source term or initial condition that leads to a known solution of the diffusion equation at a fixed time \(T > 0\). In this paper, we study a backward problem associated with the nonhomogeneous time-fractional diffusion problem:

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial ^{\alpha }_{t}u - \Delta u=f(\textbf{x},t), &{} (\textbf{x},t)\in \Omega \times (0,T), \\ u(\textbf{x},t)=0, &{} (\textbf{x},t)\in \partial \Omega \times (0,T), \\ u(\textbf{x},0)=g(\textbf{x}), &{} \textbf{x}\in \Omega , \end{array}\right. } \end{aligned}$$
(1)

where \(\Omega\) is a bounded domain in \(\mathbb {R}^{d}\) with sufficiently smooth boundary \(\partial \Omega\), \(T>0\) is a given final time, \(\alpha \in (0,1)\), and f is the source term, which is assumed to be known exactly. In Eq. (1), \(\Delta u\) denotes the spatial Laplacian of u, and \(\partial ^{\alpha }_{t}u\) stands for the Caputo time-fractional derivative of u of order \(\alpha\).

Finding the density u from a given source term f and initial distribution g is usually termed the forward problem. In many practical situations, however, we do not know the initial density of the diffusing substance, but we can measure (observe) the density at some positive time. It is therefore desirable to investigate the following important backward problem:

  • Given a noisy measurement \(q^{\delta }(\textbf{x})\) of the final data \(q(\textbf{x})=u(\textbf{x},T)\) satisfying

    $$\begin{aligned} \Vert q^\delta -q\Vert \le \delta , \end{aligned}$$
    (2)

    estimate the initial state \(g(\textbf{x})\).

Here \(\delta >0\) represents the noise level measured in the \(L^{2}\)-norm. Such an inverse problem can be used to recover the initial concentration of a contaminant (or the initial temperature profile in a heat conduction problem) in a sub-diffusive medium occupying the domain \(\Omega\), which is important, for example, in environmental engineering, hydrology, and physics. In such scenarios, knowledge of the initial distribution of a substance is essential for predicting how it contaminates porous media such as soil. The problem also arises in other disciplines, such as image deblurring. In image processing and computer vision, the final data \(q^{\delta }\) represents a blurred image, while g represents the original (sharp) image. The backward diffusion problem can therefore aid in reconstructing the original image from the observed, blurred one.

As previously mentioned, time-fractional differential equations have become increasingly popular due to their promising applications in several fields. As a result, they have been extensively studied, and their analytical aspects and numerical treatments are well developed. Readers interested in a comprehensive analysis of fractional differential equations can refer to Diethelm (2010), Meerschaert et al. (2009), Agrawal (2002), while (Jiang and Ma 2013; Alqhtani et al. 2022, 2023) provide recent numerical methods. For a recent account of the applications of fractional differential equations, see (Al-Jamel et al. 2018; Hengamian et al. 2022; Srivastava et al. 2022).

Recently, inverse problems related to time-fractional differential equations have been considered. Trong and Hai (2021) used the modified quasi-method to construct a stable approximation to the backward problem and gave optimal convergence rates in Hilbert scales. Al-Jamal (2017a) proved uniqueness and stability results concerning the reconstruction of the initial condition from interior measurements. Li and Guo (2013) considered the identification of the diffusion coefficient and the order of the fractional derivative from boundary data. Wang et al. (2013) considered time-fractional diffusion equations with variable coefficients and used Tikhonov regularization to solve the corresponding Fredholm integral equation. Wang and Liu (2013) used total variation regularization to solve the backward problem from given internal measurements. Deng and Yang (2014) used the idea of reproducing kernel approximation to reconstruct the unknown initial heat distribution from scattered measurements. Kokila and Nair (2020) utilized the Fourier truncation method to solve the nonhomogeneous time-fractional backward heat conduction problem. Yang et al. (2019) used the truncation regularization technique to solve the backward problem for the nonhomogeneous time-fractional diffusion-wave equation. Tuan et al. (2017) used the filter regularization method to determine the initial data from the final value with deterministic and random noise. See also (Al-Jamal et al. 2017; Djennadi et al. 2021b) for source identification problems, (Al-Jamal 2017b; Wang and Liu 2012) for backward problems, and (Jin and Rundell 2015; Djennadi et al. 2020) for other inverse problems in fractional differential equations.

Contrary to the forward problem, the backward problem is ill-posed in the sense that small perturbations in the final data result in large errors in the computed initial data. This instability behavior will be demonstrated in the sequel. Therefore, some regularization is required to obtain stable solutions. In this paper, we utilize Tikhonov regularization to tackle such instability of the backward problem. With the aid of the singular value expansion of the forward map, an explicit formula for the regularized solution will be provided. We will prove convergence results and derive convergence rates under both a priori and a posteriori parameter choice rules of the regularization parameter. The suggested method can be easily implemented, particularly for cubical domains where the fast Fourier transform can be used. The results of the numerical experiments are in good agreement with our theoretical analysis.

The current work makes a significant contribution by addressing an important inverse problem that has a wide range of scientific applications. Unlike previous works, the proposed method is not limited to one-dimensional domains and does not require a homogeneous source term. In addition, the method is characterized by its ease of applicability and implementation, as well as its robustness and computational speed.

The organization of this paper is as follows. In the next section, we set up notations and terminologies and lay out the necessary background material. In Sect. 3, we introduce the regularization technique and develop the main results. Section 4 is devoted to the practical implementation and numerical experiments.

2 Preliminaries

We use the notation \(L^{2}(\Omega )\) to denote the Hilbert space of square integrable functions on \(\Omega\) with inner product and norm given respectively by

$$\begin{aligned} (u,v)=\int _{\Omega }u(\textbf{x})v(\textbf{x}) d \textbf{x},\quad \Vert u\Vert =(u,u)^{\frac{1}{2}}. \end{aligned}$$
(3)

The Caputo time-fractional derivative of order \(\alpha \in (0,1)\) of u is defined by

$$\begin{aligned} \partial ^{\alpha }_{t}u(\textbf{x},t)=\frac{1}{\Gamma (1-\alpha )}\int _{0}^{t}(t-\xi )^{-\alpha }\dfrac{\partial u}{\partial t}(\textbf{x},\xi )\;d\xi , \end{aligned}$$
(4)

where \(\Gamma (\cdot )\) is the gamma function. The books (Kilbas et al. 2006; Podlubny 1991) provide excellent accounts regarding the history, theory, and applications of fractional calculus.
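For intuition, definition (4) can be discretized by product integration: on each subinterval the singular kernel is integrated exactly while \(\partial u/\partial t\) is frozen at the midpoint. The following sketch (the step count \(n=2000\) is an ad hoc choice, and the rule itself is only an illustration, not the scheme used later in the paper) reproduces the known formula \(\partial ^{\alpha }_{t}\, t = t^{1-\alpha }/\Gamma (2-\alpha )\):

```python
import math

def caputo(du, t, alpha, n=2000):
    # Caputo derivative (4) by product integration: the singular kernel
    # (t - xi)^(-alpha) is integrated exactly on each subinterval, and the
    # classical derivative du is frozen at the subinterval midpoint.
    acc, h = 0.0, t / n
    for j in range(n):
        a, b = j * h, (j + 1) * h
        w = ((t - a) ** (1 - alpha) - (t - b) ** (1 - alpha)) / (1 - alpha)
        acc += du((a + b) / 2) * w
    return acc / math.gamma(1 - alpha)

alpha = 0.5
# u(t) = t has classical derivative 1 and Caputo derivative t^(1-alpha)/Gamma(2-alpha)
approx = caputo(lambda s: 1.0, 1.0, alpha)
exact = 1.0 / math.gamma(2.0 - alpha)
```

For the linear function \(u(t)=t\) the quadrature is exact up to roundoff, since the frozen derivative is constant.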

The Mittag-Leffler function of index \((\alpha ,\beta )\) is defined by

$$\begin{aligned} E_{\alpha ,\beta }(z)=\sum _{k=0}^{\infty }\frac{z^k}{\Gamma (k \alpha +\beta )},\quad z\in \mathbb {C}, \end{aligned}$$
(5)

where \(\alpha >0\) and \(\beta >0\). The notation \(E_{\alpha }(z)\) will be used to denote \(E_{\alpha ,1}(z)\). The following relevant results can be found in Podlubny (1991) and Liu and Yamamoto (2010), respectively.

Lemma 2.1

Let \(\lambda >0\). We have

  1. \(\frac{d}{dt}\left[ E_{\alpha }(-\lambda t^\alpha )\right] =-\lambda t^{\alpha -1} E_{\alpha ,\alpha }(-\lambda t^\alpha )\), for \(\alpha >0\), \(t>0\).

  2. \(0<E_{\alpha }(-\lambda t)\le 1\), for \(0<\alpha <1\), \(t\ge 0\).

  3. \(\partial ^{\alpha }_{t}E_{\alpha }(-\lambda t^\alpha )=-\lambda E_{\alpha }(-\lambda t^\alpha )\), for \(0<\alpha <1\), \(t>0\).

  4. \(E_{1/2}(z)=e^{z^2}\text {Erfc}(-z)\).
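Definition (5) and the closed form \(E_{1/2}(z)=e^{z^2}\,\mathrm {Erfc}(-z)\) can be checked numerically with a truncated series. This is only a sketch: the truncation length 120 is an ad hoc choice, and the plain series is suitable only for moderate \(|z|\), where cancellation is mild.

```python
import math

def ml(z, alpha, beta=1.0, terms=120):
    # truncated Mittag-Leffler series, Eq. (5); adequate only for moderate |z|
    return sum(z ** k / math.gamma(alpha * k + beta) for k in range(terms))

# E_{1,1}(z) = e^z, and E_{1/2}(z) = exp(z^2) Erfc(-z), so at z = -1
# the latter equals e * erfc(1)
e_check = ml(1.0, 1.0)
erfc_check = ml(-1.0, 0.5)
```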

Lemma 2.2

Assume that \(0<\alpha <1\). Then there exist constants \(C_{-},C_{+}>0\) depending only on \(\alpha\) such that

$$\begin{aligned} \frac{C_{-}}{1+t}\le E_{\alpha }(-t)\le \frac{C_{+}}{1+t}, \quad \text {for all } t\ge 0. \end{aligned}$$
(6)

We shall also need the following lemma.

Lemma 2.3

Let \(\beta >0\). Then

$$\begin{aligned} \sup _{z> 0} \left\{ \frac{\beta E_{\alpha }(-z)^p}{E_{\alpha }(-z)^2+\beta }\right\} \le {\left\{ \begin{array}{ll} \beta ^{\frac{p}{2}}, &{} p< 2, \\ \beta , &{} p\ge 2. \end{array}\right. } \end{aligned}$$
(7)

Proof

For \(x>0\) and \(p<2\), the function \(\mu (x)=\beta x^{p}/(x^2+\beta )\) attains its maximum value at \(x_0=\sqrt{\frac{p\beta }{2-p}}\). Thus, for \(p<2\), we have

$$\begin{aligned} \sup _{z> 0} \left\{ \frac{\beta E_{\alpha }(-z)^p}{E_{\alpha }(-z)^2+\beta }\right\} \le \mu (x_0)&= \frac{1}{2}(2-p)^{1-\frac{p}{2}}p^{\frac{p}{2}}\beta ^{\frac{p}{2}} \nonumber \\&\le \beta ^{\frac{p}{2}}. \end{aligned}$$
(8)

For \(p\ge 2\), the second part of Lemma 2.1 yields

$$\begin{aligned} \sup _{z> 0} \left\{ \frac{\beta E_{\alpha }(-z)^p}{E_{\alpha }(-z)^2+\beta }\right\} \le \sup _{z> 0}\left\{ E_{\alpha }(-z)^{p-2} \beta \right\} \le \beta , \end{aligned}$$
(9)

which concludes the proof. \(\square\)
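Since \(0<E_{\alpha }(-z)\le 1\) for \(z>0\) by Lemma 2.1, Lemma 2.3 reduces to an elementary bound on \(\mu (x)=\beta x^{p}/(x^{2}+\beta )\) over \((0,1)\), which is easy to probe numerically. The grid and parameter samples below are arbitrary choices for this sanity check:

```python
# numerical probe of Lemma 2.3: mu(x) = beta * x^p / (x^2 + beta) on (0, 1)
xs = [i / 1000.0 for i in range(1, 1000)]
worst_gap = 0.0
for beta in (1e-1, 1e-3, 1e-6):
    for p in (0.5, 1.0, 1.5, 2.0, 3.0):
        sup = max(beta * x ** p / (x * x + beta) for x in xs)
        bound = beta ** (p / 2.0) if p < 2.0 else beta
        # the lemma asserts sup <= bound, so the gap should never be positive
        worst_gap = max(worst_gap, sup - bound)
```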

Consider the following Sturm–Liouville eigenvalue problem:

$$\begin{aligned} -\Delta X(\textbf{x})=\lambda X(\textbf{x}),\,\textbf{x}\in \Omega ,\,\, X(\textbf{x})=0,\,\textbf{x}\in \partial \Omega . \end{aligned}$$
(10)

From McOwen (1996), the eigenvalues can be enumerated to form a nondecreasing sequence of positive real numbers \(\{\lambda _n\}\) with \(\lambda _n\rightarrow \infty\), and the corresponding eigenfunctions \(\{X_n\}\) form an orthonormal basis for \(L^{2}(\Omega )\). We define the Hilbert space \(\textbf{H}^{p}(\Omega )\) by

$$\begin{aligned} \textbf{H}^{p}(\Omega )=\left\{ v\in L^{2}(\Omega )\; :\; \left\| v\right\| _p < \infty \right\} , \end{aligned}$$
(11)

where

$$\begin{aligned} \left\| v\right\| _p = \left( \sum _{n=1}^{\infty }\left|\left( v,X_n\right) \right|^2\lambda _{n}^{2p}\right) ^{\frac{1}{2}}. \end{aligned}$$
(12)

Using the separation of variables method, the formal solution to (1) can be expressed as

$$\begin{aligned} u(\textbf{x},t)=\sum _{n=1}^{\infty } T_n (t) X_n(\textbf{x}), \end{aligned}$$
(13)

where \(T_n(t)\) solves the fractional order initial-value problem

$$\begin{aligned} D^\alpha T_n(t)+\lambda _n T_n(t)= f_n(t),\quad T_n(0)=g_n, \end{aligned}$$
(14)

with \(f_n(t)=(f(\cdot ,t),X_n)\) and \(g_n=(g,X_n)\). From Diethelm (2010), the solution to the above problem is

$$\begin{aligned} T_n(t)=g_n E_{\alpha }(-\lambda _n t^\alpha )+F_n(t), \end{aligned}$$
(15)

with

$$\begin{aligned} F_{n}(t)=\int _{0}^{t}f_n(t-\tau )\tau ^{\alpha -1} E_{\alpha ,\alpha }(-\lambda _n \tau ^\alpha ) d\tau , \end{aligned}$$
(16)

and therefore, the formal solution to (1) is given by

$$\begin{aligned} u(\textbf{x},t)=\sum _{n=1}^{\infty }\left\{ g_n E_{\alpha }(-\lambda _n t^\alpha )+F_n(t)\right\} X_{n}(\textbf{x}). \end{aligned}$$
(17)

From the final data \(u(\textbf{x},T)=q(\textbf{x})\), we observe that

$$\begin{aligned} q_n=g_n E_{\alpha }(-\lambda _n T^\alpha )+F_n(T), \end{aligned}$$
(18)

where \(q_n=\left( q,X_n\right)\). Thus, the initial condition can be expressed as

$$\begin{aligned} g(\textbf{x})=\sum _{n=1}^{\infty }\frac{\overline{q}_n}{E_{\alpha }(-\lambda _n T^\alpha )} X_{n}(\textbf{x}), \end{aligned}$$
(19)

where \(\overline{q}_n=q_n-F_{n}(T)\).

Define the linear operator \(K:L^{2}(\Omega )\rightarrow L^{2}(\Omega )\) by

$$\begin{aligned} K \varphi = \sum _{n=1}^{\infty } E_{\alpha }(-\lambda _n T^\alpha ) \left( \varphi ,X_n\right) X_n. \end{aligned}$$
(20)

Then, in view of (18), the backward problem can be phrased more concisely as

$$\begin{aligned} K g = \overline{q},\quad \overline{q}:=q-\sum _{n=1}^{\infty }F_n(T) X_{n}. \end{aligned}$$
(21)

We demonstrate the instability of the backward problem by the following example.

Example 1

Assume that \(Kg=\overline{q}\), and consider the sequence of observations of \(\overline{q}\) given by

$$\begin{aligned} \overline{q}^{n}=\overline{q}+E_{\alpha }(-\lambda _n T^\alpha )^{\frac{1}{2}}X_n. \end{aligned}$$
(22)

If we set

$$\begin{aligned} g^n=g+E_{\alpha }(-\lambda _n T^\alpha )^{\frac{-1}{2}}X_n, \end{aligned}$$
(23)

then \(Kg^n=\overline{q}^{n}\) with

$$\begin{aligned} \Vert \overline{q}^{n}-\overline{q}\Vert =E_{\alpha }(-\lambda _n T^\alpha )^{\frac{1}{2}}\rightarrow 0, \end{aligned}$$
(24)

while

$$\begin{aligned} \Vert g^{n}-g\Vert =E_{\alpha }(-\lambda _n T^\alpha )^{\frac{-1}{2}}\rightarrow \infty , \end{aligned}$$
(25)

as \(n\rightarrow \infty\).
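The blow-up in Example 1 can be observed numerically. In this sketch the setting, \(\Omega =(0,1)\) with \(\lambda _n=(n\pi )^2\), \(\alpha =0.9\), and \(T=0.1\), is an illustrative choice; the truncated series is reliable only for the moderate arguments used here. Already over the first four modes the amplification factor \(E_{\alpha }(-\lambda _n T^\alpha )^{-1/2}\) from (25) grows quickly:

```python
import math

def ml(z, alpha, terms=150):
    # truncated series (5); usable here since the arguments stay moderate
    return sum(z ** k / math.gamma(alpha * k + 1.0) for k in range(terms))

alpha, T = 0.9, 0.1
amp = []
for n in (1, 2, 3, 4):
    lam = (n * math.pi) ** 2        # Dirichlet eigenvalues on (0, 1)
    E = ml(-lam * T ** alpha, alpha)
    amp.append(E ** -0.5)           # amplification factor in Eq. (25)
```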

The previous example highlights the instability of the backward problem: even small errors in the final data q can lead to large errors in the computed initial condition g. To overcome this instability, we use regularization, which involves solving a sequence of nearby problems parameterized by a regularization parameter. One widely used method is Tikhonov regularization, which we discuss in the next section. Specifically, we present its application to our backward problem \(K g = \overline{q}\), as given by equations (20)–(21).

3 Tikhonov Regularization of the Backward Problem

As indicated by Example 1, the backward problem lacks continuous dependence on data \(\overline{q}^{\delta }\), and thus regularization is required. In this paper, we consider Tikhonov regularization (Engl et al. 2000).

Following the Tikhonov regularization method, the regularized solution, denoted by \(g^{\beta ,\delta }\), is defined to be the solution of the optimization problem

$$\begin{aligned} \min _{\varphi \in L^{2}(\Omega )} \Vert K \varphi -\overline{q}^\delta \Vert ^2 + \beta \Vert \varphi \Vert ^2, \end{aligned}$$
(26)

where \(\beta >0\) is the regularization parameter, and \(\overline{q}^{\delta }=q^{\delta }-\sum _{n=1}^{\infty }F_n(T) X_{n}\). The first term in (26) is the data-fitting term, while the second is the regularization term. The role of the parameter \(\beta\) is to control the trade-off between fitting the data and satisfying the regularization constraint (i.e., the smoothness of the solution).

To obtain an explicit formula for the solution to (26), we observe that the triplet

$$\begin{aligned} \left( E_{\alpha }(-\lambda _n T^\alpha ); X_n, X_n\right) \end{aligned}$$
(27)

forms a singular system for K; consequently, from Engl et al. (2000), the minimizer is given explicitly by

$$\begin{aligned} g^{\beta ,\delta }= \sum _{n=1}^{\infty } \frac{E_{\alpha }(-\lambda _n T^\alpha )}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } \overline{q}^{\delta }_{n} X_n, \end{aligned}$$
(28)

where \(\overline{q}^{\delta }_{n}=(\overline{q}^\delta ,X_n)\).

Next, we analyze the convergence behavior of the proposed method using both a priori and a posteriori parameter choice rules of the regularization parameter \(\beta\). We shall use the notation \(g^{\beta ,0}\) to denote the solution of (26) corresponding to the noise-free data, that is,

$$\begin{aligned} g^{\beta ,0}= \sum _{n=1}^{\infty } \frac{E_{\alpha }(-\lambda _n T^\alpha )}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } \overline{q}_{n} X_n, \end{aligned}$$
(29)

where \(\overline{q}_{n}=(\overline{q},X_n)\).

3.1 A Priori Analysis

In this subsection, we investigate the convergence of the regularization method under an a priori choice rule, in which the regularization parameter \(\beta\) depends only on the noise level \(\delta\). We begin with the following stability result.

Lemma 3.1

It holds that

$$\begin{aligned} \Vert g^{\beta ,\delta }-g^{\beta ,0}\Vert \le \frac{\delta }{2\sqrt{\beta }}. \end{aligned}$$
(30)

Proof

From Lemma 2.3 and its proof, we have

$$\begin{aligned} \Vert g^{\beta ,\delta }-g^{\beta ,0}\Vert ^{2}&=\left\| \sum _{n=1}^{\infty } \frac{E_{\alpha }(-\lambda _n T^\alpha )(\overline{q}^{\delta }_{n}-\overline{q}_n)X_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } \right\| ^2 \nonumber \\&= \sum _{n=1}^{\infty } \left( \frac{E_{\alpha }(-\lambda _n T^\alpha )( \overline{q}^{\delta }_{n}-\overline{q}_n)}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right) ^2 \nonumber \\&\le \frac{1}{4\beta } \sum _{n=1}^{\infty } |\overline{q}^{\delta }_{n}-\overline{q}_n|^2= \frac{1}{4\beta } \Vert q^\delta -q\Vert ^2 \nonumber \\ {}&\le \frac{\delta ^2}{4\beta }. \end{aligned}$$
(31)

This ends the proof. \(\square\)
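The only inequality used in the proof of Lemma 3.1 is the elementary AM–GM bound \(x/(x^{2}+\beta )\le 1/(2\sqrt{\beta })\) for \(x>0\), with equality at \(x=\sqrt{\beta }\). A quick numerical probe (the grid and \(\beta\) samples are arbitrary):

```python
import math

worst = 0.0
for beta in (1e-2, 1e-4, 1e-8):
    bound = 1.0 / (2.0 * math.sqrt(beta))
    # log-spaced grid over many orders of magnitude
    xs = [10.0 ** (-8.0 + 16.0 * i / 2000.0) for i in range(2001)]
    sup = max(x / (x * x + beta) for x in xs)
    worst = max(worst, sup / bound)   # should never exceed 1
```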

Next, we consider the consistency result:

Lemma 3.2

Assume that \(g\in \textbf{H}^{p}(\Omega )\) for some \(p>0\). Then, the following bound holds

$$\begin{aligned} \Vert g^{\beta ,0}-g\Vert \le M_1 {\left\{ \begin{array}{ll} \beta ^{\frac{p}{2}}, &{} p< 2, \\ \beta , &{} p\ge 2, \end{array}\right. } \end{aligned}$$
(32)

for some constant \(M_1\) independent of \(\beta\).

Proof

From the expansions (19) and (29) it follows that

$$\begin{aligned} g-g^{\beta ,0}&=\sum _{n=1}^{\infty }\frac{\beta g_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } X_n \nonumber \\&= \sum _{n=1}^{\infty }\left( \frac{\beta E_{\alpha }(-\lambda _n T^\alpha )^p}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right) \frac{g_n X_n}{E_{\alpha }(-\lambda _n T^\alpha )^p}. \end{aligned}$$
(33)

Thus, in view of the equation (33) and Lemma 2.3, we have

$$\begin{aligned} \Vert g-g^{\beta ,0}\Vert ^{2}\le \left( \sum _{n=1}^{\infty }\frac{|g_n|^2}{E_{\alpha }(-\lambda _n T^\alpha )^{2p}} \right) {\left\{ \begin{array}{ll} \beta ^{p}, &{} p< 2, \\ \beta ^2, &{} p\ge 2. \end{array}\right. } \end{aligned}$$
(34)

From Lemma 2.2, we conclude that

$$\begin{aligned} \sum _{n=1}^{\infty } \frac{|g_n|^2}{E_{\alpha }(-\lambda _n T^\alpha )^{2p}}&\le \sum _{n=1}^{\infty } \left( \frac{1+T^\alpha \lambda _n}{C_{-}}\right) ^{2p} |g_n|^2 \nonumber \\&=\sum _{n=1}^{\infty } \left( \frac{1+T^\alpha \lambda _n}{C_{-}\lambda _n}\right) ^{2p} \lambda ^{2p}_{n}|g_n|^2 \nonumber \\&\le C^{2}_{1} \sum _{n=1}^{\infty } \lambda ^{2p}_{n}|g_n|^2 =C^{2}_{1} \Vert g\Vert ^{2}_{p}, \end{aligned}$$
(35)

where the constant \(C_1\) is given by

$$\begin{aligned} C_{1} = \sup _{n\ge 1} \left( \frac{1+T^\alpha \lambda _n}{C_{-}\lambda _n}\right) ^{p}= \left( \frac{1+T^\alpha \lambda _1}{C_{-}\lambda _1}\right) ^{p}. \end{aligned}$$
(36)

The result now follows from the inequalities (34) and (35) with \(M_1=C_1 \Vert g\Vert _p\). \(\square\)

Using the triangle inequality together with the last two lemmas, we conclude the convergence result:

Theorem 1

Assume that \(g\in \textbf{H}^{p}(\Omega )\) for some \(p>0\). Then, the following error bound holds

$$\begin{aligned} \Vert g^{\beta ,\delta }-g\Vert \le \frac{\delta }{2\sqrt{\beta }} +M_1 {\left\{ \begin{array}{ll} \beta ^{\frac{p}{2}}, &{} p< 2, \\ \beta , &{} p\ge 2, \end{array}\right. } \end{aligned}$$
(37)

for some constant \(M_1\) independent of \(\beta\) and \(\delta\).

Regarding the convergence rate under an a priori parameter choice rule, we make the following remark.

Remark 1

Under the hypotheses of Theorem 1, if we choose \(\beta =C_0\delta ^\gamma\) for some \(\gamma \in (0,2)\) and constant \(C_0>0\), then

$$\begin{aligned} \Vert g^{\beta ,\delta }-g \Vert \rightarrow 0, \end{aligned}$$
(38)

as \(\delta \rightarrow 0\). For a given value of \(p>0\), the convergence rate is optimal when

$$\begin{aligned} \gamma = {\left\{ \begin{array}{ll} \frac{2}{p+1}, &{} p< 2, \\ \frac{2}{3}, &{} p\ge 2, \end{array}\right. } \end{aligned}$$
(39)

in which case we have

$$\begin{aligned} \Vert g^{\beta ,\delta }-g \Vert = {\left\{ \begin{array}{ll} O(\delta ^{\frac{p}{p+1}}), &{} p< 2, \\ O(\delta ^{\frac{2}{3}}), &{} p\ge 2. \end{array}\right. } \end{aligned}$$
(40)

Thus, we obtain the fastest convergence when \(p\ge 2\). In this case, we have

$$\begin{aligned} \Vert g^{\beta ,\delta }-g \Vert =O(\delta ^{\frac{2}{3}}), \end{aligned}$$
(41)

provided \(\beta =C_0\delta ^{\frac{2}{3}}\).
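The trade-off behind Remark 1 is easy to visualize in the case \(p\ge 2\): minimizing the bound \(h(\beta )=\delta /(2\sqrt{\beta })+M_1\beta\) of Theorem 1 over \(\beta\) gives \(\beta ^{*}=(\delta /(4M_1))^{2/3}\) and \(h(\beta ^{*})=O(\delta ^{2/3})\). A numerical sketch, with the arbitrary normalization \(M_1=1\) and an ad hoc search grid:

```python
import math

M1 = 1.0

def h(beta, delta):
    # error bound of Theorem 1 in the case p >= 2
    return delta / (2.0 * math.sqrt(beta)) + M1 * beta

results = []
for delta in (1e-2, 1e-4, 1e-6):
    # log-spaced grid of candidate regularization parameters
    betas = [10.0 ** (-12.0 + 12.0 * i / 4000.0) for i in range(4001)]
    best = min(betas, key=lambda b: h(b, delta))
    predicted = (delta / (4.0 * M1)) ** (2.0 / 3.0)   # stationary point of h
    results.append((best, predicted))
```

The grid minimizer tracks the predicted \(\delta ^{2/3}\) scaling across noise levels.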

3.2 Convergence Analysis Under a Posteriori Rules

The optimal rate of convergence mentioned in Remark 1 depends on knowing the value of p. In practice, however, we may not know the exact value of p; and even if we do, any positive \(C_0\) yields the optimal asymptotic rate of convergence given by (40), yet the choice of \(C_0\) can have a significant impact for a given value of \(\delta >0\). Hence, it may be reasonable (and necessary) to take the actual data \(q^{\delta }\) into account when choosing the regularization parameter \(\beta\). A parameter choice method that incorporates both \(\delta\) and \(q^{\delta }\) is known as an a posteriori parameter choice rule. In this subsection, we describe one such rule, namely Morozov's Discrepancy Principle (MDP) (Engl et al. 2000).

According to this principle, the regularization parameter \(\beta\) is to be chosen so that

$$\begin{aligned} \Vert K g^{\beta ,\delta }-\overline{q}^{\delta }\Vert =\tau \delta , \end{aligned}$$
(42)

for some given constant \(\tau >1\). The main result is stated in the following theorem.

Theorem 2

Assume that \(g\in \textbf{H}^{p}(\Omega )\) for some \(p>0\), and that \(\beta\) is chosen according to (42). Then

$$\begin{aligned} \Vert g^{\beta ,\delta }-g\Vert \le M_2{\left\{ \begin{array}{ll} \delta ^{\frac{p}{p+1}}, &{} p< 1, \\ \delta ^{\frac{p}{p+1}}+\delta ^{\frac{1}{2}}, &{} p\ge 1, \end{array}\right. } \end{aligned}$$
(43)

for some constant \(M_2\) independent of \(\delta\).

Proof

From equation (33) and Hölder's inequality, we get

$$\begin{aligned}&\Vert g^{\beta ,0}-g\Vert ^{2}=\sum _{n=1}^{\infty } \left( \frac{\beta g_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right) ^{2} \nonumber \\ =&\sum _{n=1}^{\infty } \left( \frac{\beta E_{\alpha }(-\lambda _n T^\alpha ) g_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right) ^{\frac{2p}{p+1}}\nonumber \\&{}\times \left( \frac{\beta E_{\alpha }(-\lambda _n T^\alpha )^{-p} g_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right) ^{\frac{2}{p+1}} \nonumber \\ =&\sum _{n=1}^{\infty } \left( \frac{\beta \overline{q}_{n}}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right) ^{\frac{2p}{p+1}}\nonumber \\&{}\times \left( \frac{\beta E_{\alpha }(-\lambda _n T^\alpha )^{-p} g_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right) ^{\frac{2}{p+1}} \nonumber \\ \le&\left\{ \sum _{n=1}^{\infty } \left( \frac{\beta \overline{q}_{n}}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right) ^{2}\right\} ^{\frac{p}{p+1}}\nonumber \\&{}\times \left\{ \sum _{n=1}^{\infty } \left( \frac{\beta E_{\alpha }(-\lambda _n T^\alpha )^{-p} g_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right) ^{2}\right\} ^{\frac{1}{p+1}} \nonumber \\ =&\left\| \sum _{n=1}^{\infty } \frac{\beta \overline{q}_{n}X_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right\| ^{\frac{2p}{p+1}}\nonumber \\&{}\times \left\| \sum _{n=1}^{\infty } \frac{\beta E_{\alpha }(-\lambda _n T^\alpha )^{-p} g_n X_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right\| ^{\frac{2}{p+1}}. \end{aligned}$$
(44)

By the triangle inequality and equation (42), we get

$$\begin{aligned} \left\| \sum _{n=1}^{\infty } \frac{\beta \overline{q}_{n}X_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } \right\|&\le \left\| \sum _{n=1}^{\infty } \frac{\beta (\overline{q}_{n}-\overline{q}^{\delta }_{n})X_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right\| \nonumber \\&{}+\left\| \sum _{n=1}^{\infty } \frac{\beta \overline{q}^{\delta }_{n}X_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right\| \nonumber \\&\le \Vert \overline{q}-\overline{q}^{\delta }\Vert +\Vert Kg^{\beta ,\delta }-\overline{q}^{\delta }\Vert \nonumber \\&\le (1+\tau ) \delta , \end{aligned}$$
(45)

and by the bound in (35), we have

$$\begin{aligned}&\left\| \sum _{n=1}^{\infty } \frac{\beta E_{\alpha }(-\lambda _n T^\alpha )^{-p} g_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } X_n\right\| \nonumber \\&\le \left\{ \sum _{n=1}^{\infty } \left( \frac{g_n}{E_{\alpha }(-\lambda _n T^\alpha )^p}\right) ^2\right\} ^\frac{1}{2}\le C_1 \Vert g\Vert _{p}, \end{aligned}$$
(46)

from which it follows that

$$\begin{aligned} \Vert g^{\beta ,0}-g\Vert \le \left\{ \left( C_1 \Vert g\Vert _p (1+\tau )^p\right) ^{\frac{1}{p+1}}\right\} \delta ^{\frac{p}{p+1}}. \end{aligned}$$
(47)

By virtue of the definition of K given by equation (20), together with expansion (28), equation (42), and the triangle inequality, we get

$$\begin{aligned} \tau \delta = \Vert Kg^{\beta ,\delta }-\overline{q}^{\delta }\Vert&= \left\| \sum _{n=1}^{\infty } \frac{\beta \overline{q}^{\delta }_{n}}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } X_n\right\| \nonumber \\&\le \left\| \sum _{n=1}^{\infty } \frac{\beta (\overline{q}^{\delta }_{n}-\overline{q}_{n})}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } X_n\right\| \nonumber \\&{}+\left\| \sum _{n=1}^{\infty } \frac{\beta \overline{q}_{n}}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } X_n\right\| \nonumber \\&= \left\| \sum _{n=1}^{\infty } \frac{\beta (\overline{q}^{\delta }_{n}-\overline{q}_{n})}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } X_n\right\| \nonumber \\&{}+\left\| \sum _{n=1}^{\infty } \frac{\beta E_{\alpha }(-\lambda _n T^\alpha )g_{n}}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } X_n\right\| . \end{aligned}$$
(48)

Then, the first term on the right-hand side of (48) can be bounded as

$$\begin{aligned} \left\| \sum _{n=1}^{\infty } \frac{\beta (\overline{q}^{\delta }_{n}-\overline{q}_{n})X_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta } \right\| ^2&\le \sum _{n=1}^{\infty } (\overline{q}^{\delta }_{n}-\overline{q}_{n})^2\nonumber \\&\le \Vert \overline{q}^{\delta }-\overline{q}\Vert ^2\le \delta ^2, \end{aligned}$$
(49)

and using Lemma 2.3 and inequality (35), the second term on the right-hand side of (48) can be bounded as

$$\begin{aligned}&\left\| \sum _{n=1}^{\infty } \frac{\beta E_{\alpha }(-\lambda _n T^\alpha )g_{n}X_n}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right\| ^2= \nonumber \\&{}\sum _{n=1}^{\infty } \left( \frac{\beta E_{\alpha }(-\lambda _n T^\alpha )^{p+1}}{E_{\alpha }(-\lambda _n T^\alpha )^2+\beta }\right) ^{2} \frac{g^{2}_{n}}{E_{\alpha }(-\lambda _n T^\alpha )^{2p}} \nonumber \\&\le C^{2}_{1} \Vert g\Vert ^{2}_{p} {\left\{ \begin{array}{ll} \beta ^{p+1}, &{} p< 1, \\ \beta ^2, &{} p\ge 1, \end{array}\right. } \end{aligned}$$
(50)

where the constant \(C_1\) is given by (36). By combining the inequalities (48), (49), and (50), it follows that

$$\begin{aligned} \tau \delta \le \delta + C_{1} \Vert g\Vert _{p} {\left\{ \begin{array}{ll} \beta ^{\frac{p+1}{2}}, &{} p< 1, \\ \beta , &{} p\ge 1. \end{array}\right. } \end{aligned}$$
(51)

Given the assumption that \(\tau >1\), a straightforward manipulation of inequality (51) yields that

$$\begin{aligned} \frac{1}{\sqrt{\beta }} \le {\left\{ \begin{array}{ll} \left( \frac{C_1 \Vert g\Vert _p}{\tau -1}\right) ^{\frac{1}{p+1}}\delta ^{\frac{-1}{p+1}}, &{} p< 1, \\ \left( \frac{C_1 \Vert g\Vert _p}{\tau -1}\right) ^{\frac{1}{2}}\delta ^{\frac{-1}{2}}, &{} p\ge 1. \end{array}\right. } \end{aligned}$$
(52)

Consequently, by Lemma 3.1, inequality (30), and the bound in (52), we get

$$\begin{aligned} \Vert g^{\beta ,\delta }-g^{\beta ,0}\Vert&\le \frac{\delta }{2\sqrt{\beta }}\nonumber \\ {}&\le \frac{1}{2}{\left\{ \begin{array}{ll} \left( \frac{C_1 \Vert g\Vert _p}{\tau -1}\right) ^{\frac{1}{p+1}}\delta ^{\frac{p}{p+1}}, &{} p< 1, \\ \left( \frac{C_1 \Vert g\Vert _p}{\tau -1}\right) ^{\frac{1}{2}}\delta ^{\frac{1}{2}}, &{} p\ge 1. \end{array}\right. } \end{aligned}$$
(53)

Finally, from the bounds (47)-(53) and the triangle inequality, we have

$$\begin{aligned} \Vert g^{\beta ,\delta }-g\Vert&\le \Vert g^{\beta ,0}-g\Vert +\Vert g^{\beta ,\delta }-g^{\beta ,0}\Vert \nonumber \\&\le \left( C_1 \Vert g\Vert _p (1+\tau )^p\right) ^{\frac{1}{p+1}} \delta ^{\frac{p}{p+1}} \nonumber \\&{}+ \frac{1}{2}{\left\{ \begin{array}{ll} \left( \frac{C_1 \Vert g\Vert _p}{\tau -1}\right) ^{\frac{1}{p+1}}\delta ^{\frac{p}{p+1}}, &{} p< 1, \\ \left( \frac{C_1 \Vert g\Vert _p}{\tau -1}\right) ^{\frac{1}{2}}\delta ^{\frac{1}{2}}, &{} p\ge 1, \end{array}\right. } \end{aligned}$$
(54)

which concludes the proof of the theorem. \(\square\)

We conclude with the following remark which discusses the convergence rate of the proposed method under the a posteriori parameter choice rule given by the MDP.

Remark 2

In view of Theorem 2, we see that under Morozov's discrepancy principle (42), the proposed method is of order \(O(\delta ^\frac{p}{p+1})\) if \(p<1\), with the optimal convergence rate \(O(\delta ^\frac{1}{2})\) attained when \(p\ge 1\).
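In practice, equation (42) can be solved for \(\beta\) by a simple monotone search, since the discrepancy \(\Vert Kg^{\beta ,\delta }-\overline{q}^{\delta }\Vert\) is increasing in \(\beta\). A modal-coefficient sketch; the singular values, data coefficients, noise level, and bracketing interval below are hypothetical placeholders for a quick check:

```python
import math

def discrepancy(beta, E, qn):
    # || K g^{beta,delta} - qbar^delta || in the singular basis; cf. (42) and (48)
    return math.sqrt(sum((beta * q / (e * e + beta)) ** 2 for e, q in zip(E, qn)))

def mdp_beta(E, qn, delta, tau=1.1, iters=200):
    # bisection in log(beta): the discrepancy increases with beta, so (42)
    # has a unique root whenever tau * delta < || qbar^delta ||
    lo, hi = 1e-16, 1e16
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if discrepancy(mid, E, qn) < tau * delta:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# hypothetical singular values and data coefficients
E = [0.5, 0.1, 0.02]
qn = [1.0, 0.4, 0.05]
delta = 0.01
beta_star = mdp_beta(E, qn, delta)
```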

4 Numerical Illustrations

Next, we show how to implement the proposed scheme for a practical problem. We treat the case where the domain \(\Omega\) is a cube in \(\mathbb {R}^{d}\).

Since it is often the case that the final data is just a noisy discrete reading of the exact final data \(q(\textbf{x})=u(\textbf{x},T)\), we will assume that the noisy data \(q^\delta\) is generated using the formula

$$\begin{aligned} q^\delta (\textbf{x}_i)=q(\textbf{x}_i)+\eta _i,\quad i=1,2,\ldots ,m, \end{aligned}$$
(55)

where \(\textbf{x}_i\) are regular grid points of \(\Omega\), and \(\eta _i\) are uniform random real numbers in \([-1,1]\). In practical applications, since we typically work with discrete data, it is often preferable to use the root-mean-square (RMS) norm for vectors. The RMS norm of a vector \(\textbf{v}\in \mathbb {R}^m\) is defined as:

$$\begin{aligned} \Vert \textbf{v}\Vert _{\text {RMS}}=\left( \frac{1}{m}\sum _{i=1}^{m} v^{2}_{i}\right) ^{\frac{1}{2}}. \end{aligned}$$
(56)

In this paper, we use the symbol \(\delta\) to denote the noise level in the data measured in the root-mean-square norm, that is,

$$\begin{aligned} \delta =\left( \frac{1}{m}\sum _{i=1}^{m}\left( q^\delta (\textbf{x}_i)-q(\textbf{x}_i)\right) ^2\right) ^{\frac{1}{2}}. \end{aligned}$$
(57)

Similarly, we assess the quality of the recovered initial condition via the root-mean-square norm:

$$\begin{aligned} \text {rms}:=\left( \frac{1}{m}\sum _{i=1}^{m}\left( g^{\beta ,\delta }(\textbf{x}_i)-g(\textbf{x}_i)\right) ^2\right) ^{\frac{1}{2}}, \end{aligned}$$
(58)

which is the discrete version of the \(L^{2}\)-error.

In the computations below, the sampling mesh size is fixed at \(m=500^d\). We use the fast Fourier transform (FFT) to compute \(q^{\delta }_{n}\) and \(f_n\), and the midpoint quadrature rule to estimate \(F_n(T)\). The regularization parameter \(\beta\) is chosen using the discrepancy principle with \(\tau =1.1\). In the experiments below, we use built-in Mathematica routines to compute the Mittag-Leffler function as well as the FFT. The algorithm of the proposed method proceeds along the following lines:

  • fn = 1/Sqrt[m] FourierDST[f, 1];

    (* the Fourier coefficients \(f_n\) of the source vector f *)

  • qdn = 1/Sqrt[m] FourierDST[qd, 1];

    (* the Fourier coefficients of the noisy data vector qd \(=q^{\delta }\) *)

  • M = MittagLefflerE[alpha, -lambda T^alpha];

    (* the Mittag-Leffler function at the first m eigenvalues *)

  • Fn = (1/lambda)(1 - M) fn;

    (* all the values of \(F_n(T)\) at once *)

  • gdn = M/(M^2 + beta)(qdn - Fn);

    (* the Fourier coefficients \(g^{\delta }_n\) of the approximate solution *)

  • gbd = FourierDST[Sqrt[m] gdn, 1];

    (* the approximate solution \(g^{\beta ,\delta }\) *)
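For readers without Mathematica, the same pipeline can be sketched in Python with NumPy/SciPy for a 1-D problem on \((0,1)\). We restrict to \(\alpha =1/2\), where the Mittag-Leffler function has the closed form \(E_{1/2}(-x)=e^{x^{2}}\text {erfc}(x)\) (evaluated stably by scipy.special.erfcx), and to a time-independent source, for which \(F_n(T)=(1-E_{\alpha }(-\lambda _n T^{\alpha }))f_n/\lambda _n\) holds exactly. All function names are ours; the orthonormal DST-I in SciPy plays the role of FourierDST[·, 1]:

```python
import numpy as np
from scipy.fft import dst, idst
from scipy.special import erfcx

def reconstruct(q_delta, f, T, beta):
    """Tikhonov reconstruction of the initial condition from noisy final data.

    q_delta, f : values on the uniform interior grid x_i = i/(m+1) of (0, 1).
    Assumes alpha = 1/2 and a time-independent source f (this sketch only)."""
    m = q_delta.size
    lam = (np.pi * np.arange(1, m + 1)) ** 2   # eigenvalues lambda_n = pi^2 n^2
    fn = dst(f, type=1, norm='ortho')          # sine coefficients of f
    qn = dst(q_delta, type=1, norm='ortho')    # sine coefficients of q_delta
    M = erfcx(lam * np.sqrt(T))                # E_{1/2}(-lambda_n T^{1/2}), stably
    Fn = (1.0 - M) / lam * fn                  # F_n(T) for a time-independent source
    gn = M / (M ** 2 + beta) * (qn - Fn)       # Tikhonov-filtered coefficients
    return idst(gn, type=1, norm='ortho')      # back to physical space
```

On synthetic data generated in the discrete sine basis, a tiny \(\beta\) recovers the hat initial condition of Example 2 to high accuracy; for noisy data, \(\beta\) would instead be chosen by the discrepancy principle.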

Finally, we note that the exact solutions of the equations presented below can be verified directly by applying Lemma 2.1.
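The discrepancy-principle selection of \(\beta\) with \(\tau =1.1\), used in the experiments above, reduces to a one-dimensional search: in the orthonormal sine basis the residual \(\Vert \beta /(M_n^2+\beta )\,(q^{\delta }_n-F_n)\Vert _{\text {RMS}}\) is monotone increasing in \(\beta\), so a bisection on a logarithmic scale suffices. A sketch (function and variable names are ours):

```python
import numpy as np

def choose_beta(M, rhs_n, delta, tau=1.1, lo=1e-16, hi=1.0, iters=60):
    """Discrepancy principle: find beta with residual(beta) ~ tau * delta.

    M     : Mittag-Leffler factors E_alpha(-lambda_n T^alpha)
    rhs_n : sine coefficients of q_delta - F_n
    The residual ||beta/(M^2+beta) * rhs_n||_RMS increases monotonically
    with beta, so a log-scale bisection converges."""
    def residual(beta):
        return np.sqrt(np.mean((beta / (M ** 2 + beta) * rhs_n) ** 2))
    for _ in range(iters):
        mid = np.sqrt(lo * hi)          # geometric midpoint
        if residual(mid) < tau * delta:
            lo = mid                    # residual too small: increase beta
        else:
            hi = mid                    # residual too large: decrease beta
    return np.sqrt(lo * hi)
```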

Example 2

We consider the 1-D fractional order diffusion equation

$$\begin{aligned} \partial ^{9/10}_{t}u - \partial _{xx}u =f,\,\,0<x<1,\,0<t<0.1, \end{aligned}$$
(59)

supplied with the initial condition

$$\begin{aligned} u(x,0)=g(x)= \left\{ \begin{array}{ll} x, &{} \text {if}\; 0\le x \le 1/2,\\ 1-x, &{} \text {if}\; 1/2< x \le 1.\end{array} \right. \end{aligned}$$
(60)

The source term f is chosen so that the solution is

$$\begin{aligned} u(x,t)=\left( E_{9/10}(-\pi ^2 t^{9/10})-1\right) \sin (\pi x)+g(x). \end{aligned}$$
(61)

Error results for several noise levels \(\delta\) along with estimated order of convergence are summarized in Table 1. The exact final data q and the noisy final data \(q^{\delta }\) when \(\delta =1\%\) are depicted in Fig. 1, while the corresponding exact and recovered initial conditions are shown in Fig. 2.

Figure 2a shows that the naive reconstruction, without regularization, bears no resemblance to the exact solution. However, as illustrated in Fig. 2b, the regularized solution is remarkably close to the exact initial condition. This highlights the importance of regularization and underscores the efficacy of the proposed approach.

For this particular example, the eigenpairs are given by \(\lambda _n=\pi ^2 n^2\) and \(X_n=\sqrt{2}\sin (n \pi x)\). Thus, we obtain that

$$\begin{aligned} \Vert g\Vert ^{2}_{p}=\sum _{n=1}^{\infty }\left( \frac{8 \sin ^{2}(n \pi /2)}{\pi ^{4-4p}}\right) \frac{1}{n^{4-4p}}. \end{aligned}$$
(62)

This shows that \(g\in \textbf{H}^{p}(\Omega )\) for \(p<3/4\). Consequently, according to Remark 2, the theoretical rate of convergence should be approximately \(O(\delta ^{3/7})\). The numerical results reported in Table 1 indicate that the regularized solution converges to the exact solution at a rate very close to the anticipated theoretical order.
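The closed form in Eq. (62) rests on the sine coefficients of the hat function: against the normalized eigenfunctions \(X_n=\sqrt{2}\sin (n\pi x)\), they equal \(\sqrt{8}\sin (n\pi /2)/(\pi ^2 n^2)\). This can be spot-checked numerically with a midpoint rule (a sketch; names are ours):

```python
import numpy as np

def sine_coeff(n, N=20000):
    """Midpoint-rule approximation of the coefficient of g against X_n."""
    x = (np.arange(N) + 0.5) / N              # midpoints of N uniform cells
    g = np.where(x <= 0.5, x, 1.0 - x)        # hat initial condition, Eq. (60)
    Xn = np.sqrt(2.0) * np.sin(n * np.pi * x)
    return np.sum(g * Xn) / N

def exact_coeff(n):
    """Closed form sqrt(8) sin(n*pi/2) / (pi^2 n^2) underlying Eq. (62)."""
    return np.sqrt(8.0) * np.sin(n * np.pi / 2) / (np.pi ** 2 * n ** 2)
```

Squaring these coefficients and weighting by \(\lambda _n^{2p}=(\pi ^2 n^2)^{2p}\) reproduces the summand of Eq. (62) term by term.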

Table 1 Root-mean-square errors in the recovered initial condition for Example 2 for several values of noise level \(\delta\) along with the estimated order of convergence
Fig. 1
figure 1

Noisy final data \(q^{\delta }\) for Example 2 corresponding to noise level \(\delta =10^{-2}\)

Fig. 2
figure 2

Recovered initial condition for Example 2 corresponding to noise level \(\delta =10^{-2}\)

Example 3

In this 2-D example, we consider the time-fractional diffusion problem:

$$\begin{aligned} \begin{aligned} \partial ^{1/2}_{t}u-\Delta u&=2x+2y-2x^2-2y^2,\,\,(x,y)\in \Omega ,\\ u(x,y,0)&=\sin (2\pi x)\sin (\pi y)+yx(1-x)(1-y),\\ u(x,y,t)&=0,\,\,(x,y)\in \partial \Omega , \end{aligned} \end{aligned}$$
(63)

where \(\Omega\) is the unit square \([0,1]^2\), and \(0<t<0.1\). The forward solution is

$$\begin{aligned} u(x,y,t)&=e^{25\pi ^{4} t}\,\text {Erfc}\left( 5\pi ^{2} \sqrt{t}\right) \sin (2\pi x)\sin (\pi y)\\&\quad +yx(1-x)(1-y), \end{aligned}$$
(64)

where \(\text {Erfc}(\cdot )\) is the complementary error function. Error results for several noise levels \(\delta\) along with estimated order of convergence are summarized in Table 2. The noisy final data when \(\delta =1\%\) along with the corresponding exact and recovered initial conditions are shown in Fig. 3.
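As a quick consistency check of Eqs. (63)-(64): the second term of the solution is time-independent, so its Caputo derivative vanishes and the source must equal \(-\Delta \big (yx(1-x)(1-y)\big )\). Since this term is quadratic in each variable, the five-point stencil evaluates its Laplacian exactly up to roundoff (a sketch; names are ours):

```python
import numpy as np

def w(x, y):
    # time-independent part of the solution in Eq. (64)
    return x * y * (1.0 - x) * (1.0 - y)

def source(x, y):
    # right-hand side of Eq. (63)
    return 2 * x + 2 * y - 2 * x ** 2 - 2 * y ** 2

def laplacian_fd(x, y, h=1e-4):
    # five-point stencil; exact up to roundoff since w is quadratic per variable
    return (w(x + h, y) + w(x - h, y) + w(x, y + h) + w(x, y - h)
            - 4.0 * w(x, y)) / h ** 2
```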

The ill-posedness of the backward problem is starkly manifested in Fig. 3, where the naive reconstruction (i.e., without regularization) bears no resemblance to the exact solution. With regularization, however, the reconstruction achieves remarkable fidelity to the exact initial condition, underscoring the efficacy of our approach.

The eigenvalues and eigenfunctions in this example are given by

$$\begin{aligned} \lambda _{m,n}&=(m^{2}+n^2)\pi ^2, \end{aligned}$$
(65)
$$\begin{aligned} X_{m,n}&=2\sin (m \pi x)\sin (n \pi y),\quad m,n=1,2,\ldots \end{aligned}$$
(66)

A lengthy yet straightforward computation shows that

$$\begin{aligned} \Vert g\Vert ^{2}_{1}&\le \frac{25 \pi ^4}{4}\nonumber \\&{}+\frac{(32)^2}{\pi ^8}\left\{ 2\left( \frac{\pi ^6}{945}\right) \left( \frac{\pi ^2}{6}\right) +2\left( \frac{\pi ^4}{90}\right) \left( \frac{\pi ^4}{90}\right) \right\} \nonumber \\&< \infty , \end{aligned}$$
(67)

which establishes that \(g\in \textbf{H}^{1}(\Omega )\). Therefore, we can anticipate a theoretical rate of convergence of \(O(\delta ^{1/2})\), as suggested by Remark 2. However, the results in Table 2 demonstrate that the regularized solution exhibits a slower rate of convergence than the anticipated theoretical order. We attribute this to the effects of numerical discretization. Nevertheless, our approach still yields an accurate reconstruction of the exact solution, highlighting the practical effectiveness of regularization.

Table 2 Root-mean-square errors in the recovered initial condition for Example 3 for several values of noise level \(\delta\) along with the estimated order of convergence
Fig. 3
figure 3

The noisy final data \(q^{\delta }\) with \(\delta =10^{-2}\) and the recovered initial condition without and with regularization for Example 3

The numerical results of the above experiments show that the regularization method is stable and robust with respect to the noise level \(\delta\). Notice that the naive reconstruction (i.e., without regularization) is very poor, which reflects the instability of the backward problem. Moreover, the observed convergence rates in these examples are very close to the rates predicted by our theoretical analysis. The algorithm runs reasonably fast on a personal computer. Overall, the proposed method offers several advantages over existing techniques. It is not restricted to one-dimensional domains and can handle equations with nonhomogeneous source terms. Furthermore, it can handle initial data that may not be smooth, which is a desirable feature in practical applications. Finally, the method is easy to implement and apply, and it is robust and computationally efficient.

5 Conclusion

This paper presents a rigorous investigation of a Tikhonov regularization-based method for reconstructing unknown initial conditions in nonhomogeneous time-fractional diffusion equations from noisy measurements of final data. The proposed method applies in higher-dimensional domains and is not restricted to homogeneous problems, making it a valuable tool in diverse scientific fields. The method is supported by rigorous theoretical analyses, which establish its consistency, stability, and convergence rates for both a priori and a posteriori parameter choice rules. The method is also highly robust to noise, further underscoring its reliability and practicality. Our numerical results show excellent agreement with the proven theoretical results, highlighting the accuracy and effectiveness of the proposed approach. Overall, this study makes a significant contribution to the field of inverse problems, with potential applications in a wide range of scientific disciplines.