1 Introduction

Many engineering and scientific problems require the determination of the unknown input and/or the unknown system parameters of a prescribed system from the known, but generally contaminated, output. To mention a few: inverse reconstruction of external excitations such as unsteady aerodynamic forces [1] and ocean wave loads [2] from measured dynamic responses [3, 4] is critical to ensuring the reliability and safety of engineering structures; finite element model updating [5,6,7], which aims to correct finite element model parameters using vibration measurements, is a prerequisite for finite element analysis of real structures; and the classic Cauchy problem [8, 9], devoted to reconstructing the field in a certain domain from data measured on part of the boundary, arises in many physical applications, e.g., acoustic cavity detection, steady-state inverse heat conduction and electro-cardiology. Such problems are referred to as inverse problems and are often ill-posed, that is, small contamination errors in the measurement give rise to large perturbations (or errors) in the solution. For a more comprehensive survey of ill-posed and inverse problems, refer to Ref. [10].

A great deal of research has long focused on the robust and efficient solution of ill-posed problems, and a common way to circumvent the ill-posedness is to introduce additional a priori knowledge on the solution; this leads to the so-called regularization methods [11]. As Benning and Burger stated at the beginning of their work [11], regularization methods are used to introduce a priori knowledge and allow a robust approximation of ill-posed (pseudo-)inverses. In principle, different a priori knowledge on the solution corresponds to different regularization methods and, according to the classification of a priori knowledge, three widely recognized classes of regularization methods have been developed in the literature:

  • the continuity/smoothness and boundedness properties of the solution are naturally embodied in the \(\ell ^2\) regularization, e.g., the Tikhonov regularization [11,12,13].

  • the sparsity (specifically, the number of non-zeros) constraint on the solution is easily enforced by the \(\ell ^1\) regularization (also called sparse regularization or compressed sensing) [14,15,16].

  • the edge-preserving and scale-dependent properties of a field solution are fulfilled by the total variation regularization [17].

In this paper, the class of \(\ell ^2\) regularization methods is pursued for further investigation. This class of regularization methods filters out the influence of the noise by introducing singular-value-dependent filter factors that approach 1 for relatively large singular values and tend to 0 for relatively small singular values [18]. Regularization methods in this class are further categorized into [19]:

  • Direct regularization methods Well-known methods are the truncated singular value decomposition (TSVD) [20] and the Tikhonov regularization [12]. The TSVD is simple to analyze; however, because it requires the singular value decomposition (SVD) of the matrix, its computational cost is often high, especially for large-scale problems. The Tikhonov regularization, on the other hand, additionally introduces the \(\ell ^2\) penalty term as regularization; the solutions under a number of candidate regularization parameters must then be obtained, each of which requires solving a linear algebraic equation. The Tikhonov regularization is often more efficient than the TSVD; however, effort must be made to determine a proper set of candidate regularization parameters.

  • Iterative regularization methods For large-scale problems, the above direct regularization methods involve the SVD algorithm or the solution of a number of algebraic equations, and as a result the computation often becomes prohibitive; this calls for more efficient iterative regularization methods [21, 22]. Among them, the Landweber iteration [23] is the classical stationary method. The Landweber iteration is effective because its filter factors resemble those of the Tikhonov regularization; however, it often converges slowly [21]. To obtain faster convergence, the conjugate gradient iteration for least squares problems (CGLS) [19] is naturally invoked. The CGLS is a nonstationary method because its iteration matrix depends not only on the iteration number but also on the right-hand side, and there are no analytical expressions for its filter factors. Fortunately, it was shown [21, 24] that under some mild conditions the CGLS is indeed a regularization method. Nevertheless, these mild conditions cannot always be guaranteed and, consequently, the CGLS may not work well in some cases [25].

A common feature of the regularization methods is that their performance depends on some regularization parameter, e.g., the truncation order of the TSVD, the Tikhonov regularization parameter or the iteration number of iterative regularization methods. Hence, the key issue in connection with regularization methods is to find a proper regularization parameter that balances the trade-off between filtering out enough noise (eliminating the terms corresponding to relatively small singular values) and not losing too much information (keeping the terms corresponding to relatively large singular values) in the solution [18]. To this end, several parameter selection rules, including the discrepancy principle [26], generalized cross-validation (GCV) [27] and the L-curve rule [18], are available and well recognized. Among them, the discrepancy principle is perhaps the simplest rule: the regularization parameter is selected from a prescribed upper bound on the residual norm; however, this upper bound is often unknown and the resulting solution may be over-smoothed [13]. The GCV has been shown to be a promising rule in many applications (see Ref. [28] for instance), but may not give satisfactory results if the errors in the right-hand side are highly correlated. The L-curve rule, which locates the regularization parameter at the corner of the log–log curve of the residual norm versus the solution norm, is nowadays widely used due to its theoretically guaranteed performance [13, 18] for the class of \(\ell ^2\) regularization methods and its easy generalization to other classes of regularization methods, e.g., the \(\ell ^1\) regularization [29].

In this paper, a novel iterative integration regularization method is proposed for the robust solution of ill-posed problems with \(\ell ^2\) regularization. The key idea is that the inverse of a positive definite matrix can alternatively be obtained by integration of an exponential matrix function. Along this line, an iterative procedure is developed to compute the integral, and further analysis reveals that this iterative integration procedure indeed has a regularization/filtering effect, indicating that the proposed iterative integration is a regularization method. The proposed method is expected to have the following features:

  • It belongs to the iterative regularization methods and is therefore expected to be efficient for large-scale problems.

  • Unlike the conventional CGLS, the filter factors of the proposed method can be derived explicitly and follow the same trend as those of the Tikhonov regularization; that is, the filter factors lie in the interval [0, 1], increase monotonically with the singular values, tend to 1 for relatively large singular values and decay to 0 as the singular values approach 0. This to some extent guarantees the effectiveness and performance of the proposed method.

  • It is shown in the later analysis that the proposed method degenerates to the Landweber method [23] under the linear expansion of the exponential matrix function and the linear iteration, whereas the exponential (\(2^j\)) iteration accelerates the convergence. In this sense, the Landweber method is a particular case of the proposed method, and the proposed method can possess faster convergence.

As regards the selection of the regularization parameter (the iteration number) for the proposed method, the well-recognized L-curve rule is adopted. It is noteworthy that the proposed method is designed for linear ill-posed problems, in which the known right-hand side is a linear function of the unknowns; nevertheless, nothing prevents extending the proposed method to nonlinear ill-posed problems, because a nonlinear ill-posed problem is often solved iteratively by proper linearization, and the linear ill-posed problem arising in each iteration can then be tackled by the proposed method. A similar idea on solving nonlinear ill-posed problems by regularization methods can be found in Ref. [30].

The remainder of this paper is organized as follows. In Sect. 2, the filter function is introduced to describe the behavior of \(\ell ^2\) regularization methods, and some common direct and iterative regularization methods are revisited from the perspective of the filter function. In Sect. 3, a new iterative integration regularization method is developed from an integral form of the inverse of a positive definite matrix. In doing so, two ways—the linear iteration and the exponential (\(2^j\)) iteration—are invoked to compute the integral iteratively, and a detailed analysis of the stability and convergence of the proposed method is presented. It is eventually shown that the proposed method has a guaranteed regularization effect. Section 4 considers some typical ill-posed problems, through which the performance of the proposed method as well as its comparison with some common regularization methods is investigated. Final conclusions are drawn in Sect. 5.

2 Ill-posed problem and regularization

2.1 Ill-posed problem with \(\ell ^2\) regularization

Many inverse problems involve solution of the discrete linear equation

$$\begin{aligned} {\mathbf {A}} {\varvec{x}} ={\varvec{b}}, \end{aligned}$$
(1)

where the system matrix \({\mathbf {A}} \in {\mathbb {R}}^{m \times n}\) is of full column rank, i.e., \(m \ge n\) and \({\mathbf {A}}\) is injective, \({\varvec{b}} \in {\mathbb {R}}^{m}\) is the known output vector and \({\varvec{x}} \in {\mathbb {R}}^{n}\) is to be determined. The linear equation (1) is said to be ill-posed if the singular values of \({\mathbf {A}}\) cluster at zero, giving rise to huge condition numbers [31]. Note that the output vector is always contaminated with noise, that is,

$$\begin{aligned} {\varvec{b}}={\varvec{b}}_\mathrm{{true}}+{\varvec{e}},\quad {\varvec{b}}_\mathrm{{true}}={\mathbf {A}} {\varvec{x}}_\mathrm{{true}}, \end{aligned}$$
(2)

where \({\varvec{x}}_\mathrm{{true}}\) is the true solution and \({\varvec{e}}\) denotes the noise. Under these circumstances, the noise \({\varvec{e}}\) would induce large errors in the solution \({\varvec{x}}\) of the ill-posed problem (1).

A usual way to circumvent the ill-posedness is to introduce a priori knowledge on the smoothness/continuity of the solution, which involves the addition of the \(\ell ^2\) penalty term \(||{\mathbf {L}}{\varvec{x}}||_2^2\); the \(\ell ^2\) regularized formulation of the ill-posed problem (1) is then

$$\begin{aligned} {\varvec{x}}_\lambda =\arg \min _{{\varvec{x}}}\left\{ ||{\mathbf {A}} {\varvec{x}} -{\varvec{b}}||_2^2+\lambda ^2 ||{\mathbf {L}}{\varvec{x}}||_2^2 \right\} , \end{aligned}$$
(3)

where the regularization matrix \({\mathbf {L}} \in {\mathbb {R}}^{l \times n}\) is often obtained through the discretization of some differential operator, and \(\lambda > 0\) is a scalar called the regularization parameter. In fact, the formulation (3) is just the well-known Tikhonov regularization [12].

Usually, the regularization matrix \({\mathbf {L}}\) is of full row rank, that is, \(l \le n\) and \({\mathbf {L}}\) is surjective. For instance, to enforce the \(H^s\)-continuity/smoothness of a uniformly discretized one-dimensional solution, where \(H^s,s=0,1,2\) denotes the usual Sobolev spaces, the regularization matrix \({\mathbf {L}}\) is often defined as [32]

$$\begin{aligned} \begin{aligned}&{\mathbf {L}}=\left( \begin{array}{cccc} 1 &{} 0 &{} \cdots &{} 0 \\ 0 &{} 1 &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} 1\\ \end{array} \right) \in {\mathbb {R}}^{n \times n}, \text {for } s=0 ,\\&{\mathbf {L}}=\left( \begin{array}{ccccc} -1 &{} 1 &{} 0 &{} \cdots &{} 0 \\ 0 &{} -1 &{} 1 &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} -1 &{} 1\\ \end{array} \right) \in {\mathbb {R}}^{(n-1) \times n}, \text {for } s=1,\\&{\mathbf {L}}=\left( \begin{array}{cccccc} 1 &{} -2 &{} 1 &{} 0 &{} \cdots &{} 0 \\ 0 &{} 1 &{} -2 &{} 1 &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \ddots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} 1&{} -2 &{} 1\\ \end{array} \right) \in {\mathbb {R}}^{(n-2) \times n}, \text {for } s=2. \end{aligned} \end{aligned}$$
(4)
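For concreteness, these operators are simple banded difference matrices; the following minimal NumPy sketch (the helper name `reg_matrix` is illustrative, not from the paper) constructs \({\mathbf {L}}\) for \(s=0,1,2\):

```python
import numpy as np

def reg_matrix(n, s):
    """Regularization matrix L of Eq. (4) for s = 0, 1, 2 on a uniform
    one-dimensional grid of n points."""
    if s == 0:
        return np.eye(n)                          # identity, n x n
    if s == 1:
        return np.diff(np.eye(n), n=1, axis=0)    # rows [-1, 1, ...], (n-1) x n
    if s == 2:
        return np.diff(np.eye(n), n=2, axis=0)    # rows [1, -2, 1, ...], (n-2) x n
    raise ValueError("s must be 0, 1 or 2")
```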

When \({\mathbf {L}}\) is the identity matrix \({\mathbf {I}}\), the \(\ell ^2\) regularization (3) is in standard form; in other cases, a simple transformation can be invoked to transform (3) into the standard form, and the procedure is briefly given as follows. To arrive at the standard form, let \({\mathbf {L}}{\varvec{x}}={\varvec{y}} \in {\mathbb {R}}^l\); then, due to the surjectivity of \({\mathbf {L}}\), the following holds

$$\begin{aligned} {\varvec{x}}={\mathbf {L}}^+ {\varvec{y}} +{\mathbf {K}}_\mathrm{{o}} {\varvec{z}}, \end{aligned}$$
(5)

where \({\mathbf {L}}^+ \in {\mathbb {R}}^{n \times l}\) is the pseudo-inverse of \({\mathbf {L}}\), i.e., \({\mathbf {L}}{\mathbf {L}}^+={\mathbf {I}}\), \({\mathbf {K}}_\mathrm{{o}} \in {\mathbb {R}}^{n \times (n-l)}\) represents the null space of \({\mathbf {L}}\), verifying \(\mathrm{{rank}}({\mathbf {K}}_\mathrm{{o}})=n-l, {\mathbf {L}}{\mathbf {K}}_\mathrm{{o}}={\mathbf {0}}\), and \({\varvec{z}} \in {\mathbb {R}}^{n-l}\) is arbitrary. Details on how to obtain the matrices \({\mathbf {L}}^+,{\mathbf {K}}_\mathrm{{o}}\) can be found in Ref. [19]. Substitution of Eq. (5) into the regularized formulation (3) then yields

$$\begin{aligned} \min _{{\varvec{y}},{\varvec{z}}}\left\{ ||{\mathbf {A}} ({\mathbf {L}}^+ {\varvec{y}} +{\mathbf {K}}_\mathrm{{o}} {\varvec{z}}) -{\varvec{b}}||_2^2+\lambda ^2 ||{\varvec{y}}||_2^2 \right\} . \end{aligned}$$
(6)

Minimization over \({\varvec{z}}\) leads to \({\varvec{z}}=[(\mathbf {AK}_\mathrm{{o}})^\mathrm{{T}} (\mathbf {AK}_\mathrm{{o}})]^{-1}(\mathbf {AK}_\mathrm{{o}})^\mathrm{{T}} ({\varvec{b}}-{\mathbf {A}}{\mathbf {L}}^+ {\varvec{y}})\) and then, the standard form problem is obtained

$$\begin{aligned} {\varvec{y}}_\lambda =\arg \min _{{\varvec{y}}} \left\{ ||\bar{{\mathbf {A}}} {\varvec{y}} -\bar{{\varvec{b}}}||_2^2+\lambda ^2 ||{\varvec{y}}||_2^2 \right\} , \end{aligned}$$
(7)

where \(\bar{{\mathbf {A}}}={\mathbf {Q}}{\mathbf {A}}{\mathbf {L}}^+ ,\bar{{\varvec{b}}}={\mathbf {Q}} {\varvec{b}},{\mathbf {Q}}={\mathbf {I}}-(\mathbf {AK}_\mathrm{{o}})[(\mathbf {AK}_\mathrm{{o}})^\mathrm{{T}} (\mathbf {AK}_\mathrm{{o}})]^{-1}(\mathbf {AK}_\mathrm{{o}})^\mathrm{{T}}\).
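A compact numerical sketch of this transformation may read as follows (the function names are illustrative; the sketch assumes \(l<n\), since for \(l=n\) the null space is empty and \({\mathbf {Q}}={\mathbf {I}}\)):

```python
import numpy as np
from scipy.linalg import null_space

def to_standard_form(A, L, b):
    """Transform (3) into the standard form (7) via Eqs. (5)-(7);
    L is assumed to have full row rank with l < n."""
    Lp = np.linalg.pinv(L)                     # pseudo-inverse L^+, L @ Lp = I
    Ko = null_space(L)                         # basis K_o of the null space of L
    AK = A @ Ko
    P = AK @ np.linalg.solve(AK.T @ AK, AK.T)  # projector onto range(A K_o)
    Q = np.eye(A.shape[0]) - P
    return Q @ A @ Lp, Q @ b, Lp, Ko, AK

def back_transform(y, A, b, Lp, Ko, AK):
    """Recover x from the standard-form solution y via Eq. (5),
    using the minimizing z given above Eq. (7)."""
    z = np.linalg.solve(AK.T @ AK, AK.T @ (b - A @ (Lp @ y)))
    return Lp @ y + Ko @ z
```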

In summary, the regularization problem (3) can always be transformed into the standard form; therefore, without loss of generality and for brevity, the analysis is restricted to the standard-form regularization problem from here on,

$$\begin{aligned} {\varvec{x}}_\lambda =\arg \min _{{\varvec{x}}}\left\{ ||{\mathbf {A}} {\varvec{x}} -{\varvec{b}}||_2^2+\lambda ^2 ||{\varvec{x}}||_2^2 \right\} . \end{aligned}$$
(8)

2.2 Regularization with filter function

To proceed further, consider the singular value decomposition (SVD) of the matrix \({\mathbf {A}}\),

$$\begin{aligned} {\mathbf {A}}={\mathbf {U}}\mathbf {\Sigma } {\mathbf {V}}^\mathrm{{T}}=\sum _{i=1}^n \sigma _i {\varvec{u}}_i {\varvec{v}}_i^\mathrm{{T}}, \end{aligned}$$
(9)

where the matrices of left and right singular vectors \({\mathbf {U}}\) and \({\mathbf {V}}\) are orthonormal, and the singular values in the diagonal matrix \(\mathbf {\Sigma }\) are nonnegative and nonincreasing, i.e., \(\sigma _1 \ge \sigma _2 \ge \cdots \ge \sigma _n > 0\). Direct solution of problem (1) in the least-squares sense then results in

$$\begin{aligned} {\varvec{x}}_\mathrm{{LSQ}}&=\sum _{i=1}^n \frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}) }{\sigma _i}{\varvec{v}}_i=\sum _{i=1}^n\left\{ \frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}_\mathrm{{true}}) }{\sigma _i}{\varvec{v}}_i+\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{e}}) }{\sigma _i}{\varvec{v}}_i\right\} \\&={\varvec{x}}_\mathrm{{true}}+\sum _{i=1}^n\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{e}}) }{\sigma _i}{\varvec{v}}_i, \end{aligned}$$
(10)

where the notations in Eq. (2) are used. Due to the ill-posedness, some singular values \(\sigma _n,\sigma _{n-1},\ldots ,\sigma _{n-q}\) (with q an integer) cluster at zero, and the contamination error corresponding to these singular values, \(||{\varvec{x}}_\mathrm{{LSQ}}-{\varvec{x}}_\mathrm{{true}}||_2\ge \sqrt{\sum _{i=n-q}^n\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{e}})^2 }{\sigma _i^2}}\), is dramatically magnified and may even overwhelm the true solution \({\varvec{x}}_\mathrm{{true}}\). Practical regularization methods should, therefore, introduce filter factors \(0 \le f_i \le 1\) [18] into the regularized solution,

$$\begin{aligned} {\varvec{x}}_\mathrm{{reg}}=\sum _{i=1}^n f_i\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}) }{\sigma _i}{\varvec{v}}_i. \end{aligned}$$
(11)
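For illustration, the filtered solution (11) can be evaluated directly from the SVD; a minimal sketch (with `filtered_solution` a hypothetical helper) is:

```python
import numpy as np

def filtered_solution(A, b, filt):
    """Regularized solution (11): x_reg = sum_i f_i (u_i^T b / sigma_i) v_i,
    where `filt` maps the vector of singular values to filter factors f_i."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    f = filt(s)
    return Vt.T @ (f * (U.T @ b) / s)

# e.g. a truncation filter keeping only the k largest singular values
# (cf. the TSVD revisited below):
# x = filtered_solution(A, b, lambda s: (s >= s[k - 1]).astype(float))
```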

For \(\ell ^2\) regularization, the filter factors are often of the following form,

$$\begin{aligned} f_i=f(\sigma _i^2,\alpha ),\quad \alpha \in {\mathcal {A}}, \end{aligned}$$
(12)

where \(f(\cdot ,\cdot )\) is called the filter function, \(\alpha\) denotes the regularization parameter and \({\mathcal {A}}\) is the set of all admissible regularization parameters. The error of the regularized solution is then obtained as

$$\begin{aligned} {\varvec{x}}_\mathrm{{reg}}-{\varvec{x}}_\mathrm{{true}}= \sum _{i=1}^n f_i\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{e}}) }{\sigma _i}{\varvec{v}}_i +\sum _{i=1}^n (f_i-1)\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}_\mathrm{{true}}) }{\sigma _i}{\varvec{v}}_i. \end{aligned}$$
(13)

To minimize the error (13), several conditions on the filter function \(f(\sigma ^2,\alpha )\) should be imposed:

P1:

The portion \(\sum _{i=1}^n f_i\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{e}}) }{\sigma _i}{\varvec{v}}_i\) of the error (13) is mainly controlled by the terms corresponding to the relatively small singular values \(\sigma _n,\sigma _{n-1},\ldots ,\sigma _{n-q}\); to reduce this portion of the error, \(f_i, i \in \{n,n-1,\ldots ,n-q\}\), should tend to zero so that \(f_i/{\sigma _i}\) also approaches zero. In other words, there should hold

$$\begin{aligned} f(0,\alpha )=0, \lim _{\sigma \rightarrow 0^+} \frac{f(\sigma ^2,\alpha )}{\sigma } =0, \quad \forall \alpha \in {\mathcal {A}}. \end{aligned}$$
(14)
P2:

As for the other portion of the error, \(\sum _{i=1}^n (f_i-1)\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}_\mathrm{{true}}) }{\sigma _i}{\varvec{v}}_i\), it can be divided into two parts: the first, \(\sum _{i=n-q}^n (f_i-1)\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}_\mathrm{{true}}) }{\sigma _i}{\varvec{v}}_i \approx \sum _{i=n-q}^n (-1)\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}_\mathrm{{true}}) }{\sigma _i}{\varvec{v}}_i\), and the second, \(\sum _{i=1}^{n-q-1} (f_i-1)\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}_\mathrm{{true}}) }{\sigma _i}{\varvec{v}}_i\), where the approximation in the first part holds due to condition P1. On the one hand, to reduce the first part of the error, the discrete Picard condition should be satisfied [13], i.e., the coefficients \(|{\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}_\mathrm{{true}}|\) on average decay to zero faster than the singular values \(\sigma _i\), or \(|{\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}_\mathrm{{true}}|/{\sigma _i}\) tends to 0 for some \(i \in \{n,n-1,\ldots ,n-q\}\). On the other hand, reducing the second part of the error requires \(f_i \rightarrow 1\) for relatively large singular values. In this way, the filter function should verify

$$\begin{aligned} \begin{aligned}&\lim _{\sigma \rightarrow +\infty } {f(\sigma ^2,\alpha )} =1, \forall \alpha \in {\mathcal {A}}, \\&\text {or weakly, } f(\sigma _1^2,\alpha )\rightarrow 1, \text {for some } \alpha . \end{aligned} \end{aligned}$$
(15)
P3:

The filter function should allow the regularized solution to degenerate into the original solution (10) for well-conditioned problems. To achieve this, there should exist a critical regularization parameter \(\alpha _\mathrm{{cr}}\) such that

$$\begin{aligned} f(\sigma ^2,\alpha _\mathrm{{cr}})=\lim _{\alpha \rightarrow \alpha _\mathrm{{cr}}} {f(\sigma ^2,\alpha )} =1, \forall 0<\sigma \le \sigma _1. \end{aligned}$$
(16)
P4:

The filter function should verify \(f(\sigma ^2,\alpha ) \in [0,1]\).

*P5:

In particular, if \(f(\sigma ^2,\alpha )\) is continuous with respect to both \(\sigma\) and \(\alpha\), the regularized solution (11) can be alternatively expressed by

$$\begin{aligned} {\varvec{x}}_\mathrm{{reg}}=f({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}},\alpha ) \cdot ({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}})^{-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}. \end{aligned}$$
(17)

Note that condition *P5 is optional and not necessary for the regularization performance. In what follows, some common direct and iterative \(\ell ^2\) regularization methods are briefly revisited from the viewpoint of the filter function.

2.3 Revisit of some direct and iterative regularization methods

To begin with, the Tikhonov regularization requires the exact solution of Eq. (8), giving rise to

$$\begin{aligned} {\varvec{x}}_\mathrm{{Tik}}=({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}}+\lambda ^2 {\mathbf {I}})^{-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}=\sum _{i=1}^n \frac{\sigma _i^2}{\sigma _i^2+\lambda ^2}\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}) }{\sigma _i}{\varvec{v}}_i. \end{aligned}$$
(18)
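A minimal sketch of (18) is given below; the normal-equations form follows the equation directly, while the stacked least-squares variant is a standard, numerically safer alternative (an implementation choice, not prescribed by this paper):

```python
import numpy as np

def tikhonov(A, b, lam):
    """Tikhonov solution (18) from the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam**2 * np.eye(n), A.T @ b)

def tikhonov_lsq(A, b, lam):
    """Equivalent least-squares form on the stacked system [A; lam*I]."""
    n = A.shape[1]
    Aa = np.vstack([A, lam * np.eye(n)])
    bb = np.concatenate([b, np.zeros(n)])
    return np.linalg.lstsq(Aa, bb, rcond=None)[0]
```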

In this way, the filter function for the Tikhonov regularization is found to be

$$\begin{aligned} f_\mathrm{{Tik}}(\sigma ^2,\lambda )=\frac{\sigma ^2}{\sigma ^2+\lambda ^2}, \end{aligned}$$
(19)

which obviously satisfies the five conditions P1–*P5 in Sect. 2.2 with \({\mathcal {A}}=\{\lambda >0\}\) and the critical regularization parameter \(\lambda _\mathrm{{cr}}=0\). Another popular regularization method is the truncated singular value decomposition (TSVD), whose solution is simply

$$\begin{aligned} {\varvec{x}}_\mathrm{{TSVD}}=\sum _{i=1}^k \frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}) }{\sigma _i}{\varvec{v}}_i, \quad k \le n, \end{aligned}$$
(20)

where the truncation order k is just the regularization parameter; the filter function for the TSVD then turns out to be

$$\begin{aligned} f_\mathrm{{TSVD}}(\sigma ^2,k)=\left\{ \begin{array}{ll} 0,& \text {if } \sigma < \sigma _{t(k,n)} \\ 1,& \text {otherwise}, \end{array} \right. \end{aligned}$$
(21)

where \(t(k,n)=\min ([k],n)\), with [k] denoting the integer part of k, and the regularization parameter space is \({\mathcal {A}}=\{k \ge 1\}\). The filter function \(f_\mathrm{{TSVD}}(\sigma ^2,k)\) clearly satisfies conditions P1–P4 with the critical regularization parameter \(k_\mathrm{{cr}}=n\), but does not verify the optional condition *P5. That is to say, the TSVD solution cannot be obtained by simple multiplications or inversions involving the matrix \({\mathbf {A}}\); rather, the singular value decomposition (SVD) algorithm must be called.

The above two regularization methods—the Tikhonov regularization and the TSVD—belong to the direct regularization methods. As is well known, the often computationally expensive SVD algorithm [33] is required for the TSVD, while for the Tikhonov regularization, the inversion \(({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}}+\lambda ^2 {\mathbf {I}})^{-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}\) must be computed for a number of candidate regularization parameters. In total, the computational cost of the direct regularization methods is high, especially for large-scale problems. This calls for more efficient iterative regularization methods [34].

The Landweber method [23] is the classical stationary iterative regularization method and proceeds as follows:

  • Landweber method

    • set the initial vector \({\varvec{x}}_0={\varvec{0}}\),

    • do the stationary iteration to get the solution

      $$\begin{aligned} {\varvec{x}}_k={\varvec{x}}_{k-1}+\tau {\mathbf {A}}^\mathrm{{T}} ({\varvec{b}}- {\mathbf {A}}{\varvec{x}}_{k-1}), \quad k=1,2,3,\ldots , \end{aligned}$$
      (22)

      where \(\tau\) is a real positive constant that controls the behavior of the iteration; a minimal numerical sketch of this iteration follows.
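The sketch below is illustrative (names are not from the paper); the stored iterates anticipate the L-curve evaluation of Sect. 3.3:

```python
import numpy as np

def landweber(A, b, tau, n_iter):
    """Landweber iteration (22) from x_0 = 0; convergence requires
    0 < tau < 2 / sigma_1^2, with sigma_1 = np.linalg.norm(A, 2)."""
    x = np.zeros(A.shape[1])
    history = []
    for _ in range(n_iter):
        x = x + tau * (A.T @ (b - A @ x))   # only matrix-vector products
        history.append(x.copy())            # iterates kept for the L-curve
    return x, history
```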

To analyze the filter function of the Landweber method, insert the SVD of \({\mathbf {A}}\) into Eq. (22); the kth iterate is then found to be

$$\begin{aligned} {\varvec{x}}_\mathrm{{Land}}=\sum _{i=1}^n (1-(1-\tau \sigma _i^2)^k)\frac{ ({\varvec{u}}_i^\mathrm{{T}} {\varvec{b}}) }{\sigma _i}{\varvec{v}}_i \end{aligned}$$
(23)

from which it follows that the convergence of the Landweber method requires \(0<\tau <2/{\sigma _1^2}\). It is easily seen from Eq. (23) that the filter function of the Landweber method is

$$\begin{aligned} f_\mathrm{{Land}}(\sigma ^2,k)=1-(1-\tau \sigma ^2)^k, \end{aligned}$$
(24)

where the regularization parameter, i.e., the iteration number, verifies \(k \in {\mathcal {A}}=\{k\ge 1\}\). Obviously, this filter function (24) satisfies the five conditions P1–*P5 with the critical regularization parameter \(k_\mathrm{{cr}}=+\infty\). In practice, \(\sigma _1\) can be easily and accurately estimated, e.g., through subspace iteration, and a proper \(\tau\) can then be determined. Note that the Landweber iteration (22) involves only matrix–vector multiplications and would thus be very efficient for large-scale problems if the number of iterations k remains reasonably small. However, the Landweber method often converges slowly [21], so that k may be large. To further accelerate the convergence, the conjugate gradient (CG) algorithm applied to the least-squares equation \({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}}{\varvec{x}}={\mathbf {A}}^\mathrm{{T}} {\varvec{b}}\) is naturally invoked, which is designated the CGLS:

  • Conjugate gradient algorithm for least squares problem (CGLS)

    • Let \({\varvec{x}}_0\) denote the initial vector, often equal to \({\varvec{0}}\), and define two auxiliary vectors \({\varvec{r}}_0={\varvec{b}}- {\mathbf {A}}{\varvec{x}}_{0}\) and \({\varvec{d}}_0={\mathbf {A}}^\mathrm{{T}}{\varvec{r}}_0\),

    • do the CG iteration for \(k=1,2,\ldots\) (a numerical sketch is given after this list)

      $$\begin{aligned} \begin{aligned}&\alpha _k=\frac{||{\mathbf {A}}^\mathrm{{T}} {\varvec{r}}_{k-1}||_2^2}{||{\mathbf {A}} {\varvec{d}}_{k-1}||_2^2},\\&{\varvec{x}}_k={\varvec{x}}_{k-1}+\alpha _k {\varvec{d}}_{k-1}, \\&{\varvec{r}}_k={\varvec{r}}_{k-1}-\alpha _k {\mathbf {A}} {\varvec{d}}_{k-1}, \\&\beta _k=\frac{||{\mathbf {A}}^\mathrm{{T}} {\varvec{r}}_{k}||_2^2}{||{\mathbf {A}}^\mathrm{{T}} {\varvec{r}}_{k-1}||_2^2},\\&{\varvec{d}}_k={\mathbf {A}}^\mathrm{{T}} {\varvec{r}}_k+\beta _k {\varvec{d}}_{k-1}. \end{aligned} \end{aligned}$$
      (25)
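A minimal sketch of the recursion (25), starting from \({\varvec{x}}_0={\varvec{0}}\) (illustrative names), might read:

```python
import numpy as np

def cgls(A, b, n_iter):
    """CGLS iteration (25) for A^T A x = A^T b, starting from x_0 = 0;
    the iteration number acts as the regularization parameter."""
    x = np.zeros(A.shape[1])
    r = b.copy()                 # r_0 = b - A x_0
    d = A.T @ r                  # d_0 = A^T r_0
    s_old = d @ d                # ||A^T r_0||_2^2
    for _ in range(n_iter):
        Ad = A @ d
        alpha = s_old / (Ad @ Ad)
        x = x + alpha * d
        r = r - alpha * Ad
        ATr = A.T @ r
        s_new = ATr @ ATr
        d = ATr + (s_new / s_old) * d
        s_old = s_new
    return x
```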

The CGLS is a non-stationary method and the filter function is of the following form [35],

$$\begin{aligned} f_\mathrm{{CGLS}}(\sigma ^2,k)=1-\Pi _{j=1}^k \frac{(\theta _j^{(k)})^2-\sigma ^2}{(\theta _j^{(k)})^2},\quad k\le n, \end{aligned}$$
(26)

where \((\theta _j^{(k)})^2\) are the k eigenvalues of \({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}}\) restricted to the Krylov subspace \({\mathcal {K}}_k({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}},{\mathbf {A}}^\mathrm{{T}}{\varvec{b}})=\text {span}({\mathbf {A}}^\mathrm{{T}}{\varvec{b}},({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}}){\mathbf {A}}^\mathrm{{T}}{\varvec{b}},\ldots ,({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}})^{k-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}})\). The CGLS often possesses faster convergence (i.e., smaller k) than the Landweber method. As k increases, these eigenvalues converge to some of the eigenvalues of \({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}}\), i.e., \(\theta _j^{(k)} \approx \sigma _i\) for selected i, j. By Eq. (26), the filter function of the CGLS also depends on the right-hand side \({\varvec{b}}\). Turning to the conditions in Sect. 2.2, \(f_\mathrm{{CGLS}}(\sigma ^2,k)\) obviously verifies conditions P1, P2 and *P5, approximately satisfies condition P3 with \(k_\mathrm{{cr}}=n\), but does not verify condition P4, because \(f_\mathrm{{CGLS}}(\sigma ^2,k)\) may be greater than 1 [25]. As a consequence, the regularization effect of the CGLS is not always guaranteed [25].

In the next section, a new iterative integration regularization method is developed. It is iterative and stationary and would, therefore, be efficient while possessing a guaranteed regularization effect.

3 The iterative integration regularization method

3.1 Inversion by iterative integration

To start with, an alternative integral formulation for inversion of the matrix \({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}}\) is presented in the following theorem.

Theorem 1

Let \({\mathbf {A}} \in {\mathbb {R}}^{m \times n}\) be a matrix with full column rank, i.e., \(\mathrm{{rank}}({\mathbf {A}})=n\); then the following integral form of the inverse of \({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}}\) holds true:

$$\begin{aligned} ({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^{-1}=\int _{0}^{+\infty } e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})t}\mathrm{{d}}t. \end{aligned}$$
(27)

Proof

Considering the full column rank property of \({\mathbf {A}}\) and the SVD as in equation (9), it is easily obtained that

$$\begin{aligned} ({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})={\mathbf {V}} \mathbf {\Sigma }^2 {\mathbf {V}}^\mathrm{{T}} \end{aligned}$$
(28)

where \({\mathbf {V}},\mathbf {\Sigma }\) are as defined below Eq. (9). By Eq. (28), one can get

$$\begin{aligned} \int _{0}^{+\infty } e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})t}\mathrm{{d}}t={\mathbf {V}} \int _{0}^{+\infty } e^{-\mathbf {\Sigma }^2t}\mathrm{{d}}t {\mathbf {V}}^\mathrm{{T}}={\mathbf {V}} \mathbf {\Sigma }^{-2} {\mathbf {V}}^\mathrm{{T}} \end{aligned}$$
(29)

from which the equality (27) follows trivially.
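Theorem 1 is easy to check numerically on a small full-column-rank test matrix; the purely illustrative sketch below uses the closed form of the truncated integral (anticipating Eq. (31) below):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 4))   # small full-column-rank test matrix
M = A.T @ A

# Truncated integral: int_0^T exp(-M t) dt = M^{-1} (I - exp(-M T)),
# which approaches (A^T A)^{-1} as T grows, in line with Theorem 1.
for T in (1.0, 10.0, 100.0):
    integral = np.linalg.solve(M, np.eye(4) - expm(-M * T))
    print(T, np.linalg.norm(integral - np.linalg.inv(M)))
```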

Now turn to the ill-posed problem (1), and the least-squares solution requires solving \({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}}{\varvec{x}}={\mathbf {A}}^\mathrm{{T}} {\varvec{b}}\). By Theorem 1, the solution is found to be

$$\begin{aligned} {\varvec{x}}=\int _{0}^{+\infty } e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})t}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}\mathrm{{d}}t. \end{aligned}$$
(30)

In this way, the solution can be obtained approximately by introducing some finite (and often large) number \(T_0>0\),

$$\begin{aligned} {\varvec{x}}_\mathrm{{int}}=\int _{0}^{T_0} e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})t}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}\mathrm{{d}}t. \end{aligned}$$
(31)

How to compute the integral (31) iteratively in practice is elaborated below.

Let \(\tau >0\) be small enough and then, by Taylor expansion, it is obtained that

$$\begin{aligned} e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})\tau }\approx {\mathbf {B}}_{p,\tau }:={\mathbf {I}}-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})\tau +\frac{({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^2\tau ^2}{2!}-\frac{({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^3\tau ^3}{3!}+\cdots +(-1)^p\frac{({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^p\tau ^p}{p!} \end{aligned}$$
(32)

and then,

$$\begin{aligned} \begin{aligned} \int _0^{\tau } e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})t}\mathrm{{d}}t&=({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^{-1}({\mathbf {I}}-e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})\tau })\\&\approx {\mathbf {C}}_{p,\tau }:= ({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^{-1}({\mathbf {I}}-{\mathbf {B}}_{p,\tau })\\&=\tau {\mathbf {I}}-\frac{({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})\tau ^2}{2!}+\frac{({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^2\tau ^3}{3!}\\&\quad -\frac{({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^3\tau ^4}{4!}+\cdots +(-1)^{p-1}\frac{({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^{p-1}\tau ^{p}}{p!}, \end{aligned} \end{aligned}$$
(33)

where p denotes the truncation order of Taylor expansion. To go further, set

$$\begin{aligned} {\varvec{x}}_{k}=\int _{0}^{k\tau } e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})t}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}\mathrm{{d}}t \end{aligned}$$
(34)

and then, two ways can be invoked to get the solution iteratively with the help of the two approximations (32) and (33):

  • Iterative integration-linear (IIL) First, a simple equality is obtained,

    $$\begin{aligned} {\varvec{x}}_{k}=({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^{-1} ({\mathbf {I}}-e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})k\tau }){\mathbf {A}}^\mathrm{{T}}{\varvec{b}} \Rightarrow e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})k\tau }{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}={\mathbf {A}}^\mathrm{{T}}{\varvec{b}}-{\mathbf {A}}^\mathrm{{T}} {\mathbf {A}}{\varvec{x}}_{k}. \end{aligned}$$
    (35)

    Then, by the equality (35), the iterative procedure to get the solution (34) can be established with \({\varvec{x}}_0={\varvec{0}}\) and for \(k=1,2,\ldots\)

    $$\begin{aligned} \begin{aligned} {\varvec{x}}_{k}&={\varvec{x}}_{k-1}+\int _{(k-1)\tau }^{k\tau } e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})t}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}\mathrm{{d}}t \\&={\varvec{x}}_{k-1}+\int _{0}^{\tau } e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})t}\mathrm{{d}}t \cdot e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})(k-1)\tau }{\mathbf {A}}^\mathrm{{T}}{\varvec{b}} \\&= {\varvec{x}}_{k-1}+\int _{0}^{\tau } e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})t}\mathrm{{d}}t \cdot ({\mathbf {A}}^\mathrm{{T}}{\varvec{b}}-{\mathbf {A}}^\mathrm{{T}} {\mathbf {A}}{\varvec{x}}_{k-1}) \\&= {\varvec{x}}_{k-1}+{\mathbf {C}}_{p,\tau } {\mathbf {A}}^\mathrm{{T}} ({\varvec{b}}-{\mathbf {A}}{\varvec{x}}_{k-1}), \end{aligned} \end{aligned}$$
    (36)

    where the matrix \({\mathbf {C}}_{p,\tau }\) is defined in Eq. (33), \(p\ge 1\) is an integer and \(\tau\) is a proper parameter; both p and \(\tau\) govern the performance of the iteration. The IIL is so named because the number \(T_0=k\tau\) in the iteration (36) increases linearly with the number of iterations. A numerical sketch of both variants is given after this list.

  • Iterative integration-exponential (IIE) Distinct from the linear iteration of the IIL, the number \(T_0=2^j\tau\) here increases exponentially. This follows the \(2^j\) algorithm that has been widely used in the precise integration method [36]. To do so, set \({\mathbf {T}}_j=e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})2^j\tau }-{\mathbf {I}}\); simple analysis then gives rise to

    $$\begin{aligned} {\varvec{x}}_{2^j}=({\mathbf {I}}+e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})2^{j-1}\tau })\int _0^{2^{j-1}\tau }e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})t}\mathrm{{d}}t{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}. \end{aligned}$$
    (37)

    Then, the iterative procedure is given as: initially set \({\mathbf {T}}_0={\mathbf {B}}_{p,\tau }-{\mathbf {I}}\) and \({\varvec{x}}_{2^0}={\mathbf {C}}_{p,\tau }{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}\), and then, proceed for \(j=1,2,\ldots\),

    $$\begin{aligned} \begin{aligned}&{\varvec{x}}_{2^j}=(2{\mathbf {I}}+{\mathbf {T}}_{j-1}){\varvec{x}}_{2^{j-1}}, \\&{\mathbf {T}}_j=2{\mathbf {T}}_{j-1}+{\mathbf {T}}_{j-1} \cdot {\mathbf {T}}_{j-1}. \end{aligned} \end{aligned}$$
    (38)
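A minimal numerical sketch of both variants might read as follows (`taylor_matrices` assembles \({\mathbf {B}}_{p,\tau }\) and \({\mathbf {C}}_{p,\tau }\) of Eqs. (32)-(33) term by term; all names are illustrative):

```python
import numpy as np

def taylor_matrices(A, tau, p):
    """B_{p,tau} and C_{p,tau} of Eqs. (32)-(33), assembled term by term."""
    M = A.T @ A
    n = M.shape[0]
    S = tau * np.eye(n)            # j = 1 Taylor term of C
    C = S.copy()
    B = np.eye(n) - M @ S          # j = 0 and j = 1 terms of B
    for j in range(2, p + 1):
        S = -(tau / j) * (M @ S)   # next term: (-1)^(j-1) M^(j-1) tau^j / j!
        C = C + S
        B = B - M @ S
    return B, C

def iil(A, b, tau, p, n_iter):
    """Iterative integration, linear variant (36), from x_0 = 0."""
    _, C = taylor_matrices(A, tau, p)
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x + C @ (A.T @ (b - A @ x))
    return x

def iie(A, b, tau, p, n_levels):
    """Iterative integration, exponential (2^j) variant (38)."""
    B, C = taylor_matrices(A, tau, p)
    T = B - np.eye(A.shape[1])     # T_0 = B_{p,tau} - I
    x = C @ (A.T @ b)              # x_{2^0}
    for _ in range(n_levels):
        x = 2 * x + T @ x          # x_{2^j} = (2I + T_{j-1}) x_{2^{j-1}}
        T = 2 * T + T @ T          # T_j = 2 T_{j-1} + T_{j-1}^2
    return x
```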

Note that both the IIL and the IIE are stationary and are derived as approximations of (34); their performance depends on the two parameters \(p,\tau\). The following theorem presents the condition on \(p,\tau\) under which the two iterations—the IIL (36) and the IIE (38)—give solutions convergent to \(({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}})^{-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}\).

Theorem 2

If \(p \ge 1,\tau >0\) are selected to verify

$$\begin{aligned} \rho ({\mathbf {B}}_{p,\tau })<1 \end{aligned}$$
(39)

with \({\mathbf {B}}_{p,\tau }\) defined in Eq. (32) and \(\rho (\cdot )\) denoting the spectral radius of a matrix, the two iterations—the IIL (36) and the IIE (38)—would give convergent solutions to \(({\mathbf {A}}^\mathrm{{T}}{\mathbf {A}})^{-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}\).

Proof

At first, it is easily found from Eq. (33) that

$$\begin{aligned} \begin{aligned} {\mathbf {B}}_{p,\tau }={\mathbf {I}}-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}}){\mathbf {C}}_{p,\tau }, \end{aligned} \end{aligned}$$
(40)

where \(p \ge 1\) is an integer. Then, further analysis on the IIL (36) yields

$$\begin{aligned} \begin{aligned} {\varvec{x}}_{k}&= {\mathbf {C}}_{p,\tau } {\mathbf {A}}^\mathrm{{T}}{\varvec{b}}+{\mathbf {B}}_{p,\tau } {\varvec{x}}_{k-1} \\&= [{\mathbf {I}}+{\mathbf {B}}_{p,\tau }+\cdots + ({\mathbf {B}}_{p,\tau })^{k-1}] {\mathbf {C}}_{p,\tau } {\mathbf {A}}^\mathrm{{T}}{\varvec{b}} \\&= ({\mathbf {I}}-({\mathbf {B}}_{p,\tau })^{k})({\mathbf {I}} -{\mathbf {B}}_{p,\tau })^{-1}{\mathbf {C}}_{p,\tau } {\mathbf {A}}^\mathrm{{T}}{\varvec{b}} \\&= ({\mathbf {I}}-({\mathbf {B}}_{p,\tau })^{k})({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^{-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}} \end{aligned} \end{aligned}$$
(41)

and under the condition (39), it is naturally deduced that

$$\begin{aligned} \begin{aligned} \lim _{k\rightarrow +\infty } ({\mathbf {I}}-({\mathbf {B}}_{p,\tau })^{k})({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^{-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}=({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^{-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}. \end{aligned} \end{aligned}$$
(42)

On the other hand, for the IIE (38), simple manipulations give rise to

$$\begin{aligned} \begin{aligned} {\mathbf {I}}+{\mathbf {T}}_j&=({\mathbf {I}}+{\mathbf {T}}_{j-1})^2=({\mathbf {I}} +{\mathbf {T}}_{0})^{2^j}=({\mathbf {B}}_{p,\tau })^{2^j},\\ {\varvec{x}}_{2^j}&=({\mathbf {I}}+({\mathbf {B}}_{p,\tau })^{2^{j-1}}){\varvec{x}}_{2^{j-1}} =[\Pi _{s=0}^{j-1} ({\mathbf {I}}+({\mathbf {B}}_{p,\tau })^{2^{s}})]{\varvec{x}}_{2^0} \\&=[{\mathbf {I}}+({\mathbf {B}}_{p,\tau })+({\mathbf {B}}_{p,\tau })^2 +\cdots + ({\mathbf {B}}_{p,\tau })^{2^{j}-1}]{\mathbf {C}}_{p,\tau }{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}\\&=({\mathbf {I}}-({\mathbf {B}}_{p,\tau })^{2^j})({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^{-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}} \end{aligned} \end{aligned}$$
(43)

which certainly tends to \(({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^{-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}}\) as \(j \rightarrow +\infty\) under the condition (39).

Remark 1

Turning to the IIL (36), if the linear Taylor truncation order is adopted in Eqs. (32) or (33), i.e., \(p=1\), then \({\mathbf {C}}_{p,\tau }=\tau {\mathbf {I}}\) and the IIL (36) turns out to be \({\varvec{x}}_{k}={\varvec{x}}_{k-1}+\tau {\mathbf {A}}^\mathrm{{T}} ({\varvec{b}}-{\mathbf {A}}{\varvec{x}}_{k-1})\), which coincides exactly with the Landweber method (22). Clearly, the proposed iterative integration method degenerates to the Landweber method under the linear Taylor expansion and the linear iteration. In other words, the proposed method can to some extent be viewed as an extension of the Landweber method.

Remark 2

Comparing (41) with (43), the final results of the IIL and the IIE are of the same form; only the ways of iterating differ: the IIL is a linear (k) iteration, while the IIE is an exponential (\(2^j\)) iteration. In a single iteration, the computational cost of the IIL is much lower than that of the IIE, because the IIL involves only matrix–vector multiplications, while the IIE requires matrix–matrix multiplications. However, the IIE converges much faster than the IIL, especially for large-scale ill-posed problems. Specifically, if the iteration number k is finite and much smaller than n, the IIL would be more efficient than the IIE; if too many iterations are required (\(k > n \ln k\)), the IIE is often preferred.

Remark 3

Focusing on the condition (39), it is easily obtained that for \(p\ge 0\)

$$\begin{aligned} \begin{aligned} \rho ({\mathbf {B}}_{p,\tau })&=\max \{|g_p(\tau \sigma ^2)|,\sigma \in \{\sigma _1,\sigma _2,\ldots ,\sigma _n\}\}; \\&g_p(t)=1-t+\frac{t^2}{2!}+\cdots + (-1)^p\frac{t^p}{p!}, \end{aligned} \end{aligned}$$
(44)

where \(g_p(t)\) is the pth order Taylor expansion of \(e^{-t}\). To go further, set \(r_p(t)=e^{-t}-g_p(t)\) and the following hold

$$\begin{aligned} r_p(0)=0,r'_p(t)=-r_{p-1}(t). \end{aligned}$$
(45)

Then, for \(t\ge 0\), one can have \(r_0(t) \le 0\) and then, recursively for \(k=1,2,\ldots\)

$$\begin{aligned} \begin{aligned}&r_{2k-2}(t) \le 0; \\&r'_{2k-1}(t)=-r_{2k-2}(t) \ge 0 \Rightarrow r_{2k-1}(t) \ge 0; \\&r'_{2k}(t)=-r_{2k-1}(t) \le 0 \Rightarrow r_{2k}(t) \le 0. \end{aligned} \end{aligned}$$
(46)

This gives rise to,

$$\begin{aligned} g_{2k-1}(t) \le e^{-t} \le g_{2k}(t), g_{2k}(t) > 0. \end{aligned}$$
(47)

Based on the results in Eq. (47), there are

$$\begin{aligned} g'_{2k+1}(t)=-g_{2k}(t) < 0; g''_{2k+2}(t)=g_{2k}(t) > 0 \end{aligned}$$
(48)

indicating that for \(t \in [0,\tau \sigma _1^2]\), \(g_{2k-1}(t)\) is a monotonically decreasing function and \(g_{2k}(t)\) is a strictly convex function. Together with the fact that \(g_{2k-1}(0)=g_{2k}(0)=1\) and Eqs. (44) and (48), the condition (39) turns into

$$\begin{aligned} |g_p(\tau \sigma _1^2)|<1 \Leftrightarrow \left\{ \begin{array}{ll} g_p(\tau \sigma _1^2)>-1,& \text {if }p=2k-1, \\ g_p(\tau \sigma _1^2)<1,& \text { if } p=2k. \end{array} \right. \end{aligned}$$
(49)

For brevity, let the positive real solution of \(g_p(t)=z, t>0\), for proper z be denoted by \(s_{p,z}\); for a schematic view of these solutions, refer to Fig. 1. Consequently, the condition (39) simplifies to

$$\begin{aligned} 0< \tau < \tau _p^\mathrm{{Upper}}:=\left\{ \begin{array}{ll} \frac{s_{p,-1}}{\sigma _1^2},& \text {for}\;p\;\text {odd}, \\ \frac{s_{p,1}}{\sigma _1^2},& \text {for}\;p\;\text {even}, \end{array} \right. \end{aligned}$$
(50)

and the greater p is, the larger the upper bound \(\tau _p^\mathrm{{Upper}}\). In this paper, \(\tau =0.8\tau _p^\mathrm{{Upper}}\) is used.
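For a given p and an estimate of \(\sigma _1\), the bound (50) can be evaluated by simple root-finding; the sketch below assumes a bracket that covers the first positive root for moderate p:

```python
import numpy as np
from math import factorial
from scipy.optimize import brentq

def g(p, t):
    """Taylor polynomial g_p(t) of exp(-t), Eq. (44)."""
    return sum((-t) ** j / factorial(j) for j in range(p + 1))

def tau_upper(p, sigma1):
    """Upper bound (50): the positive root s_{p,-1} of g_p(t) = -1 for odd p,
    or s_{p,1} of g_p(t) = 1 for even p, divided by sigma_1^2."""
    z = -1.0 if p % 2 else 1.0
    s = brentq(lambda t: g(p, t) - z, 1e-8, 50.0)  # bracket the first root
    return s / sigma1**2

# as used in this paper: tau = 0.8 * tau_upper(p, sigma1);
# p = 1 recovers s_{1,-1} = 2, i.e. the Landweber bound tau < 2 / sigma_1^2
```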

Fig. 1 Schematic of the function \(g_p(t)\)

Up till now, the iterative integration method with its two variants—the IIL and the IIE—has been developed for the inverse solution of the ill-posed problem (1). In what follows, the iterative integration method is shown to possess the regularization effect.

3.2 Regularization effect

This section starts with the ideal case (31) of the integration method with \(T_0=k\tau\),

$$\begin{aligned} {\varvec{x}}_\mathrm{{int}}=[{\mathbf {I}}- e^{-({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})k\tau }]({\mathbf {A}}^\mathrm{{T}} {\mathbf {A}})^{-1}{\mathbf {A}}^\mathrm{{T}}{\varvec{b}} \end{aligned}$$
(51)

and then, the filter function for the ideal case is found to be

$$\begin{aligned} f_\mathrm{{int}}(\sigma ^2,k)=1-e^{-\sigma ^2k\tau }. \end{aligned}$$
(52)

Turning to Eqs. (41) and (43), the filter functions of the two practical variants of the iterative integration method—the IIL and the IIE—are

$$\begin{aligned} \begin{aligned}&f_\mathrm{{IIL}}(\sigma ^2,k)=1-(g_p(\tau \sigma ^2))^k,\\&f_\mathrm{{IIE}}(\sigma ^2,j)=1-(g_p(\tau \sigma ^2))^{2^j}, \end{aligned} \end{aligned}$$
(53)

where the function \(g_p(\cdot )\) is defined in Eq. (44). Clearly, \(f_\mathrm{{int}}(\sigma ^2,k)\) satisfies the five conditions in Sect. 2.2, indicating that the ideal case is indeed a regularization method.

As for the two practical variants—the IIL and the IIE—they have nearly the same filter functions, except for the different exponents k and \(2^j\). Thus, without loss of generality, the focus is on the analysis of the filter function \(f_\mathrm{{IIL}}(\sigma ^2,k)\) of the IIL. First, considering the facts that \(g_p(0)=1\) and \(g'_p(0)=-1\), there is

$$\begin{aligned} f_\mathrm{{IIL}}(0,k)=0,\lim _{\sigma \rightarrow 0} \frac{f_\mathrm{{IIL}}(\sigma ^2,\alpha )}{\sigma }=\lim _{\sigma \rightarrow 0} \frac{-k(g_p(\tau \sigma ^2))^{k-1}g'_p(\tau \sigma ^2)2\tau \sigma }{1}=0 \end{aligned}$$
(54)

meaning that condition P1 is satisfied. Secondly, under the condition (50), there is \(|g_p(\tau \sigma ^2)|<1, \forall \sigma \in [\sigma _n,\sigma _1]\), and then condition P2 is verified for sufficiently large k. The other conditions P3, P4 and *P5 are obviously fulfilled by \(f_\mathrm{{IIL}}(\sigma ^2,k)\) with the critical parameter \(k_\mathrm{{cr}}=+\infty\). To conclude, the iterative integration method with its two variants has been shown to satisfy the five conditions in Sect. 2.2 and, as a result, it is a strict regularization method.
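For a visual check, the filter functions (52)-(53) can be evaluated on a grid of singular values; a small illustrative sketch (the parameter values are hypothetical) is:

```python
import numpy as np
from math import factorial

def g(p, t):
    """g_p(t), the p-th order Taylor polynomial of exp(-t), Eq. (44)."""
    return sum((-t) ** j / factorial(j) for j in range(p + 1))

sigma = np.logspace(-6, 1, 200)    # hypothetical singular-value range
tau, p, k, j = 1e-2, 3, 200, 8     # hypothetical parameters with tau*sigma_1^2 = 1

f_int = 1 - np.exp(-sigma**2 * k * tau)       # ideal filter (52)
f_iil = 1 - g(p, tau * sigma**2) ** k         # IIL filter (53)
f_iie = 1 - g(p, tau * sigma**2) ** (2**j)    # IIE filter (53)
# all three vanish as sigma -> 0 and approach 1 for large sigma (P1-P2)
```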

To close this subsection, the filter functions of the six regularization methods involved in this work—the Tikhonov regularization, the TSVD, the Landweber method, the CGLS, and the proposed IIL and IIE—are summarized in Table 1 for comparison. In the next subsection, how to select the regularization parameter for the iterative integration method is elaborated.

Table 1 The filter functions for various regularization methods

3.3 Selection of regularization parameter using the L-curve

In the above, the iterative integration method has been shown to possess the regularization effect, and the iteration number k for the IIL (resp. j for the IIE) is just the regularization parameter, which remains to be determined.

The L-curve has been well recognized as an effective way to select the regularization parameter [18]. For the IIL, the norm of the regularized solution and the norm of the residual should be calculated at each iteration k,

$$\begin{aligned} \rho ^2(k)=||{\mathbf {A}}{\varvec{x}}_k-{\varvec{b}}||_2^2,\quad \eta ^2(k)=||{\varvec{x}}_k||_2^2. \end{aligned}$$
(55)

Then, plot the log–log scale curve

$$\begin{aligned} (\log \rho (k),\log \eta (k)) \end{aligned}$$
(56)

and the optimal k is chosen at the L-corner of the curve. To find the L-corner, cubic spline curve fitting [18] or the triangle method [37] can be used. The above procedure works for the IIE by simply replacing \({\varvec{x}}_k\) in (55) with \({\varvec{x}}_{2^j}\). Note that the iteration should be continued a few iterations past the optimal choice in order to locate the L-corner of the curve.
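A simple discrete stand-in for these corner-finding schemes is to take the point of maximum curvature of the sampled log–log curve; the following sketch is one such variant (not the spline or triangle methods themselves):

```python
import numpy as np

def l_curve_corner(rho, eta):
    """Locate the L-corner as the point of maximum curvature of the discrete
    log-log curve (56); a simple substitute for the spline fitting [18]
    or the triangle method [37]."""
    x, y = np.log(rho), np.log(eta)
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    kappa = np.abs(dx * ddy - dy * ddx) / (dx**2 + dy**2) ** 1.5
    return int(np.argmax(kappa))

# usage with the iterates stored by the earlier Landweber sketch (an
# analogous history can be kept in the IIL/IIE):
# rho = np.array([np.linalg.norm(A @ xk - b) for xk in history])
# eta = np.array([np.linalg.norm(xk) for xk in history])
# k_opt = l_curve_corner(rho, eta) + 1
```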

4 Numerical examples and applications

In this section, three typical ill-posed problems arising from real applications are investigated to assess the effectiveness and efficiency of the proposed iterative integration regularization method, including the IIL and the IIE. The other regularization methods listed in Table 1 are also considered for comparison. Noise scaled by the root-mean-square (RMS) of the data is added as follows:

$$\begin{aligned} {\varvec{b}}={\varvec{b}}_\mathrm{{true}}+e \cdot RMS({\varvec{b}}_\mathrm{{true}})\cdot {\varvec{r}}, \end{aligned}$$
(57)

where \(\mathrm{{RMS}}({\varvec{b}})=\sqrt{(b_1^2+b_2^2+\cdots +b_m^2)/m}\), e is the noise level (e.g., \(e=1\%\)), and \({\varvec{r}}\) is a random vector of the same size as \({\varvec{b}}\) drawn from the standard normal distribution.
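A one-line sketch of the contamination rule (57) (the helper name is illustrative):

```python
import numpy as np

def add_rms_noise(b_true, e, seed=None):
    """Contaminate the data per Eq. (57): b = b_true + e * RMS(b_true) * r."""
    rng = np.random.default_rng(seed)
    rms = np.sqrt(np.mean(b_true**2))
    return b_true + e * rms * rng.standard_normal(b_true.shape)
```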

4.1 Fredholm integral equation of the first kind

Consider the Fredholm integral equation of the first kind [38]:

$$\begin{aligned} \int _a^b K(s,t)f(t)\mathrm{{d}}t=g(s), c \le s \le d, \end{aligned}$$
(58)

where K(s, t) is the kernel function, \(g(s), c \le s \le d\), is the known right-hand side and \(f(t), a\le t \le b\), is the unknown solution to be determined. The Fredholm integral equation is widely encountered in many applications, including image reconstruction [39], inverse Laplace transformation and so on. Usually, this equation is solved through the following discretization: let \(a=t_1<t_2<\cdots<t_n=b\), \(c=s_1<s_2 < \cdots <s_m=d\) be the (usually uniform) discretizations of the intervals [a, b] and [c, d]. Then, set \({\varvec{x}}=[f(t_1); \; f(t_2); \ldots; \;f(t_n)]\) and \({\varvec{b}}=[g(s_1); \; g(s_2);\ldots;\;g(s_m)]\); by the trapezoid integration rule, a discrete equation of the form (1) arises for the Fredholm integral Eq. (58) with

$$\begin{aligned} {\mathbf {A}}=\left( a_{ij}:=K(s_i,t_j)\Delta t_j\right) _{m \times n},\quad \Delta t_j=\left\{ \begin{array}{ll} (t_2-t_1)/2,& j=1, \\ (t_{j+1}-t_{j-1})/2,& j=2,\ldots ,n-1, \\ (t_n-t_{n-1})/2, &j=n. \end{array} \right. \end{aligned}$$
(59)

Herein, the particular case considered is a one-dimensional image reconstruction problem [13], in which the image is blurred by a known point-spread function. The kernel is the point-spread function of an infinitely long slit, given by

$$\begin{aligned} K(s,t)=[(\cos s+ \cos t)\cdot \text {sinc}(\sin s + \sin t)]^2, -\pi /2 \le s \le \pi /2 \end{aligned}$$
(60)

with \(\text {sinc}(x)=\frac{\sin \pi x}{\pi x}\), while the desired solution is set to

$$\begin{aligned} f(t)=\exp (-4(t-0.5)^2)+\exp (-4(t+0.5)^2), -\pi /2 \le t \le \pi /2. \end{aligned}$$
(61)

For the discretization, \(m=n=64\) equally spaced points in \([-\pi /2,\pi /2]\) are used for both t and s, and the system matrix \({\mathbf {A}}\) is obtained as in Eq. (59). To look further into the system matrix \({\mathbf {A}}\), its singular values are depicted in Fig. 2. As observed, many singular values are less than \(1\times 10^{-15}\), that is, clustering at zero. The condition number, i.e., the ratio of the largest singular value to the smallest, \(\sigma _1/\sigma _n=4.6 \times 10^{16}\), is huge, indicating that the problem is indeed ill-posed. In this example, the true solution \({\varvec{x}}_\mathrm{{true}}\) is obtained from (61), and the right-hand side is obtained by adding noise of level \(e=1\%\) to the true one, \({\varvec{b}}_\mathrm{{true}}={\mathbf {A}}{\varvec{x}}_\mathrm{{true}}\), in the form of (57). For simplicity, \(\mathbf {L=I}\), the identity matrix, is taken in the regularization formulation (3).
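For reference, a sketch assembling this test problem from Eqs. (59)-(61), with m = n = 64 as in the text (`fredholm_problem` is an illustrative name), is:

```python
import numpy as np

def fredholm_problem(n=64):
    """Discretize Eqs. (58)-(61) on n equally spaced points in
    [-pi/2, pi/2] by the trapezoid rule (59), with m = n."""
    t = np.linspace(-np.pi / 2, np.pi / 2, n)
    dt = np.full(n, t[1] - t[0])
    dt[0] /= 2.0
    dt[-1] /= 2.0
    S, T = np.meshgrid(t, t, indexing="ij")   # rows s_i, columns t_j
    # kernel (60); np.sinc(x) = sin(pi x)/(pi x) matches the text's sinc
    A = ((np.cos(S) + np.cos(T)) * np.sinc(np.sin(S) + np.sin(T))) ** 2 * dt
    x_true = np.exp(-4 * (t - 0.5) ** 2) + np.exp(-4 * (t + 0.5) ** 2)
    return A, x_true, A @ x_true

A, x_true, b_true = fredholm_problem()
# b = add_rms_noise(b_true, 0.01), per Eq. (57) with e = 1%
```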

Fig. 2 Schematic of the singular values of the system matrix for the Fredholm integral equation problem

The six regularization methods, abbreviated as listed in Table 1, are adopted to solve the ill-posed problem. Note that the Taylor truncation order for the two proposed variants of the iterative integration method—the IIL and the IIE—is first fixed at \(p=3\) (refer to Eq. (32)). The L-curve rule is used to select a proper regularization parameter, and the detailed L-curves for the six regularization methods are displayed in Fig. 3, where the regularization parameters selected at the L-corner are highlighted in red. As seen, among the iterative regularization methods, both the Landweber method and the IIL (\(p=3\)) require over 2500 iterations to reach the L-corner, while the CGLS and the IIE (\(p=3\)) arrive at the L-corner within only 12 iterations. Note that the Landweber method is a particular case of the IIL with Taylor truncation order \(p=1\), and the slow convergence of the IIL for this ill-posed problem is observed. In contrast, fast convergence is attained by the CGLS and the IIE (\(p=3\)). Except for the Landweber method and the IIL (\(p=3\)), the L-shape of the residual norm versus solution norm curve of the other regularization methods is quite clear, and selection at the L-corner is straightforward.

Fig. 3 L-curves of different regularization methods for the Fredholm integral equation problem: a Tik, b TSVD, c Land, d CGLS, e IIL (\(p=3\)) and f IIE (\(p=3\))

With the regularization parameters selected at the L-corner, the solutions of the six regularization methods are obtained as shown in Fig. 4. Clearly, all six regularization methods lead to a satisfactory solution of the Fredholm integral equation. To further quantify the performance of the six regularization methods, the relative error (RE) is calculated as

$$\begin{aligned} \text {RE}=\frac{||{\varvec{x}}_\mathrm{{reg}}-{\varvec{x}}_\mathrm{{true}}||_2}{||{\varvec{x}}_\mathrm{{true}}||_2} \times 100\%, \end{aligned}$$
(62)

with \({\varvec{x}}_\mathrm{{reg}}\) denoting the solution of a given regularization method; the computation time is also recorded. Detailed results are presented in Table 2. The relative errors of the six regularization methods are close to each other and all less than 8%, meaning that this Fredholm integral equation is well solved. Among them, the proposed IIE (\(p=3\)) gives the smallest relative error (RE = 5.88%); the effectiveness of the proposed iterative integration regularization method (including the IIE and the IIL) for the ill-posed Fredholm integral equation is thus verified. As for the computation time, the cost of the six regularization methods is at the same cheap level, because the system matrix \({\mathbf {A}}\) is only of small size \(m=n=64\); nonetheless, the iterative methods, including the CGLS and the IIE, turn out to be slightly more efficient. To further visualize the regularization effect of the six methods, the filter factors (see Table 1) at the different singular values are plotted in Fig. 5. Except for the CGLS, the other five methods share the same trend of the filter factors: the filter factors tend to zero for relatively small singular values (\(\sigma _i \le 1\times 10^{-3}\)) to filter out the undesired noise effect, tend to one for relatively large singular values (\(\sigma _j > 0.1\)) to keep the terms of the true solution, and lie almost monotonically within [0, 1]. Particular attention should be paid to the filter factors of the CGLS, where the filter factor reaches 1.8 at the 4th singular value, undesirably greater than 1; this indicates that the CGLS is not a strict regularization method. Despite this undesirable trend in its filter function, the CGLS still performs well for this Fredholm integral equation; the reason may be that the term corresponding to the 4th singular value is negligible in the true solution.

Fig. 4 Solutions of the six regularization methods for the Fredholm integral equation problem

Table 2 Results on relative error (RE), selected regularization parameter at the L-corner (Reg-Para) and time consumption (in seconds) of the six regularization methods for the Fredholm integral equation problem
Fig. 5 Filter factors of the six regularization methods at different singular values for the Fredholm integral equation problem

To conclude: for the proposed iterative integration regularization method, the IIL admits slow convergence, while fast convergence is usually attained by the IIE. Hence, the exponential iteration (IIE) is often preferred for carrying out the proposed integration regularization. The IIE is found to give the solution with the smallest relative error and, owing to their iterative nature, the IIE and the CGLS are slightly more efficient in this example than the direct regularization methods, including the Tikhonov regularization and the TSVD. Also noteworthy is that the regularization effect is guaranteed for the IIE, but may deviate for the CGLS.

In the above, the IIE is shown to perform well under the prescribed Taylor truncation order \(p=3\). To close this example, the dependence of the performance of the IIE on the Taylor truncation order p is explored. The detailed results are summarized in Table 3. It is observed that the performance under different truncation orders p is similar: the relative error lies within [5.86%, 6.98%], meaning that all cases give a satisfactory solution to the Fredholm integral equation; the computation time is in all cases near 0.20 s; and the number of iterations to reach the L-corner is no more than 13. That is to say, the effect of the order p on the performance of the IIE is not obvious in this example.

Table 3 Results on relative error (RE), time consumption (in seconds) and the number of iterations at the L-corner by the IIE with different truncation orders p for the Fredholm integral equation problem

4.2 Dynamic force reconstruction of a shear structure

Fig. 6 A ten-storey shear model with a dynamic force exerted on the first floor

Dynamic force reconstruction of a ten-storey shear structure from measured acceleration data is considered in this example. The shear structure model is presented in Fig. 6 and the model parameters are taken as: shear stiffness of every storey \(k_1=k_2= \cdots =k_{10}=1500\) kN/m and mass of every floor \(m_1=m_2= \cdots =m_{10}=3500\) kg. The stiffness matrix \({\mathbf {K}}\) and the mass matrix \({\mathbf {M}}\) are then

$$\begin{aligned} {\mathbf {M}}=\left( \begin{array}{cccc} m_1 &{} 0 &{} \cdots &{} 0 \\ 0 &{} m_2 &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} m_{10} \\ \end{array} \right) , {\mathbf {K}}=\left( \begin{array}{cccc} k_1+k_2 &{} -k_2 &{} \cdots &{} 0 \\ -k_2 &{} k_2+k_3 &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} k_{10} \\ \end{array} \right) . \end{aligned}$$

The Rayleigh damping \({\mathbf {C}}=a_1{\mathbf {M}}+a_2{\mathbf {K}}\) is assumed with the first two damping ratios being 1%, so that \(a_1=0.0463\) s\(^{-1}\) and \(a_2=1.625\times 10^{-3}\) s. The structure is initially at rest and a dynamic force with the following expression

$$\begin{aligned} F(t)=(0.5+3{e}^{-2t}\cos (20t)+\sin (10t)) \times 3500 \;\mathrm { N} \end{aligned}$$
(63)

is enforced at the first floor.
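For concreteness, a minimal sketch of assembling this model is given below; the matrices match the expressions above, while the time integrator used to simulate the accelerations is omitted since the paper does not prescribe one here.

```python
import numpy as np

n_storey = 10
k = 1.5e6 * np.ones(n_storey)   # shear stiffness 1500 kN/m = 1.5e6 N/m
m = 3.5e3 * np.ones(n_storey)   # floor mass 3500 kg

M = np.diag(m)

# tridiagonal shear-building stiffness matrix, last diagonal entry k_10
K = np.zeros((n_storey, n_storey))
for i in range(n_storey):
    K[i, i] = k[i] + (k[i + 1] if i + 1 < n_storey else 0.0)
    if i + 1 < n_storey:
        K[i, i + 1] = K[i + 1, i] = -k[i + 1]

# Rayleigh damping with the coefficients quoted in the text
a1, a2 = 0.0463, 1.625e-3
C = a1 * M + a2 * K

def F(t):  # dynamic force of Eq. (63), applied at the first floor
    return (0.5 + 3 * np.exp(-2 * t) * np.cos(20 * t) + np.sin(10 * t)) * 3500.0
```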

Fig. 7 Acceleration data with 2% noise for force reconstruction problem

Herein, the dynamic force is to be inversely reconstructed from the acceleration measurements at the 2nd, 6th and 10th floors. The acceleration data are first obtained from numerical simulation at the sampling rate of 500 Hz within the time duration [0, 2] s, and then noise of level \(e=2\%\) is added to get the measured data; for a schematic view of the measured data, refer to Fig. 7. Clearly, this force reconstruction problem falls into the general form [40, 41] of Eq. (1): the measured data constitute the right-hand side \({\varvec{b}}\), the force discretized at the same sampling frequency 500 Hz forms the unknown solution \({\varvec{x}}\), and the system matrix \({\mathbf {A}}\) of size (\(m=3003,n=1001\)) can be obtained as in the Appendix. Note that Refs. [40, 41] suggest that for continuous force reconstruction, the matrix \({\mathbf {L}}\) (see Eq. (3)) for regularization is better taken as the one in Eq. (4) with \(s=1\); such a choice enhances the continuity of the solution and is therefore adopted in this example. Under this circumstance, the regularization (3) is not in the standard form; nevertheless, the standard form can be easily obtained following the manipulations in Eqs. (5)–(7) with the transformed system matrix \(\bar{{\mathbf {A}}}\) (of size \(3003 \times 1000\)). Note that the conditioning of the force reconstruction problem is determined by the singular values of \(\bar{{\mathbf {A}}}\); these singular values are therefore depicted in Fig. 8. It is seen that except for the smallest singular value \(\sigma _{1000}=2.75\times 10^{-14}\), the singular values lie in the interval \([\sigma _{999}=5.47\times 10^{-7},\sigma _1=63.34]\). As a result, \(\bar{{\mathbf {A}}}\) is approximately rank-1 deficient and its condition number is \(2.35\times 10^{15}\), indicating that the force reconstruction herein is an ill-posed problem.
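The exact forms of Eqs. (4)–(7) are given earlier in the paper; as a hedged illustration, the sketch below assumes that \({\mathbf {L}}\) with \(s=1\) is the usual first-order finite-difference matrix of size \((n-1)\times n\) (an assumption consistent with the transformed size \(3003 \times 1000\)) and performs a simplified standard-form transformation via the pseudoinverse, ignoring the one-dimensional null space of \({\mathbf {L}}\) that Eqs. (5)–(7) treat properly.

```python
import numpy as np

def first_difference_matrix(n):
    """First-order difference matrix of size (n-1) x n,
    an assumed form for L in Eq. (4) with s = 1."""
    L = np.zeros((n - 1, n))
    for i in range(n - 1):
        L[i, i], L[i, i + 1] = -1.0, 1.0
    return L

def to_standard_form(A, L):
    """Simplified standard-form transformation A_bar = A L^+.
    This ignores the constant vector spanning the null space of L,
    which the paper's Eqs. (5)-(7) handle rigorously."""
    return A @ np.linalg.pinv(L)

n = 1001
L = first_difference_matrix(n)   # size 1000 x 1001; A_bar becomes 3003 x 1000
```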

Fig. 8 Schematic of the singular values of the system matrix for force reconstruction problem

Fig. 9 L-curves of different regularization methods for force reconstruction problem: a Tik, b TSVD, c Land, d CGLS, e IIL (\(p=3\)) and f IIE (\(p=1\))

The six regularization methods in Table 1 are used to solve this force reconstruction problem, where the truncation order is taken as \(p=3\) for the IIL so that it differs from the Landweber method, while it is first set to \(p=1\) for the IIE. The L-curve rule is adopted to determine the regularization parameters; the details are exhibited in Fig. 9. Then, under the regularization parameters selected at the L-corner, the force can be identified; the identification results are displayed in Fig. 10. To further quantify the performance, the relative error (RE) and the computation cost for the six regularization methods are given in Table 4. The above results and further analysis show that

  • The L-shape of the log–log residual norm vs. solution norm curve is quite apparent (see Fig. 9), especially for the Tikhonov regularization, the TSVD, the CGLS and the IIE. As a result, the selection of the regularization parameter at the L-corner is very straightforward.

  • The Landweber method takes 116.33 s to solve the ill-posed problem and the IIL (\(p=3\)) takes 39.80 s. These two methods turn out to be much more expensive than the other four regularization methods in terms of computation cost. This is mainly because the iterations required to arrive at the L-corner exceed 1000 for these two methods, indicating that very slow convergence is incurred by the Landweber method and the IIL for the ill-posed problem. Thus, the Landweber method and the IIL are often not recommended for ill-posed problems, unless proper preconditioning of the system matrix is introduced to dramatically reduce the condition number; this, however, is beyond the scope of this paper.

  • The solution given by the Tikhonov regularization admits the greatest relative error, 14.66%, so that the force reconstructed by the Tikhonov regularization deviates most from the true force (see Fig. 10). This may be caused by a not-so-proper selection of the candidate regularization parameters. In this sense, the Tikhonov regularization seems more cumbersome than the other regularization methods, because effort must be made to determine a proper set of candidate regularization parameters before the L-curve selection rule can proceed. For the other regularization methods, by contrast, the number of iterations or the truncation order of the SVD naturally serves as the regularization parameter.

  • Apart from the prohibitive computation cost of the Landweber method and the IIL, the computation cost of the other four regularization methods is almost at the same level, with time consumption between [3.17 s, 8.67 s]. Among them, the cost of the CGLS and the IIE is less than that of the Tikhonov regularization and the TSVD; this benefit stems from the advantage of iterative regularization methods over direct regularization methods. It is also seen that the proposed IIE takes less time than the CGLS; the reason may be that convergence is reached faster by the IIE (in only \(j=11\) iterations) than by the CGLS (in \(k=31\) iterations).

  • Except for the Tikhonov regularization, the other five regularization methods lead to satisfactory solutions of the force reconstruction problem, with relative errors all less than 9%; among them, the proposed IIE performs best.

  • To see the filter effect of the six regularization methods, the filter factors under different singular values are presented in Fig. 11. All of them well verify the conditions P1–P4 in Sect. 2.2, indicating that these six methods indeed possess the regularization effect for this example.

  • To further see the effect of the Taylor truncation order p on the proposed IIE, different values of p are selected for the IIE; the results are presented in Table 5. It is observed that for different values of p, the number of iterations at the L-corner is the same, and the computation cost as well as the relative error are quite close to each other. The performance of the proposed iterative integration method, including the IIE, is thus almost independent of the choice of the Taylor truncation order p. The reason for this approximate independence on p is as follows. The \(\ell ^2\) regularization attempts to filter out the effect of relatively small singular values and to preserve the effect of relatively large singular values. Let \(\sigma _s\) be the smallest of the relatively large singular values; then, to keep the effect of \(\sigma _s\), there should hold \(1-(g_p(\tau \sigma _s^2))^{2^j} \rightarrow 1\) or \((g_p(\tau \sigma _s^2))^{2^j} \rightarrow 0\) for some proper j, by referring to the filter functions in Table 1. Note that \(\sigma _s\) is often very small with respect to \(\sigma _1\), i.e., \(\sigma _s/\sigma _1 \ll 1\); then, by selecting \(\tau\) in the way of Remark 3, there is \(0 < \tau \sigma _s^2 \ll 1\) and \(g_p(\tau \sigma _s^2) \rightarrow 1\). Under this circumstance, the higher-order terms in \(g_p(\tau \sigma _s^2)\) become negligible and \(g_p(\tau \sigma _s^2) \approx g_1(\tau \sigma _s^2)\) for \(p \ge 2\); this explains why different choices of p often lead to similar regularization performance.

  • Following the above point, \(1-g_p(\tau \sigma _s^2) \rightarrow 0\) and then \((g_p(\tau \sigma _s^2))^{k} \approx 1-k(1-g_p(\tau \sigma _s^2))\) while \((g_p(\tau \sigma _s^2))^{2^j} \approx 1-2^j(1-g_p(\tau \sigma _s^2))\) for some proper k, j. This to some extent implies that the IIL admits linear (with respect to k) convergence, while the IIE admits exponential (with respect to j) convergence; the proposed IIE is thus reasonably deduced to converge very fast, e.g., in only 11 iterations for this force reconstruction problem. A numerical check of the last two points is sketched after this list.
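The following minimal numerical check (again assuming \(g_p\) to be the order-p Taylor truncation of \(e^{-x}\), an assumption for illustration) confirms both points: \(g_p(x)\) is nearly independent of p for \(x \ll 1\), and far fewer exponential squarings than linear steps are needed to drive the filter term to zero.

```python
import math

def g(x, p):
    """Assumed order-p Taylor truncation of exp(-x)."""
    return sum((-x)**i / math.factorial(i) for i in range(p + 1))

x = 1e-4                                    # tau * sigma_s^2 << 1
print([g(x, p) for p in (1, 2, 3, 4)])      # all ~ 0.9999: order p hardly matters

tol, base = 1e-3, g(x, 1)
k = math.ceil(math.log(tol) / math.log(base))   # linear iteration count (IIL)
j = math.ceil(math.log2(k))                     # squarings needed (IIE)
print(k, j)   # roughly 6.9e4 linear steps vs. 17 squarings
```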

Fig. 10 Identified dynamic forces by the six regularization methods for force reconstruction problem

Table 4 Results on relative error (RE), selected regularization parameter at the L-corner (Reg-Para) and time consumption (in seconds) by the six regularization methods for force reconstruction problem
Fig. 11 Filter factors of the six regularization methods under different singular values for force reconstruction problem

Table 5 Results on relative error (RE), time consumption (in seconds) and the number of iterations at the L-corner by the IIE with different truncation orders p for force reconstruction problem

4.3 Cauchy problem for steady-state heat conduction

Fig. 12 Cauchy problem for steady-state heat conduction on a circular domain \(\Omega\)

Consider the steady-state heat conduction in a 2D circular domain \(\Omega\) (see Fig. 12). The conductivity coefficient for this domain is set to unity for simplicity and there is no heat source inside the domain. In this way, the governing equation for steady-state heat conduction is found to be

$$\begin{aligned} -\Delta u = 0, \mathrm {in}\; \Omega \end{aligned}$$
(64)

where \({\varvec{y}}=[x,y] \in \Omega\) and \(u=u({\varvec{y}})\) denotes the temperature field. Let \(\Gamma\) be a portion of the whole boundary, i.e., \(\Gamma \subset \partial \Omega\); the Cauchy problem is then described as follows: given the Dirichlet/temperature and Neumann/heat-flux data on the known boundary \(\Gamma\), with \({\varvec{n}}\) being the outward normal vector,

$$\begin{aligned} u|_{\Gamma } = \varphi , \frac{\partial u}{\partial {\varvec{n}}}|_{\Gamma }=\psi , \end{aligned}$$
(65)

determine the boundary condition (or temperature) on the unknown boundary \(\Gamma _{un}=\partial \Omega \setminus \Gamma\).

A usual way to solve the Cauchy problem is the method of fundamental solutions (MFS) [8]. The basic idea is that, when there is no heat source inside the domain, the temperature inside the domain can be approximately generated by a series of n point heat sources properly located at \({\varvec{z}}_1,{\varvec{z}}_2,\ldots ,{\varvec{z}}_n\) outside the domain. For a single point heat source at \({\varvec{z}}_j\) with unit strength, the resulting temperature is just the fundamental solution (or Green's function)

$$\begin{aligned} G({\varvec{y}},{\varvec{z}}_j)=\frac{1}{2\pi } \ln ||{\varvec{y}}-{\varvec{z}}_j||. \end{aligned}$$
(66)

Assume that the strengths of the n point heat sources are \(f_1,f_2,\ldots ,f_n\), collected as \({\varvec{x}}=[f_1;f_2;\ldots ;f_n]\); this leads to the following temperature solution

$$\begin{aligned} u({\varvec{y}})=\sum _{j=1}^n G({\varvec{y}},{\varvec{z}}_j)f_j. \end{aligned}$$
(67)

To obtain the eventual temperature field, the strengths \({\varvec{x}}\) of the heat sources are to be determined from the temperature and flux data on the known boundary \(\Gamma\). To this end, the known boundary data (65) are collected at m points \({\varvec{y}}_1,{\varvec{y}}_2,\ldots ,{\varvec{y}}_m \in \Gamma\), yielding for \(i=1,2,\ldots ,m\)

$$\begin{aligned} \begin{aligned}&u({\varvec{y}}_i)=\sum _{j=1}^n G({\varvec{y}}_i,{\varvec{z}}_j)f_j=\varphi ({\varvec{y}}_i), \\&\frac{\partial u({\varvec{y}}_i)}{\partial {\varvec{n}}}=\sum _{j=1}^n \frac{\partial G({\varvec{y}}_i,{\varvec{z}}_j)}{\partial {\varvec{n}}}f_j=\psi ({\varvec{y}}_i) \end{aligned} \end{aligned}$$
(68)

and this completely coincides with the linear form in Eq. (1):

$$\begin{aligned} \begin{aligned}&{\mathbf {A}} {\varvec{x}}={\varvec{b}},\\&{\mathbf {A}}=\left( \begin{array}{c} {\mathbf {A}}^{(1)} \\ {\mathbf {A}}^{(2)} \\ \end{array} \right) ,{\mathbf {A}}^{(1)}=\left( G({\varvec{y}}_i,{\varvec{z}}_j)\right) _{m \times n},{\mathbf {A}}^{(2)}=\left( \frac{\partial G({\varvec{y}}_i,{\varvec{z}}_j)}{\partial {\varvec{n}}}\right) _{m \times n}, \\&{\varvec{b}}=\left( \begin{array}{c} {\varvec{b}}^{(1)} \\ {\varvec{b}}^{(2)} \\ \end{array} \right) ,{\varvec{b}}^{(1)}=\left( \varphi ({\varvec{y}}_i) \right) _m,{\varvec{b}}^{(2)}=\left( \psi ({\varvec{y}}_i) \right) _m, \end{aligned} \end{aligned}$$
(69)

where the system matrix \({\mathbf {A}}\) is of size \(2m \times n\).
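To make the assembly of Eq. (69) concrete, a minimal sketch for the circular geometry specified in the next paragraph is given below. On the unit circle, the outward normal at a boundary point equals the point itself, so \(\partial G/\partial {\varvec{n}} = ({\varvec{y}}-{\varvec{z}})\cdot {\varvec{y}}/(2\pi \Vert {\varvec{y}}-{\varvec{z}}\Vert ^2)\).

```python
import numpy as np

m, n, R = 101, 100, 2.0                        # collocation/source counts, source radius
theta_c = np.linspace(0.0, np.pi, m)           # known boundary: theta in [0, pi]
Y = np.stack([np.cos(theta_c), np.sin(theta_c)], axis=1)      # collocation points, r = 1
theta_s = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
Z = R * np.stack([np.cos(theta_s), np.sin(theta_s)], axis=1)  # source points, r = R

D = Y[:, None, :] - Z[None, :, :]              # y_i - z_j, shape (m, n, 2)
r2 = np.sum(D**2, axis=2)

A1 = np.log(np.sqrt(r2)) / (2 * np.pi)         # G(y_i, z_j), Eq. (66)
A2 = np.einsum('ijk,ik->ij', D, Y) / r2 / (2 * np.pi)   # dG/dn with n = y_i
A = np.vstack([A1, A2])                        # system matrix of Eq. (69), size 2m x n
```

The right-hand side \({\varvec{b}}\) follows in the same way by evaluating Eq. (70) and its normal derivative at the collocation points.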

Herein, the Cauchy problem is specified as follows (refer to Fig. 12): the circular domain \(\Omega\) is of radius \(r=1\); the known boundary \(\Gamma\) is the part of the boundary corresponding to the angle \(\theta \in [0,\pi ]\), on which \(m=101\) points are uniformly selected to collect the temperature and flux data; and \(n=100\) source points are uniformly chosen on a circle of radius \(R=2\) outside the domain. With these settings, the system matrix \({\mathbf {A}}\) can be obtained. In this example, the true temperature field is given as

$$\begin{aligned} u(x,y)=x^3-3xy^2+e^{2y}\sin (2x)-e^x\cos (y) \end{aligned}$$
(70)

so that the right-hand side \({\varvec{b}}\) is obtainable with the noise level \(e=1\%\) (see Eq. (57)). To gain more insight into this Cauchy problem, the singular values of the matrix \({\mathbf {A}}\) are plotted in Fig. 13. It is seen that many singular values, approaching \(1\times 10^{-15}\), cluster at zero, and the condition number, \(3.04\times 10^{16}\), is huge; this means that the Cauchy problem herein is indeed an ill-posed problem. Four regularization methods, namely the Tikhonov regularization, the TSVD, the CGLS and the proposed IIE, are used to solve this ill-posed Cauchy problem, while the other two methods, the Landweber method and the IIL, are excluded because they suffer from very slow convergence. For the IIE, the Taylor truncation order is simply fixed to \(p=1\) because the performance of the IIE has been shown to be nearly independent of p. Again, the L-curve rule is applied to determine the regularization parameters, the details of which are depicted in Fig. 14; then, after selecting the regularization parameters at the L-corners, the regularized solutions can be obtained.
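The L-curve rule itself admits a compact generic implementation: for each candidate regularization parameter, record the log residual norm and log solution norm, then pick the parameter of maximum curvature. The sketch below uses Tikhonov filtering for definiteness; it is a generic corner finder, not the paper's exact procedure.

```python
import numpy as np

def l_curve_corner(A, b, lams):
    """Pick the Tikhonov parameter at the L-corner by maximum curvature
    of the (log residual norm, log solution norm) curve (generic sketch).
    lams: 1-D array of candidate regularization parameters."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    rho, eta = [], []
    for lam in lams:
        f = s**2 / (s**2 + lam**2)              # Tikhonov filter factors
        x = Vt.T @ (f * beta / s)               # regularized solution
        rho.append(np.log(np.linalg.norm(A @ x - b)))
        eta.append(np.log(np.linalg.norm(x)))
    rho, eta = np.array(rho), np.array(eta)
    # discrete curvature of the parametric curve (rho(lam), eta(lam))
    dr, de = np.gradient(rho), np.gradient(eta)
    ddr, dde = np.gradient(dr), np.gradient(de)
    kappa = (dr * dde - ddr * de) / (dr**2 + de**2)**1.5
    return lams[np.argmax(np.abs(kappa))]

# usage: lam_opt = l_curve_corner(A, b, np.logspace(-8, 0, 50))
```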

Note that with the regularized solution obtained, the temperature field over the whole domain \(\Omega\) is acquired by resorting to Eq. (67), and thereafter, the temperature on the unknown boundary \(\Gamma _{un}\), corresponding to the angle \(\theta \in [\pi ,2\pi ]\), is reconstructed; for a schematic impression of the identified temperature on \(\Gamma _{un}\), refer to Fig. 15. As observed, all four regularization methods lead to satisfactory reconstruction of the temperature on \(\Gamma _{un}\) and hence a good solution of the Cauchy problem. To quantify the performance of the four regularization methods, the relative error (RE) between the identified and true temperatures at 101 uniformly distributed points on \(\Gamma _{un}\) and the time consumption are calculated and listed in Table 6. Though all results are satisfactory, the proposed IIE gives the most accurate reconstruction of the temperature on \(\Gamma _{un}\), its RE of 3.98% being the smallest. As regards the computation cost, the TSVD takes the least computation time while the Tikhonov regularization takes the most; this is mainly because, for this problem, the number of candidate regularization parameters for the Tikhonov regularization is much greater than the required order \(k=15\) for the TSVD. As for the two iterative regularization methods, the CGLS and the IIE, their time consumption is quite close and convergence is very fast, within \(k=23\) iterations for the CGLS and \(j=19\) iterations for the IIE. Consequently, the effectiveness and efficiency of the proposed IIE for the solution of the ill-posed Cauchy problem are verified.

Fig. 13 Schematic of the singular values of the system matrix for Cauchy problem

Fig. 14 L-curves of different regularization methods for Cauchy problem: a Tik, b TSVD, c CGLS and d IIE (\(p=1\))

Fig. 15 Identified temperature on the unknown boundary \(\Gamma _{un}\) by different regularization methods for Cauchy problem

Table 6 Results on relative error (RE), selected regularization parameter at the L-corner (Reg-Para) and time consumption (in seconds) by different regularization methods for Cauchy problem

5 Conclusions

A novel iterative integration regularization method has been proposed in this paper for the robust and efficient solution of ill-posed problems. It is mainly motivated by the fact that the inversion of a positive definite matrix can be alternatively represented in an integral form. Two practical variants, the IIL and the IIE, which compute the integral using linear and exponential iterations respectively, are developed. Detailed analysis reveals that the proposed method possesses a guaranteed regularization effect, i.e., it can filter out the undesired effect of the relatively small singular values while keeping the effect of the relatively large singular values. To verify the performance of the proposed method, three typical ill-posed problems, including the Fredholm integral equation, the force reconstruction and the Cauchy problem, are studied along with comparisons to some usual direct and iterative regularization methods, and the results show that

  • For the proposed iterative integration regularization method, the exponential iteration (the IIE) is shown in Table 4 to admit much faster convergence and more efficient computation than the linear iteration (the IIL); therefore, the IIE is preferred in the proposed iterative integration method.

  • The proposed IIE is found able to effectively and efficiently solve all three ill-posed problems.

  • Compared to typical direct regularization methods, including the Tikhonov regularization and the TSVD, the proposed IIE usually turns out to be more efficient, as the results in Table 4 show; this is mainly due to the iterative nature of the proposed IIE.

  • The Landweber method, as an iterative regularization method and a particular case of the IIL, is shown to admit very slow convergence and hence prohibitively high computation cost for ill-posed problems. In other words, the proposed IIE is much more efficient than the Landweber method.

  • Focusing on the proposed IIE and the CGLS: both methods are iterative; however, the IIE is stationary while the CGLS is non-stationary, its filter factors depending on the right-hand side. In all three examples, the relative error and the computation cost of the CGLS and the IIE are at the same level; that is to say, the IIE can perform as well as the CGLS. Nevertheless, the main advantage of the proposed IIE over the CGLS resides in the fact that the IIE has a guaranteed regularization effect, while the CGLS may not [25].