Stochastic Inverse Problems: Models and Metrics

Sabbagh, Harold A.; Kim Murphy, R.; Sabbagh, Elias H.; Zhou, Liming; Wincheski, Russell

doi:10.1007/978-3-030-67956-9_6

Part of the book series: Scientific Computation ((SCIENTCOMP))

Abstract

Over the past 2 years, we have been developing a theory of uncertainty quantification and propagation that is computationally feasible with large numbers of unknowns. We have applied it to a problem of characterizing the eddy-current response of a shot-peened surface, where the surface is modeled as a one-dimensional random conductivity field with a known covariance function. We are currently extending the model to more general materials characterization problems, such as modeling two-dimensional random anisotropic grain noise in titanium alloys. In this case, we assume the existence of a (two-dimensional) covariance function for the random distribution of Euler angles that define the orientation of each crystallite within the material.

1 Introducing the Problem

Over the past 2 years, we have been developing a theory of uncertainty quantification and propagation that is computationally feasible with large numbers of unknowns. We have applied it to a problem of characterizing the eddy-current response of a shot-peened surface, where the surface is modeled as a one-dimensional random conductivity field with a known covariance function. We are currently extending the model to more general materials characterization problems, such as modeling two-dimensional random anisotropic grain noise in titanium alloys. In this case, we assume the existence of a (two-dimensional) covariance function for the random distribution of Euler angles that define the orientation of each crystallite within the material.

With this background, we want to develop a theory of stochastic inverse problems for more traditional eddy-current NDE flaw characterization and sizing. Instead of a random material, we assume that the flaw can be characterized as a random process. That this is a reasonable approach is suggested by Fig. 6.1, which shows the typical shape of fatigue-crack growth progression in cold-worked fastener holes. Clearly, the ensemble of cracks cannot be modeled by a simple canonical shape with three parameters (length, width, and height), so we will need to invoke a stochastic model for analyzing such cracks.

With such a stochastic model, we can draw parallels between ‘probability of detection’ (POD) and ‘likelihood of inversion’ (LOI). In the former, we are given a flaw, and ask ourselves, ‘Can we detect it, and what are the metrics that measure our success?’ In the latter, we are given data, and ask ourselves, ‘Can we associate a flaw with them, and what are the metrics that measure our success?’

The stochastic model will be described later, but first we will develop some background tools that are currently resident in VIC-3D®, and will be the basis of our stochastic computational model.

2 NLSE: Nonlinear Least-Squares Parameter Estimation

Let

$$\displaystyle \begin{aligned} Z=g(p_1,\dots,p_N,f)\ , {} \end{aligned} $$

(6.1)

where $p_1, \dots, p_N$ are the $N$ parameters of interest, and $f$ is a control parameter at which the impedance, $Z$, is measured. The control parameter can be frequency, scan position, lift-off, etc. It is, of course, known; it is not one of the parameters to be determined. To be explicit during our initial discussion of the theory, we will call $f$ ‘frequency.’

In order to determine $p_1, \dots, p_N$, we measure $Z$ at $M$ frequencies, $f_1, \dots, f_M$, where $M > N$:

$$\displaystyle \begin{aligned} \begin{array}{rcl} Z_1& =&\displaystyle g(p_1,\dots,p_N,f_1) \\ & \vdots&\displaystyle \\ Z_M& =&\displaystyle g(p_1,\dots,p_N,f_M)\ . {} \end{array} \end{aligned} $$

(6.2)

The right-hand side of (6.2) is computed by applying the volume-integral code to a model of the problem, usually at a discrete number of values of the vector, p, forming a multidimensional interpolation grid.

Because the problem is nonlinear, we use a Gauss-Newton iteration scheme to perform the inversion. First, we decompose (6.2) into its real and imaginary parts, thereby doubling the number of equations (we assume that $p_1, \dots, p_N$ are real). Then we use the linear approximation to the resistance, $R_i$, and reactance, $X_i$, at the $i$th frequency:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left[\begin{array}{c}R_1\\ X_1\\ \vdots\\ R_M\\ X_M\end{array}\right] \approx \left[\begin{array}{c}R_1(p_1^{(q)},\dots,p_N^{(q)})\\ \\ X_1(p_1^{(q)},\dots,p_N^{(q)})\\ \vdots\\ R_M(p_1^{(q)},\dots,p_N^{(q)})\\ \\ X_M(p_1^{(q)},\dots,p_N^{(q)})\end{array}\right] + \left[\begin{array}{ccc}{\displaystyle\frac{\partial R_1}{\partial p_1}}&\cdots&{\displaystyle\frac{\partial R_1}{\partial p_N}}\\ &&\\ {\displaystyle\frac{\partial X_1}{\partial p_1}}&\cdots&{\displaystyle\frac{\partial X_1}{\partial p_N}}\\ &\vdots&\\ {\displaystyle\frac{\partial R_M}{\partial p_1}}&\cdots&{\displaystyle\frac{\partial R_M}{\partial p_N}}\\ &&\\ {\displaystyle\frac{\partial X_M}{\partial p_1}}&\cdots&{\displaystyle\frac{\partial X_M}{\partial p_N}}\end{array} \right]_{(p_1^{(q)},\dots,p_N^{(q)})} \hspace{-0.5in}\left[\begin{array}{c}p_1-p_1^{(q)}\\ \vdots\\ p_N-p_N^{(q)}\end{array}\right]\ ,\qquad {} \end{array} \end{aligned} $$

(6.3)

where the superscript (q) denotes the qth iteration, and the partial derivatives are computed numerically by the software. The left side of (6.3) is taken to be the measured values of resistance and reactance. We rewrite (6.3) as

$$\displaystyle \begin{aligned} 0\approx r+Jp\ , {} \end{aligned} $$

(6.4)

where $r$ is the $2M$-vector of residuals (the model values at the current iterate minus the measured values), $J$ is the $2M \times N$ Jacobian matrix of derivatives, and $p$ is the $N$-dimensional correction vector, that is, the difference $p - p^{(q)}$ appearing in (6.3). From here on, we denote the vector of unknowns by $x$. Equation (6.4) is solved in a least-squares manner starting with an initial value, $(x_1^{(0)}, \dots ,x_N^{(0)})$, for the vector of unknowns, and then continuing by replacing the current vector with the updated vector $(x_1^{(q)},\dots , x_N^{(q)})$ that is obtained from (6.3), until convergence occurs.
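The iteration described by (6.3) and (6.4) can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the NLSE implementation: the function `model` is a hypothetical two-parameter stand-in for the volume-integral computation of $g$, the Jacobian is formed by forward differences, and the least-squares step is solved through the normal equations, which is adequate only for small, well-conditioned problems.

```python
import math

def model(x, f):
    """Hypothetical stand-in for g in (6.1): returns (R, X) at control value f."""
    a, b = x
    return (a * math.exp(-b * f), a * b * f)

def residuals(x, freqs, measured):
    """Stack model-minus-measured R and X values into a 2M-vector."""
    r = []
    for f, (r_meas, x_meas) in zip(freqs, measured):
        r_mod, x_mod = model(x, f)
        r.extend([r_mod - r_meas, x_mod - x_meas])
    return r

def jacobian(x, freqs, measured, h=1e-6):
    """Forward-difference 2M x N Jacobian of the residual vector."""
    r0 = residuals(x, freqs, measured)
    cols = []
    for n in range(len(x)):
        xp = list(x)
        xp[n] += h
        rp = residuals(xp, freqs, measured)
        cols.append([(a - b) / h for a, b in zip(rp, r0)])
    return [list(row) for row in zip(*cols)]  # transpose columns into rows

def solve(a_mat, b_vec):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b_vec)
    m = [row[:] + [b_vec[i]] for i, row in enumerate(a_mat)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[piv] = m[piv], m[k]
        for i in range(k + 1, n):
            fac = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= fac * m[k][j]
    sol = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = s / m[i][i]
    return sol

def gauss_newton(x0, freqs, measured, iters=20, tol=1e-10):
    """Repeat the least-squares solution of 0 ~ r + J p and update x by p."""
    x = list(x0)
    n = len(x)
    for _ in range(iters):
        r = residuals(x, freqs, measured)
        jac = jacobian(x, freqs, measured)
        jtj = [[sum(row[i] * row[j] for row in jac) for j in range(n)]
               for i in range(n)]
        jtr = [sum(row[i] * ri for row, ri in zip(jac, r)) for i in range(n)]
        p = solve(jtj, [-v for v in jtr])  # normal equations: J^T J p = -J^T r
        x = [xi + pi for xi, pi in zip(x, p)]
        if max(abs(pi) for pi in p) < tol:
            break
    return x

# Synthetic "measurements" generated from known parameters (M = 3 > N = 2).
freqs = [0.5, 1.0, 1.5]
true_x = [2.0, 0.7]
measured = [model(true_x, f) for f in freqs]
estimate = gauss_newton([1.5, 0.5], freqs, measured)
```

Because the synthetic data are noise-free, the residual vanishes at the solution and the iteration recovers the generating parameters from the nearby starting point. With measured data the residual norm at convergence is nonzero, which is what makes the metrics developed next meaningful.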

We are interested in determining a bound for the sensitivity of the residual norm to changes in some linear combination of the parameters. Given an 𝜖 > 0 and a unit vector, v, the problem is to determine a sensitivity (upper) bound, σ, such that

$$\displaystyle \begin{aligned} \Vert r(x^*+\sigma v)\Vert\le(1+\epsilon)\Vert r(x^*)\Vert\ . {} \end{aligned} $$

(6.5)

We will derive an estimate of σ. Equation (6.5) is equivalent to

$$\displaystyle \begin{aligned} \Vert r(x^*+\sigma v)\Vert-\Vert r(x^*)\Vert \le\epsilon\Vert r(x^*)\Vert\ . {} \end{aligned} $$

(6.6)

The left-hand side of (6.6) can be approximated by its Taylor expansion to second order in $\sigma$:

$$\displaystyle \begin{aligned} \Vert r(x^*+\sigma v)\Vert-\Vert r(x^*)\Vert \approx \sigma v\cdot\nabla\Vert r(x^*)\Vert\ + {\displaystyle\frac{\sigma^2}{2}}\sum_{i,j} {\displaystyle\frac{\partial^2\Vert r(x)\Vert}{\partial x_j \partial x_i}}\vert_{x^*}v_iv_j\ , {} \end{aligned} $$

(6.7)

where $\nabla$ is the gradient operator in $N$-dimensional space. Even though the gradient vanishes at the minimum point, we will compute it to get the algebra started:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \nabla\Vert r(x)\Vert &=& \nabla\left[f_1^2(x) + f_2^2(x)+\cdots+f_{2M}^2(x)\right]^{1/2} \\ \\ &=&{\displaystyle\frac{1}{\Vert r(x)\Vert}}\left[\begin{array}{c} f_1{\displaystyle\frac{\partial f_1}{\partial x_1}}+\cdots+f_{2M}{\displaystyle\frac{\partial f_{2M}}{\partial x_1}}\\ \vdots\\ f_1{\displaystyle\frac{\partial f_1}{\partial x_N}}+\cdots+f_{2M}{\displaystyle\frac{\partial f_{2M}}{\partial x_N}}\end{array}\right]^T \\ \\ &=&{\displaystyle\frac{r(x)^T}{\Vert r(x)\Vert}}\left[\begin{array}{ccc} {\displaystyle\frac{\partial f_1}{\partial x_1}}&\cdots&{\displaystyle\frac{\partial f_1}{\partial x_N}}\\ &\vdots&\\ {\displaystyle\frac{\partial f_{2M}}{\partial x_1}}&\cdots&{\displaystyle\frac{\partial f_{2M}}{\partial x_N}}\end{array}\right] \\ \\ &=&e^T(x)\cdot J\ , {} \end{array} \end{aligned} $$

(6.8)

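where the superscript $T$ denotes the transpose of a matrix (or vector), $f_\alpha(x)$, $\alpha = 1, \dots, 2M$, denotes the $\alpha$th component of the residual vector $r(x)$ (not to be confused with the control parameter $f$ of (6.1)), and $e(x) = r(x)/\Vert r(x)\Vert$ is a unit vector.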

The second derivative that we want is the gradient of (6.8):

$$\displaystyle \begin{aligned} \begin{array}{rcl} \nabla\nabla\Vert r(x)\Vert &=&-{\displaystyle\frac{\nabla\Vert r(x)\Vert}{\Vert r(x)\Vert^2}}\left[\begin{array}{c} f_1{\displaystyle\frac{\partial f_1}{\partial x_1}}+\cdots+f_{2M}{\displaystyle\frac{\partial f_{2M}}{\partial x_1}}\\ \vdots\\ f_1{\displaystyle\frac{\partial f_1}{\partial x_N}}+\cdots+f_{2M}{\displaystyle\frac{\partial f_{2M}}{\partial x_N}}\end{array}\right]^T \\ &+&{\displaystyle\frac{1}{\Vert r(x)\Vert}}\nabla\left[\begin{array}{c} f_1{\displaystyle\frac{\partial f_1}{\partial x_1}}+\cdots+f_{2M}{\displaystyle\frac{\partial f_{2M}}{\partial x_1}}\\ \vdots\\ f_1{\displaystyle\frac{\partial f_1}{\partial x_N}}+\cdots+f_{2M}{\displaystyle\frac{\partial f_{2M}}{\partial x_N}}\end{array}\right]^T\ . {} \end{array} \end{aligned} $$

(6.9)

Before going further, we can immediately drop the first term in (6.9) because the gradient of the norm vanishes at the solution $x^*$. Thus, (6.9) becomes, using index notation,

$$\displaystyle \begin{aligned} \begin{array}{rcl} \nabla\nabla\Vert r(x)\Vert& =&\displaystyle {\displaystyle\frac{1}{\Vert r(x)\Vert}}{\displaystyle\frac{\partial }{\partial x_j}}\left[f_1{\displaystyle\frac{\partial f_1}{\partial x_i}}+\cdots+f_{2M}{\displaystyle\frac{\partial f_{2M}}{\partial x_i}}\right]\ ,\ \ i,j=1,\ldots,N \\ & =&\displaystyle {\displaystyle\frac{1}{\Vert r(x)\Vert}}\sum_{\alpha}\left[{\displaystyle\frac{\partial f_\alpha}{\partial x_j}}{\displaystyle\frac{\partial f_\alpha}{\partial x_i}}+f_\alpha{\displaystyle\frac{\partial^2f_\alpha}{\partial x_j\partial x_i}}\right]\ ,\ \alpha=1,\ldots,2M\ . {} \end{array} \end{aligned} $$

(6.10)

Following [88, page 523], we discard the second-derivative term in (6.10) by arguing that the residual vector for a good model fit should be small, which would make the second-derivative term small. Furthermore, it is likely that the residual vector should have terms that are uncorrelated with each other and with the model, thus tending to cancel the second-derivative terms when summed over $\alpha$. We will call (6.10) the first-order curvature tensor, $\varGamma_{ij}$, of the mapping (or deformation) of the parameter space, $\{x_i\}$, into the model-measurement space. If we call the $i$th column of the Jacobian matrix $c_i$, then it follows from (6.10) that

$$\displaystyle \begin{aligned} \varGamma_{ij}(x^*)={\displaystyle\frac{c_i(x^*)\cdot c_j(x^*)}{\Vert r(x^*)\Vert}}\ , {} \end{aligned} $$

(6.11)

where we are ignoring the second-derivative term in (6.10).

Digression on Computing $\varGamma_{ij}(x^*)$

We can use the MINPACK code that is already in NLSE to compute $c_i(x^*)\cdot c_j(x^*)$. The computation of the diagonal elements is already available as the ‘self sensitivities,’ so that leaves the off-diagonal elements. Consider $\Vert c_i(x^*)/\sqrt {2}+c_j(x^*)/\sqrt {2}\Vert ^2={\displaystyle \frac {\Vert c_i(x^*)\Vert ^2}{2}}+c_i(x^*)\cdot c_j(x^*)+{\displaystyle \frac {\Vert c_j(x^*)\Vert ^2}{2}}$. Hence, it follows that $c_i(x^*)\cdot c_j(x^*)=\Vert c_i(x^*)/\sqrt {2}+c_j(x^*)/\sqrt {2}\Vert ^2-{\displaystyle \frac {\Vert c_i(x^*)\Vert ^2+\Vert c_j(x^*)\Vert ^2}{2}}$, where the right-hand side is already calculable using MINPACK in NLSE.
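The identity in the digression is easy to check numerically. The following is a minimal sketch with two made-up Jacobian columns (the vectors `ci` and `cj` are illustrative values, not data from this chapter); it recovers the inner product using nothing but norms, which is the only operation the digression assumes is available.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in v))

def dot_from_norms(ci, cj):
    """Recover c_i . c_j from three norm evaluations only."""
    s = 1.0 / math.sqrt(2.0)
    mixed = norm([s * a + s * b for a, b in zip(ci, cj)])
    return mixed ** 2 - (norm(ci) ** 2 + norm(cj) ** 2) / 2.0

# Illustrative Jacobian columns (hypothetical values).
ci = [1.0, 2.0, -1.0, 0.5]
cj = [0.5, -1.0, 3.0, 2.0]

direct = sum(a * b for a, b in zip(ci, cj))   # ordinary inner product
via_norms = dot_from_norms(ci, cj)            # same quantity from norms alone
```

Dividing `via_norms` by the residual norm at the solution then gives the off-diagonal element of the first-order curvature tensor in (6.11).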

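Substituting (6.10), without its second-derivative term, into the quadratic term of (6.7) gives the following expression, in which the factor of 1/2 in (6.7) is dropped (retaining it would enlarge the resulting $\sigma$ by a factor of $\sqrt{2}$, so the estimate obtained below is conservative):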

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sigma^2\sum_{i,j} {\displaystyle\frac{\partial^2\Vert r(x)\Vert}{\partial x_j \partial x_i}}\vert_{x^*}v_iv_j& =&\displaystyle {\displaystyle\frac{\sigma^2}{\Vert r(x^*)\Vert}}\sum_{\alpha}\left[\sum_{i,j}{\displaystyle\frac{\partial f_\alpha}{\partial x_i}}v_i{\displaystyle\frac{\partial f_\alpha}{\partial x_j}}v_j\right]_{x^*} \\ & =&\displaystyle {\displaystyle\frac{\sigma^2}{\Vert r(x^*)\Vert}}(J(x^*)\cdot v)\cdot(J(x^*)\cdot v) \\ & =&\displaystyle {\displaystyle\frac{\sigma^2}{\Vert r(x^*)\Vert}}\Vert J(x^*)\cdot v\Vert^2\ , {} \end{array} \end{aligned} $$

(6.12)

and if we equate this to the right-hand side of (6.6), we get the final result

$$\displaystyle \begin{aligned} \sigma_v=\epsilon^{1/2}\left({\displaystyle\frac{\Vert r(x^*)\Vert}{\Vert J(x^*)\cdot v\Vert}}\right)\ . {} \end{aligned} $$

(6.13)

We will call this the ‘first-order’ approximation, in the sense that we have truncated the Taylor series expansion at the first nonzero term, and have ignored the second-derivative terms in (6.10). This is the expression that is stated, but not derived, in [77].

Note that if $\Vert J(x^*)\cdot v\Vert$ is small compared to $\Vert r(x^*)\Vert$, then $\sigma$ is large and the residual norm is insensitive to changes in the linear combination of the parameters specified by $v$. If $v = e_i$, the $i$th column of the $N \times N$ identity matrix, then (6.13) produces $\sigma_i$, the sensitivity bound for the $i$th parameter. Since $\sigma_i$ will vary in size with the magnitude of $x_i^*$, it is better to compare the ratios $\sigma _i/x_i^*$ for $i = 1, \dots, N$ before drawing conclusions about the fitness of a solution.

The importance of these results is that we now have metrics for the inversion process: $\Phi = \Vert r(x^*)\Vert$, the norm of the residual vector at the solution, tells us how good the fit is between the model data and measured data. The smaller this number the better, of course, but the ‘smallness’ depends upon the experimental setup and the accuracy of the model to fit the experiment. Heuristic judgement based on experience will help in determining the quality of the solution for a given $\Phi$.

The sensitivity coefficient, $\sigma$, is more subtle, but just as important. It, too, should be small, but, again, what counts as ‘small’ will be determined by heuristics based upon the problem. If $\sigma$ is large in some sense, it suggests that the solution is relatively independent of that parameter, so that we cannot reasonably accept the value assigned to that parameter as being meaningful. This is illustrated in Fig. 6.2, which shows a system, S, that is sensitive to the variable $x_i$ at the solution point, $x_i^*$, and another system, I, that is insensitive to $x_i$.
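As a minimal sketch of how (6.13) behaves, the snippet below evaluates the bound for a small made-up residual vector and Jacobian (all numbers are illustrative assumptions, chosen so that the second column of `J` is nearly zero). The function name `sensitivity_bound` is hypothetical and is not a MINPACK or NLSE routine.

```python
import math

def sensitivity_bound(r, jac, v, eps):
    """Evaluate sigma_v = sqrt(eps) * ||r|| / ||J v|| as in (6.13)."""
    r_norm = math.sqrt(sum(ri * ri for ri in r))
    jv = [sum(row[n] * v[n] for n in range(len(v))) for row in jac]
    jv_norm = math.sqrt(sum(c * c for c in jv))
    return math.sqrt(eps) * r_norm / jv_norm

# Hypothetical residual vector and 4 x 2 Jacobian at the solution.
r = [0.01, -0.02, 0.015, 0.005]
J = [[1.0, 0.001],
     [0.5, 0.002],
     [2.0, 0.0],
     [1.5, 0.001]]
eps = 0.04

sigma_1 = sensitivity_bound(r, J, [1.0, 0.0], eps)  # v = e_1
sigma_2 = sensitivity_bound(r, J, [0.0, 1.0], eps)  # v = e_2
```

Here the data respond strongly to the first parameter, so `sigma_1` is small, while the near-zero second column produces a `sigma_2` that is larger by about three orders of magnitude: the fit barely constrains the second parameter.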

An example occurs when one uses a high-frequency excitation, with its attendant small skin depth, to interrogate a deep-seated flaw. The flaw will be relatively invisible to the probe at this frequency, and whatever value is given for its parameters will be highly suspect. When this occurs we will either choose a new parameter to characterize the flaw, or acquire data at a lower frequency.

These metrics are not available to us in the current inspection method, in which analog instruments acquire data that are then interpreted by humans using hardware standards. The opportunity to use these metrics is a significant advantage of the model-based inversion paradigm that we propose here.

3 Confidence Levels: Stochastic Global Optimization

We can extend the previous results to obtain a statistical measure of confidence in the solution. Referring to Fig. 6.2, we have the probability relation

$$\displaystyle \begin{aligned} \mathrm{Prob}[x_i^*-\sigma_vv\le x_i\le x_i^*+\sigma_vv]=\mathrm{Prob}\left[{\displaystyle\frac{\Vert r(x_i)\Vert-\Vert r(x_i^*)\Vert}{\Vert r(x_i^*)\Vert}}\le\epsilon\right]\ . {} \end{aligned} $$

(6.14)

Arguing that $ {\displaystyle \frac {\Vert r(x_i)\Vert -\Vert r(x_i^*)\Vert }{\Vert r(x_i^*)\Vert }}$ is a random variable allows us to transform the inverse methods of [111] into the realm of ‘stochastic inverse problems.’

This approach is based on the current ‘Multi-Level Single Linkage’ algorithm that is used in NLSE to reach the global minimum with probability one [21, 78, 89, 94], and also fits our concept of ‘stochastic inversion.’ Furthermore, it allows us to use prior knowledge of the unknown parameters. Let the model parameters, $\{x_n\}$, be a set of independent random variables, each uniformly distributed over its known range of values. We will sample the parameter space by choosing, say, 500 points randomly, in accordance with the distribution function of each parameter, and compute the norm of the residual vector at each of the points, as in the first step of NLSE. In NLSE, these points are trial initial points for the minimization algorithm, (6.3), and the lowest of the resulting 500 minima is guaranteed to be the global minimum with unit probability [21, 78, 89, 94] (see Note 1).

The random variable, $ {\displaystyle \frac {\Vert r(x_i)\Vert -\Vert r(x_i^*)\Vert }{\Vert r(x_i^*)\Vert }}$, in (6.14) is a continuous function of $\{x_i\}$ defined on a compact set (the ‘prior feasible set’), so it achieves a finite maximum on that set. This maximum, if it could be determined with probability one, is precisely $\epsilon$ in (6.14), and when this is substituted into the transfer function, (6.13), we would have determined the confidence level, $\sigma_v$, with unit probability. Later we will relax any claims of unit probability in determining $\epsilon$, but we are permitted to make a strong statement about the confidence level, because in this formulation of a stochastic inverse problem, we are assuming prior statistical constraints on the unknown parameters, $\{x_n\}$. This approach is quite ‘Bayesian’, in the sense that we are combining prior information on the random variables with a likelihood estimation (which follows from the least-squares inversion process) to get posterior information on the variables.
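The sampling step can be illustrated with a short sketch. It is not the Multi-Level Single Linkage algorithm itself: it only draws 500 points uniformly from a four-dimensional prior feasible set and records the largest residual norm, from which a sample estimate of $\epsilon$ follows. The residual function, the solution point `X_STAR`, and the constant `FLOOR` are hypothetical placeholders for the model-based residual and its minimum.

```python
import math
import random

PRIOR_LO, PRIOR_HI, N_PARAMS = 0.0, 20.0, 4
X_STAR = [11.0, 19.0, 15.0, 6.0]   # hypothetical solution point
FLOOR = 0.002                      # hypothetical residual norm at X_STAR

def residual_norm(x):
    """Hypothetical smooth residual norm with its minimum at X_STAR."""
    dist2 = sum((a - b) ** 2 for a, b in zip(x, X_STAR))
    return math.sqrt(FLOOR ** 2 + 1e-4 * dist2)

def estimate_epsilon(n_samples=500, seed=0):
    """Sample the prior uniformly and return (epsilon, largest residual norm)."""
    rng = random.Random(seed)      # fixed seed so the sketch is repeatable
    worst = 0.0
    for _ in range(n_samples):
        x = [rng.uniform(PRIOR_LO, PRIOR_HI) for _ in range(N_PARAMS)]
        worst = max(worst, residual_norm(x))
    r_star = residual_norm(X_STAR)
    return (worst - r_star) / r_star, worst

eps, worst = estimate_epsilon()
```

The sampled maximum can only underestimate the true maximum over the compact prior set, which is one reason the claim of unit probability for $\epsilon$ is relaxed later in the chapter.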

Example: A Complex ‘Flaw’

The configuration of the problem is shown in Fig. 6.3. The expansion of the flaw in the (Y, Z) −plane is given by

$$\displaystyle \begin{aligned} f(y,z)=\sum_{i=1}^4\alpha_i\pi^{(1)}_i(y)\pi^{(1)}(z)\ , {} \end{aligned} $$

(6.15)

where $\pi^{(1)}$ is a unit pulse function, and the expansion coefficients, $\{\alpha _i\}_{i=1}^4$, determine the magnitude of $\pi^{(1)}(z)$. These coefficients are the unknown degrees of freedom of the problem, and will be modeled as independent random variables with a uniform distribution over the range [0, 20]. They will be determined by inversion of the data, which are impedances measured by a probe that is scanned over $-100 \le Y \le 100$, $X = 0$. It should be understood that this formalism fixes the resolution of the flaw in the $Y$-direction to be 25 mils, as well as the width of the flaw in the $X$-direction to be 0.1 mil. These numbers are arbitrary, of course, and can be changed to suit the problem. Furthermore, with the four blocks arranged as shown, this configuration will be best suited for modeling and reconstructing midbore, throughwall, and corner bolt-hole cracks.

Figure 6.4 illustrates a complex flaw extending over the entire range in $Y$. We will use the output of a VIC-3D® model of this flaw to serve as the input data for inversion. To illustrate the inversion process and the importance of the ‘surrogate’ interpolation table for the $\{\alpha_i\}$, we will perform a numerical experiment in which the table has successively two, three and four nodes per dimension. In the first case, the nodes are at [0, 20], in the second, they are at [0, 10, 20], as in Fig. 6.3, and in the third, [0, 7, 14, 21] (in this case, we assume a uniform distribution of the variables over the range [0, 21]). Thus, the first table comprises $2^4 = 16$ nodes, the second $3^4 = 81$ nodes, and the last $4^4 = 256$ nodes. A blending function for each node is computed by VIC-3D®. We quickly see the ‘curse of dimensionality’ occurring. This curse will be obviated through the use of sparse-grid interpolation techniques to reduce the computational burden of building the new table.

The results of the experiment are shown in Table 6.1. The column labeled ‘# Points’ lists the number of the original 500 global starting points that are attracted to the global minimum. These results show that increasing the number of nodes per dimension yields improvements in reducing the norm of the residuals, $\Phi$, and the sensitivity coefficients of each variable. Figure 6.5 illustrates the results of Table 6.1, and clearly indicates that increasing the number of nodes beyond 4 will have little effect on the norm of the residuals, $\Phi = \Vert r\Vert$, and only a slight reduction in the various sensitivity coefficients, sensit_i.

Table 6.1 Results for the example problem vs. number of nodes per dimension

We ran NLSE four times, effectively sampling the $\{\alpha_i\}$ space 2000 times, yielding values of $\Vert r(\alpha)\Vert_{\mathrm{max}}$ = 0.2545, 0.2689, 0.2351, and 0.265. The inverted results of each of these runs were identical to those tabulated in Table 6.1, as we expected, since the algorithm in NLSE ensures convergence to the global minimum with probability one. Hence, using the data of the bottom row of Table 6.1 we have

$$\displaystyle \begin{aligned} {\displaystyle\frac{\Vert r(\alpha)\Vert_{\mathrm{max}}-\Vert r(\alpha^*)\Vert}{\Vert r(\alpha^*)\Vert}}={\displaystyle\frac{0.2689-0.00159}{0.00159}}=168.12=\epsilon\ , {} \end{aligned} $$

(6.16)

and when this is substituted into (6.13), along with the sensitivity coefficients tabulated in the bottom row of Table 6.1, we get the parameters of the confidence intervals to be $\sigma_1 = 1.62$, $\sigma_2 = 2.8$, $\sigma_3 = 2.75$, $\sigma_4 = 2.49$. These effectively define the posterior distribution of the $\{\alpha_i\}$, which is certainly much different from the prior distribution.

We summarize the results for $\alpha_i$ by claiming that we are ‘certain’ that $\alpha_i^* - \sigma_i \le \alpha_i \le \alpha_i^* + \sigma_i$, with the most likely value being $\alpha _i^*$. In the case where one of the posterior limits on $\alpha_i$ exceeds the prior limit, we reject it in favor of the prior limit, because if the crack actually exceeded the prior limit, the inversion process would have been constrained at the prior limit of the interpolation table. For example, $17.31 \le \alpha_2 \le 21$, rather than $17.31 \le \alpha_2 \le 22.91$.
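The clipping rule just described amounts to intersecting the posterior interval with the prior range. A minimal sketch follows; the function name is hypothetical, and the solution value 20.11 for $\alpha_2$ is not quoted in the text but is inferred here as the midpoint of the unclipped interval [17.31, 22.91].

```python
def posterior_interval(x_star, sigma, prior_lo, prior_hi):
    """Intersect [x* - sigma, x* + sigma] with the prior range."""
    lo = max(x_star - sigma, prior_lo)
    hi = min(x_star + sigma, prior_hi)
    return lo, hi

# alpha_2 with the four-node table: prior range [0, 21], sigma_2 = 2.8.
# The value 20.11 is inferred from the quoted interval, not taken from Table 6.1.
alpha2_lo, alpha2_hi = posterior_interval(20.11, 2.8, 0.0, 21.0)
```

The upper limit is replaced by the prior limit of 21, while the lower limit is left at the posterior value.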

The Chebyshev Inequality

We can improve the calculation of the confidence level, and even make its definition more precise in our example, by resorting to the Chebyshev inequality [63], which states that, if Z is a random variable, then, for every ξ > 0,

$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle P[\vert Z\vert\ge\xi ]\le{\displaystyle\frac{\mathrm{VAR}(Z)}{\xi^2}}=\mathrm{MAX\ UNCERTAINTY}(\xi) \\ & &\displaystyle \mathrm{MINIMUM\ CERTAINTY}(\xi)=1-{\displaystyle\frac{\mathrm{VAR}(Z)}{\xi^2}}\ , {} \end{array} \end{aligned} $$

(6.17)

where $\xi$ is the threshold or decision boundary for determining the confidence interval. For example, if we want to be at least 95% confident that $\vert Z\vert < \xi$, we set the minimum certainty in (6.17) to 0.95, that is, $1-{\displaystyle \frac {\mathrm {VAR}(Z)}{\xi ^2}}=0.95$, which implies that $\xi =\left ({\displaystyle \frac {\mathrm {VAR}(Z)}{0.05}}\right )^{1/2}$.

To apply this theorem to our problem, we define $Z=\Vert r(\alpha )\Vert _{\mathrm {max}}-\overline {\Vert r(\alpha )\Vert }_{\mathrm {max}}$, where $\Vert r(\alpha)\Vert_{\mathrm{max}}$ is a random variable whose sample value is the output of the following ‘experiment’: run a 500-sample trial, as in the Multi-Level Single Linkage algorithm, and choose the largest result for $\Vert r(\alpha)\Vert_{\mathrm{max}}$. Repeat the experiment for the second sample, and so on. We have already given an example of this, with the result after four trials that $\{\Vert r(\alpha)\Vert_{\mathrm{max}}\} = \{0.2545, 0.2689, 0.2351, 0.265\}$, from which follow $\overline {\Vert r(\alpha )\Vert }_{\mathrm {max}}=0.2559$, $\mathrm{VAR}(Z) = 0.0001716$, and $\xi =\left (0.0001716/0.05\right )^{1/2}=0.0586$ for the 95% confidence level.

From the Chebyshev inequality we have, therefore, $\Vert r(\alpha)\Vert_{\mathrm{max}} = 0.2559 + 0.0586 = 0.3145$. This replaces $\Vert r(\alpha)\Vert_{\mathrm{max}} = 0.2689$ in (6.16), so that the 95% upper bound is given by

$$\displaystyle \begin{aligned} {\displaystyle\frac{\Vert r(\alpha)\Vert_{\mathrm{max}}-\Vert r(\alpha^*)\Vert}{\Vert r(\alpha^*)\Vert}}={\displaystyle\frac{0.3145-0.00159}{0.00159}}=196.8=\epsilon\ . {} \end{aligned} $$

(6.18)

The new values for the parameters corresponding to the 95% confidence interval are $\{\sigma_1 = 1.75, \sigma_2 = 3.03, \sigma_3 = 2.97, \sigma_4 = 2.69\}$. The confidence intervals for the four variables are, therefore: $\alpha_1$: [9.44, 12.94], $\alpha_2$: [17.08, 21], $\alpha_3$: [12.59, 18.53], $\alpha_4$: [3.37, 8.75].
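The arithmetic of this subsection can be retraced with a short script. It is a sketch that uses only numbers quoted in the text: the four trial maxima, the residual norm 0.00159 from (6.16), and the earlier values $\sigma_i$ obtained with $\epsilon = 168.12$. Two assumptions are made: the variance is taken as the population variance (dividing by the number of trials), which is what reproduces the quoted 0.0001716, and the new $\sigma_i$ are obtained by rescaling the old ones with the square root of the ratio of the two $\epsilon$ values, as (6.13) implies.

```python
import math
import statistics

trial_maxima = [0.2545, 0.2689, 0.2351, 0.265]   # four 500-sample trials
r_star = 0.00159                                 # residual norm at the solution

mean_max = statistics.mean(trial_maxima)
var_z = statistics.pvariance(trial_maxima)       # population variance of Z
xi = math.sqrt(var_z / 0.05)                     # Chebyshev threshold, 95% level
r_max_95 = mean_max + xi                         # 95% upper bound on the maximum
eps_95 = (r_max_95 - r_star) / r_star            # counterpart of (6.18)

# Rescale the earlier sigmas (computed with eps = 168.12) to the new eps.
eps_old = 168.12
sigmas_old = [1.62, 2.8, 2.75, 2.49]
scale = math.sqrt(eps_95 / eps_old)
sigmas_95 = [s * scale for s in sigmas_old]
```

Carrying full precision through the chain gives an $\epsilon$ that agrees with (6.18) after rounding to one decimal place, and rescaled $\sigma_i$ that agree with the quoted values to within about 0.01; the small differences come from the rounding of intermediate values in the text.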

Joint Measurement of Conductivity and Magnetic Permeability

We have taken impedance measurements over the frequency range of 100 Hz–1 kHz of a ferritic heat-exchanger tube, with the intention of jointly determining the conductivity and relative magnetic permeability of the tube. The interpolation table had the following nodal values: $\sigma$: $1.0 \times 10^6$, $1.2 \times 10^6$, $1.4 \times 10^6$, $1.6 \times 10^6$, $1.8 \times 10^6$; $\mu$: 50, 60, 70, 80, 90. We ran five trials of NLSE with the following results (Table 6.2):

Table 6.2 Results at 100 Hz–1 kHz for conductivity and permeability

Following the procedure described above with respect to the Chebyshev inequality, we calculate a value of $\epsilon = 137$, which yields $\sigma_{\mathrm{cond}} = 0.272 \times 10^6$, and $\sigma_\mu = 1.76$. Hence, we can say that the most likely value of the conductivity is $1.372 \times 10^6$, with a 95% confidence interval of $[1.1 \times 10^6, 1.644 \times 10^6]$. For the permeability we get even tighter results; the most likely value is 68.18, with a 95% confidence interval of [66.42, 69.94].

The fact that the permeability is well defined at these low frequencies has been validated by use of the Cramér-Rao Lower Bound (CRLB) [111, pp. 407–410], where it is also shown that the optimum frequency for estimating conductivity is 6.0 kHz.

Estimation of Width of a Long, Thin Crack

We are given data at 200 kHz for a crack in a bolt-hole. The data were obtained by a split-D probe with ferrite cores, and the crack was 100 mils long and 18 mils deep. The objective was to determine the width of the crack. The problem is described in greater detail in [111, Section 6.6].

The interpolation table for the width has nodes at 0, 0.125 mils, and 0.25 mils. The inverted results after 5 trials are shown in Table 6.3.

Table 6.3 Inverted results for the width of a crack

These results yield a value of $\epsilon = 2.31$ and $\sigma_W = 0.045$. The most likely value of $W$ is 0.0793 mils, and the 95% confidence interval is [0.0343, 0.1243]. We should note that in these two examples, the confidence interval calculation becomes more precise with an increase in the number of nodes in the interpolation table, as indicated earlier.

4 Summary

We summarize the algorithm and process here.

1. ${\displaystyle \frac {\Vert r(x)\Vert _{\mathrm {max}}-\Vert r(x^*)\Vert }{\Vert r(x^*)\Vert }}=\mathrm {OBJ}(x)$ is a random variable.
2. $\Vert r(x^*)\Vert$ and the Jacobian, $J(x^*)$, are determined with prob → 1 (Stochastic Global Optimization via MLSL).
3. The set $\{x(\epsilon)\}$ of all $x$ such that $\mathrm{OBJ}(x) \le \epsilon$ is the ‘posterior feasible set at level $\epsilon$’.
4. If $\mathrm{OBJ}(x)$ is parabolic (ellipsoidal in $N$-space), then the set $\{x(\epsilon)\}$ is called the ‘first-order posterior feasible set at level $\epsilon$’.
5. $\sigma _v=\epsilon ^{1/2}\left ({\displaystyle \frac {\Vert r(x^*)\Vert }{\Vert J(x^*)\cdot v\Vert }}\right )$ is a mapping from the ‘prior feasible set’ to the ‘first-order posterior feasible set at level $\epsilon$’.
6. If we choose $\epsilon$ to be at the 95% confidence level, as with the Chebyshev Inequality, then the measure of $\{x^* - \sigma \le x \le x^* + \sigma\}$ is at least 95% that of the maximum first-order posterior feasible set, and $x^*$ is the most likely value of $x$.

Figure 6.6 illustrates the algorithm.

A 2D Example

The results just given are for the situation in which each parameter is tested separately, while the others are fixed at the solution point. Now, we must consider the general case in which the totality of variables are considered jointly. This means operating in four-dimensional space. The tools that we have already set up allow us to do that with no additional expense, except for a minor enhancement to the NLSE code in VIC-3D®. Equation (6.13) is valid for arbitrary orientations of the unit vector, v, and 𝜖 has already been computed using the entire four-dimensional random parameter space in the MLSL stochastic global optimization algorithm.

Consider the 2D example shown in Fig. 6.7, which is the projection onto the $(x_1, x_2)$-plane of the four-dimensional hyperellipsoid associated with the complex flaw example described earlier. Using NLSE, we compute the joint sensitivity associated with the unit vector, $v = [0.5, 0.5, 0.5, 0.5]$, to be 0.129. Then, using $\epsilon = 196.8$, as before, we compute $\sigma_{0.5,0.5,0.5,0.5} = 1.81$ from (6.13) for the 95%-confidence region for this combination of variables. It should be understood that NLSE already gives us the information to generate the entire $N$-dimensional hyperellipsoid for a given problem. This would allow us to analytically calculate such things as the volume of the ellipsoid, or cross-sectional areas, etc.

Notes

1. The Multi-Level Single Linkage method guarantees that the global minimum will be found within a finite number of iterations with probability one, given a sufficiently large sample size of trial points. Numerical experiments with model and laboratory data for a variety of inverse problems over many years [111] suggest that 500 trial points yield a reliable estimate of the global minimum for problems with the number of variables that we are considering.

References

R.H. Byrd, C.L. Dert, A.H.G. Rinnooy Kan, R.B. Schnabel, Concurrent stochastic methods for global optimization. Math. Program. 46, 1–29 (1990)
M. Loève, Probability Theory (D. Van Nostrand, New York, 1955)
J.J. Moré, B.S. Garbow, K.E. Hillstrom, User Guide for Minpack-1, ANL-80-74, Argonne National Laboratory (1980)
M. Nakhkash, Y. Huang, M.T.C. Fang, Application of the multilevel single-linkage method to one-dimensional electromagnetic inverse scattering problem. IEEE Trans. Antennas Propag. 47(11), 1658–1668 (1999)
W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C, 2nd edn. (Cambridge University Press, Cambridge, 1992). Reprinted 1997
A.H.G. Rinnooy Kan, G.T. Timmer, Stochastic global optimization methods part II: Multi level methods. Math. Program. 39, 57–78 (1987)
A.H.G. Rinnooy Kan, G.T. Timmer, Stochastic global optimization methods part I: Clustering methods. Math. Program. 39, 27–56 (1987)
H.A. Sabbagh, R. Kim Murphy, E.H. Sabbagh, J.C. Aldrin, J.S. Knopp, Computational Electromagnetics and Model-Based Inversion: A Modern Paradigm for Eddy-Current Nondestructive Evaluation (Springer, New York, 2013)


Author information

Authors and Affiliations

Victor Technologies, LLC, Bloomington, IN, USA
Harold A. Sabbagh, R. Kim Murphy & Elias H. Sabbagh
Hermitage, TN, USA
Liming Zhou
NASA Langley Research Center, Hampton, VA, USA
Russell Wincheski

About this chapter

Cite this chapter

Sabbagh, H.A., Kim Murphy, R., Sabbagh, E.H., Zhou, L., Wincheski, R. (2021). Stochastic Inverse Problems: Models and Metrics. In: Advanced Electromagnetic Models for Materials Characterization and Nondestructive Evaluation. Scientific Computation. Springer, Cham. https://doi.org/10.1007/978-3-030-67956-9_6

DOI: https://doi.org/10.1007/978-3-030-67956-9_6
Published: 08 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67954-5
Online ISBN: 978-3-030-67956-9
eBook Packages: Physics and Astronomy, Physics and Astronomy (R0)
