Abstract
Over the past 2 years, we have been developing a theory of uncertainty quantification and propagation that is computationally feasible with large numbers of unknowns. We have applied it to a problem of characterizing the eddy-current response of a shot-peened surface, where the surface is modeled as a one-dimensional random conductivity field with a known covariance function. We are currently extending the model to more general materials characterization problems, such as modeling two-dimensional random anisotropic grain noise in titanium alloys. In this case, we assume the existence of a (two-dimensional) covariance function for the random distribution of Euler angles that define the orientation of each crystallite within the material.
1 Introducing the Problem
Over the past 2 years, we have been developing a theory of uncertainty quantification and propagation that is computationally feasible with large numbers of unknowns. We have applied it to a problem of characterizing the eddy-current response of a shot-peened surface, where the surface is modeled as a one-dimensional random conductivity field with a known covariance function. We are currently extending the model to more general materials characterization problems, such as modeling two-dimensional random anisotropic grain noise in titanium alloys. In this case, we assume the existence of a (two-dimensional) covariance function for the random distribution of Euler angles that define the orientation of each crystallite within the material.
With this background, we want to develop a theory of stochastic inverse problems for more traditional eddy-current NDE flaw characterization and sizing. Instead of a random material, we assume that the flaw can be characterized as a random process. Figure 6.1, which shows the typical shape of fatigue-crack growth progression in cold-worked fastener holes, suggests that this is a reasonable approach. Clearly, the ensemble of cracks cannot be modeled by a simple canonical shape with three parameters (length, width, and height), so we will need to invoke a stochastic model for analyzing such cracks.
With such a stochastic model, we can draw parallels between ‘probability of detection’ (POD) and ‘likelihood of inversion’ (LOI). In the former, we are given a flaw, and ask ourselves, ‘Can we detect it, and what are the metrics that measure our success?’ In the latter, we are given data, and ask ourselves, ‘Can we associate a flaw with them, and what are the metrics that measure our success?’
The stochastic model will be described later, but first we will develop some background tools that are currently resident in VIC-3D®, and will be the basis of our stochastic computational model.
2 NLSE: Nonlinear Least-Squares Parameter Estimation
Let
\(Z = Z(p_1, \dots , p_N, f)\), (6.1)
where \(p_1, \dots , p_N\) are the N parameters of interest, and f is a control parameter at which the impedance, Z, is measured. f can be frequency, scan position, lift-off, etc. It is, of course, known; it is not one of the parameters to be determined. To be explicit during our initial discussion of the theory, we will call f ‘frequency.’
In order to determine \(p_1, \dots , p_N\), we measure Z at M frequencies, \(f_1, \dots , f_M\), where M > N:
\(Z_i = Z(p_1, \dots , p_N, f_i), \quad i = 1, \dots , M\). (6.2)
The right-hand side of (6.2) is computed by applying the volume-integral code to a model of the problem, usually at a discrete number of values of the vector, p, forming a multidimensional interpolation grid.
Because the problem is nonlinear, we use a Gauss-Newton iteration scheme to perform the inversion. First, we decompose (6.2) into its real and imaginary parts, thereby doubling the number of equations (we assume that \(p_1, \dots , p_N\) are real). Then we use the linear approximation to the resistance, \(R_i\), and reactance, \(X_i\), at the ith frequency:
\({\displaystyle R_i \approx R_i(p^{(q)}) + \sum _{j=1}^{N} \left . \frac {\partial R_i}{\partial p_j}\right |_{p^{(q)}} \left (p_j - p_j^{(q)}\right ), \quad X_i \approx X_i(p^{(q)}) + \sum _{j=1}^{N} \left . \frac {\partial X_i}{\partial p_j}\right |_{p^{(q)}} \left (p_j - p_j^{(q)}\right ),}\) (6.3)
where the superscript (q) denotes the qth iteration, and the partial derivatives are computed numerically by the software. The left side of (6.3) is taken to be the measured values of resistance and reactance. We rewrite (6.3) as
\(r = J \cdot p\), (6.4)
where r is the 2M-vector of residuals, J is the 2M × N Jacobian matrix of derivatives, and p is the N-dimensional correction vector. Equation (6.4) is solved in a least-squares manner starting with an initial value, \((x_1^{(0)}, \dots ,x_N^{(0)})\), for the vector of unknowns, and then continuing by replacing the initial vector with the updated vector \((x_1^{(q)},\dots , x_N^{(q)})\) that is obtained from (6.3), until convergence occurs.
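The iteration just described can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the NLSE implementation: `model` is a hypothetical two-parameter impedance function standing in for the volume-integral code, the derivatives are forward differences, and the least-squares step is solved through the 2 × 2 normal equations, which is adequate only for this small, well-conditioned toy problem.

```python
def model(p, f):
    """Hypothetical impedance model (stand-in for the forward solver)."""
    return p[0] / (1.0 + 1j * p[1] * f)

def stacked(values):
    """Split complex values into real parts followed by imaginary parts."""
    return [z.real for z in values] + [z.imag for z in values]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def jacobian_columns(p, freqs, h=1e-7):
    """Forward-difference derivatives: N columns, each of length 2M."""
    base = stacked([model(p, f) for f in freqs])
    cols = []
    for k in range(len(p)):
        q = list(p)
        q[k] += h
        pert = stacked([model(q, f) for f in freqs])
        cols.append([(a - b) / h for a, b in zip(pert, base)])
    return cols

def gauss_newton(p0, freqs, z_meas, max_iter=50, tol=1e-12):
    """Gauss-Newton for N = 2 parameters via the normal equations."""
    p = list(p0)
    meas = stacked(z_meas)
    for _ in range(max_iter):
        pred = stacked([model(p, f) for f in freqs])
        r = [m - g for m, g in zip(meas, pred)]      # residual: measured minus model
        c0, c1 = jacobian_columns(p, freqs)
        a11, a12, a22 = dot(c0, c0), dot(c0, c1), dot(c1, c1)
        g0, g1 = dot(c0, r), dot(c1, r)
        det = a11 * a22 - a12 * a12
        d0 = (a22 * g0 - a12 * g1) / det             # correction vector
        d1 = (a11 * g1 - a12 * g0) / det
        p = [p[0] + d0, p[1] + d1]
        if max(abs(d0), abs(d1)) < tol:
            break
    return p

# Synthetic "measurements" at M = 4 frequencies for N = 2 unknowns.
freqs = [1.0, 2.0, 3.0, 4.0]
z_meas = [model([2.0, 0.5], f) for f in freqs]
p_hat = gauss_newton([1.5, 0.4], freqs, z_meas)
```

With noise-free synthetic data the residual vanishes at the solution, so the iterates settle on the generating parameters; with measured data the residual norm at convergence is the quantity Φ discussed later in this section.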
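We are interested in determining a bound for the sensitivity of the residual norm to changes in some linear combination of the parameters. Given an 𝜖 > 0 and a unit vector, v, the problem is to determine a sensitivity (upper) bound, σ, such that
\(\Vert r(x^* + \sigma v)\Vert \le (1+\epsilon )\Vert r(x^*)\Vert \). (6.5)
We will derive an estimate of σ. Equation (6.5) is equivalent to
\(\Vert r(x^* + \sigma v)\Vert - \Vert r(x^*)\Vert \le \epsilon \Vert r(x^*)\Vert \). (6.6)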
The left-hand side of (6.6) can be approximated to the second order in σ by the second-order Taylor expansion:
where ∇ is the gradient operator in N −dimensional space. Even though the gradient vanishes at the minimum point, we will compute it to get the algebra started:
where the superscript T denotes the transpose of a matrix (or vector), and e(x) = r(x)∕∥r(x)∥ is a unit vector.
The second derivative that we want is the gradient of (6.8):
Before going further, we can immediately drop the first term in (6.9) because the gradient of the norm vanishes at the solution x ∗. Thus, (6.9) becomes, using index notation,
Following [88, page 523], we discard the second-derivative term in (6.10) by arguing that the residual vector for a good model fit should be small, which would make the second derivative term small. Furthermore, it is likely that the residual vector should have terms that are uncorrelated with each other and with the model, thus tending to cancel the second derivative terms when summed over α. We will call (6.10) the first-order curvature tensor, Γ ij, of the mapping (or deformation) of the parameter space, {x i}, into the model-measurement space. If we call the ith column of the Jacobian matrix, c i, then it follows from (6.10) that
where we are ignoring the second-derivative term in (6.10).
Digression on Computing Γ ij(x ∗)
We can use the MINPACK code that is already in NLSE to compute c i(x ∗) ⋅ c j(x ∗). The computation of the diagonal elements is already available as the ‘self sensitivities,’ so that leaves the off-diagonal elements. Consider \(\Vert c_i(x^*)/\sqrt {2}+c_j(x^*)/\sqrt {2}\Vert ^2={\displaystyle \frac {\Vert c_i(x^*)\Vert ^2}{2}}+c_i(x^*)\cdot c_j(x^*)+{\displaystyle \frac {\Vert c_j(x^*)\Vert ^2}{2}}\). Hence, it follows that \(c_i(x^*)\cdot c_j(x^*)=\Vert c_i(x^*)/\sqrt {2}+c_j(x^*)/\sqrt {2}\Vert ^2-{\displaystyle \frac {\Vert c_i(x^*)\Vert ^2+\Vert c_j(x^*)\Vert ^2}{2}}\), where the right-hand side is already calculable using MINPACK in NLSE.
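The identity in this digression is easy to check numerically. The sketch below is only an illustration: the two column vectors are made-up numbers, and the function name `offdiag_from_norms` is hypothetical rather than a MINPACK or NLSE routine. It recovers the off-diagonal product from three squared norms, which is all the identity requires.

```python
import math

def norm_sq(a):
    """Squared Euclidean norm."""
    return sum(x * x for x in a)

def offdiag_from_norms(ci, cj):
    """c_i . c_j from ||(c_i + c_j)/sqrt(2)||^2 and the two self terms."""
    mixed = [(a + b) / math.sqrt(2.0) for a, b in zip(ci, cj)]
    return norm_sq(mixed) - 0.5 * (norm_sq(ci) + norm_sq(cj))

# Made-up Jacobian columns, for illustration only.
ci = [1.0, 2.0, 3.0]
cj = [4.0, -1.0, 0.5]
direct = sum(a * b for a, b in zip(ci, cj))
via_norms = offdiag_from_norms(ci, cj)
```

Here `direct` and `via_norms` agree to rounding error, so a routine that can only return norms of combinations of Jacobian columns is enough to fill in the whole matrix of products.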
Substituting this result into (6.7) yields an upper bound for the quadratic term:
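and if we equate this to the right-hand side of (6.6), we get the final result
\(\sigma =\epsilon ^{1/2}\left ({\displaystyle \frac {\Vert r(x^*)\Vert }{\Vert J(x^*)\cdot v\Vert }}\right )\). (6.13)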
We will call this the ‘first-order’ approximation, in the sense that we have truncated the Taylor series expansion with the first nonzero term, and have ignored the second-derivative terms in (6.10). This is the expression that is stated, but not derived, in [77].
Note that if ∥J(x ∗) ⋅ v∥ is small compared to ∥r(x ∗)∥, then σ is large and the residual norm is insensitive to changes in the linear combination of the parameters specified by v. If v = e i, the ith column of the N × N identity matrix, then (6.13) produces σ i, the sensitivity bound for the ith parameter. Since σ i will vary in size with the magnitude of \(x_i^*\), it is better to compare the ratios \(\sigma _i/x_i^*\) for i = 1, …, N before drawing conclusions about the fitness of a solution.
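As a small illustration of how these bounds behave, the sketch below evaluates the first-order bound in the form restated in the Summary, \(\sigma _v=\epsilon ^{1/2}\Vert r(x^*)\Vert /\Vert J(x^*)\cdot v\Vert \), for each coordinate direction and then forms the ratios \(\sigma _i/x_i^*\). The Jacobian, residual vector, and solution point are invented numbers chosen so that the first parameter is well determined and the second is not; none of them come from the chapter.

```python
import math

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def sensitivity_bound(eps, r, jac, v):
    """First-order bound sqrt(eps) * ||r|| / ||J v||; jac is a list of rows."""
    jv = [sum(a * b for a, b in zip(row, v)) for row in jac]
    return math.sqrt(eps) * norm(r) / norm(jv)

# Invented example: three residual components, two parameters.
jac = [[3.0, 0.0],
       [0.0, 0.1],
       [4.0, 0.0]]
r = [0.01, 0.02, 0.02]          # ||r|| = 0.03
x_star = [2.0, 0.5]
eps = 1.0

sigmas = [sensitivity_bound(eps, r, jac, e) for e in ([1.0, 0.0], [0.0, 1.0])]
ratios = [s / x for s, x in zip(sigmas, x_star)]
```

The second column of the Jacobian is small, so its bound is fifty times larger than the first, and the relative ratio makes the poor determination of the second parameter explicit.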
The importance of these results is that we now have metrics for the inversion process: Φ = ∥r(x ∗)∥, the norm of the residual vector at the solution, tells us how good the fit is between the model data and measured data. The smaller this number the better, of course, but the ‘smallness’ depends upon the experimental setup and the accuracy of the model to fit the experiment. Heuristic judgement based on experience will help in determining the quality of the solution for a given Φ.
The sensitivity coefficient, σ, is more subtle, but just as important. It, too, should be small, but, again, how small is small enough will be determined by heuristics based upon the problem. If σ is large in some sense, the solution is relatively independent of that parameter, so that we cannot reasonably accept the value assigned to that parameter as being meaningful. Figure 6.2 illustrates this: it shows a system, S, that is sensitive to the variable \(x_i\) at the solution point, \(x_i^*\), and another system, I, that is insensitive to \(x_i\).
An example occurs when one uses a high-frequency excitation, with its attendant small skin depth, to interrogate a deep-seated flaw. The flaw will be relatively invisible to the probe at this frequency, and whatever value is given for its parameters will be highly suspect. When this occurs we will either choose a new parameter to characterize the flaw, or acquire data at a lower frequency.
These metrics are not available to us in the current inspection method, in which analog instruments acquire data that are then interpreted by humans using hardware standards. The opportunity to use these metrics is a significant advantage to the model-based inversion paradigm that we propose in this paper.
3 Confidence Levels: Stochastic Global Optimization
We can extend the previous results to obtain a statistical measure of confidence in the solution. Referring to Fig. 6.2, we have the probability relation
Arguing that \( {\displaystyle \frac {\Vert r(x_i)\Vert -\Vert r(x_i^*)\Vert }{\Vert r(x_i^*)\Vert }}\) is a random variable allows us to transform the inverse methods of [111] into the realm of ‘stochastic inverse problems.’
This approach is based on the current ‘Multi-Level Single Linkage’ (MLSL) algorithm that is used in NLSE to reach the global minimum with probability one [21, 78, 89, 94], and also fits our concept of ‘stochastic inversion.’ Furthermore, it allows us to use prior knowledge of the unknown parameters. Let the model parameters, \(\{x_n\}\), be a set of independent random variables, each uniformly distributed over its known range of values. We will sample the parameter space by choosing, say, 500 points randomly, in accordance with the distribution function of each parameter, and compute the norm of the residual vector at each of the points, as in the first step of NLSE. In NLSE, these points are trial initial points for the minimization algorithm, (6.3), and the lowest of the resulting 500 minima is guaranteed to be the global minimum with unit probability [21, 78, 89, 94] (see Note 1).
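The sampling step alone can be sketched as follows. This is a deliberately simplified illustration and not the Multi-Level Single Linkage algorithm: it only draws uniform points from the prior ranges and records the smallest and largest residual norms, without the clustering and local minimizations that MLSL adds. The misfit function `toy_residual_norm` and the two-parameter prior box are hypothetical.

```python
import math
import random

def sample_prior(bounds, n_points, seed=0):
    """Draw points with each coordinate uniform over its prior range."""
    rng = random.Random(seed)
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_points)]

def scan_residual_norms(residual_norm, bounds, n_points=500, seed=0):
    """Return (best starting point, its norm, largest norm) over the sample."""
    points = sample_prior(bounds, n_points, seed)
    norms = [residual_norm(x) for x in points]
    i_best = min(range(n_points), key=lambda i: norms[i])
    return points[i_best], norms[i_best], max(norms)

def toy_residual_norm(x):
    # Hypothetical smooth misfit with its minimum at (11, 6); illustration only.
    return math.hypot(x[0] - 11.0, x[1] - 6.0) + 0.01

bounds = [(0.0, 20.0), (0.0, 20.0)]
start, norm_best, norm_max = scan_residual_norms(toy_residual_norm, bounds)
```

In this sketch `start` would seed a local minimization, while `norm_max` is the kind of sample maximum that the following paragraphs use to set 𝜖.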
The random variable, \( {\displaystyle \frac {\Vert r(x_i)\Vert -\Vert r(x_i^*)\Vert }{\Vert r(x_i^*)\Vert }}\), in (6.14) is a continuous function of {x i} defined on a compact set (the ‘prior feasible set’), so it achieves a finite maximum on that set. This maximum, if it could be determined with probability one, is precisely 𝜖 in (6.14), and when this is substituted into the transfer function, (6.13), we would have determined the confidence level, σ v, with unit probability. Later we will relax any claims of unit probability in determining 𝜖, but we are permitted to make a strong statement about the confidence level, because in this formulation of a stochastic inverse problem, we are assuming prior statistical constraints of the unknown parameters, {x n}. This approach is quite ‘Bayesian’, in the sense that we are combining prior information on the random variables with a likelihood estimation (which follows from the least-squares inversion process) to get posterior information on the variables.
Example: A Complex ‘Flaw’
The configuration of the problem is shown in Fig. 6.3. The expansion of the flaw in the (Y, Z) −plane is given by
where π (1) is a unit pulse function, and the expansion coefficients, \(\{\alpha _i\}_{i=1}^4\), determine the magnitude of π (1)(z). These coefficients are the unknown degrees of freedom of the problem, and will be modeled as independent random variables with a uniform distribution over the range [0, 20]. They will be determined by inversion of the data, which are impedances measured by a probe that is scanned over − 100 ≤ Y ≤ 100, X = 0. It should be understood that this formalism fixes the resolution of the flaw in the Y −direction to be 25 mils, as well as the width of the flaw in the X −direction to be 0.1 mil. These numbers are arbitrary, of course, and can be changed to suit the problem. Furthermore, with the four blocks arranged as shown, this configuration will be best suited for modeling and reconstructing midbore, throughwall, and corner bolt-hole cracks.
Figure 6.4 illustrates a complex flaw extending over the entire range in Y. We will use the output of a VIC-3D® model of this flaw to serve as the input data for inversion. To illustrate the inversion process and the importance of the ‘surrogate’ interpolation table for the \(\{\alpha _i\}\), we will perform a numerical experiment in which the table has successively two, three and four nodes per dimension. In the first case, the nodes are at [0, 20]; in the second, they are at [0, 10, 20], as in Fig. 6.3; and in the third, [0, 7, 14, 21] (in this case, we assume a uniform distribution of the variables over the range [0, 21]). Thus, the first table comprises \(2^4 = 16\) nodes, the second \(3^4 = 81\) nodes, and the last \(4^4 = 256\) nodes. A blending function for each node is computed by VIC-3D®. We quickly see the ‘curse of dimensionality’ occurring. This curse will be obviated through the use of sparse-grid interpolation techniques to reduce the computational burden of building the new table.
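The node counts quoted above follow from a full tensor-product grid over the four coefficients. A minimal sketch, assuming nothing beyond the nodal values listed in the text (the helper name `tensor_grid` is illustrative, not part of VIC-3D®):

```python
from itertools import product

def tensor_grid(nodes_per_dim, n_dims):
    """All combinations of nodal values across n_dims parameters."""
    return list(product(nodes_per_dim, repeat=n_dims))

tables = ([0, 20], [0, 10, 20], [0, 7, 14, 21])
sizes = [len(tensor_grid(nodes, 4)) for nodes in tables]
```

Each entry of a grid is one forward-model evaluation, so the cost grows as the number of nodes per dimension raised to the number of unknowns, which is why the text turns to sparse-grid interpolation.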
The results of the experiment are shown in Table 6.1. The column labeled ’# Points’ lists the number of the original 500 global starting points that are attracted to the global minimum. These results show that increasing the number of nodes per dimension yields improvements in reducing the norm of the residuals, Φ, and the sensitivity coefficients of each variable. Figure 6.5 illustrates the results of Table 6.1, and clearly indicates that increasing the number of nodes beyond 4 will have little effect on the norm of the residuals, r, and only a slight reduction in the various sensitivity coefficients, sensiti.
We ran NLSE four times, effectively sampling the {α i} space 2000 times, yielding values of ∥r(α)∥max = 0.2545, 0.2689, 0.2351, and 0.265. The inverted results of each of these runs were identical to those tabulated in Table 6.1, as we expected, since the algorithm in NLSE ensures convergence to the global minimum with probability one. Hence, using the data of the bottom row of Table 6.1 we have
and when this is substituted into (6.13), along with the sensitivity coefficients tabulated in the bottom row of Table 6.1, we get the parameters of the confidence intervals to be σ 1 = 1.62, σ 2 = 2.8, σ 3 = 2.75, σ 4 = 2.49. These effectively define the posterior distribution of the {α i}, which is certainly much different than the prior distribution.
We summarize the results for \(\alpha _i\) by claiming that we are ‘certain’ that \(\alpha _i^* - \sigma _i \le \alpha _i \le \alpha _i^* + \sigma _i\), with the most likely value being \(\alpha _i^*\). In the case where one of the posterior limits on \(\alpha _i\) exceeds the prior limit, we reject it in favor of the prior limit, because if the crack actually exceeded the prior limit, the inversion process would have been constrained at the prior limit of the interpolation table. For example, \(17.31 \le \alpha _2 \le 21\), rather than \(17.31 \le \alpha _2 \le 22.91\).
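The clipping rule can be written as a one-line helper. In the sketch below the function name is hypothetical, and the value 20.11 for the solution point is an assumption: it is inferred as the midpoint of the unclipped interval [17.31, 22.91] quoted above, since the tabulated solution is not reproduced in the text.

```python
def posterior_interval(x_star, sigma, prior_lo, prior_hi):
    """Interval x* +/- sigma, truncated to the prior range."""
    return max(prior_lo, x_star - sigma), min(prior_hi, x_star + sigma)

# alpha_2 with the four-node table, whose prior range is [0, 21].
lo, hi = posterior_interval(20.11, 2.8, 0.0, 21.0)
```

The lower limit is left at 17.31, while the upper limit is pulled back from 22.91 to the prior limit of 21.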
The Chebyshev Inequality
We can improve the calculation of the confidence level, and even make its definition more precise in our example, by resorting to the Chebyshev inequality [63], which states that, if Z is a random variable, then, for every ξ > 0,
where ξ is the threshold or decision boundary for determining the confidence interval. For example, if we want to be at least 95% confident in our assertion of the probability of the first equality in (6.17), then \(1-{\displaystyle \frac {\mathrm {VAR}(Z)}{\xi ^2}}=0.95\), which implies that \(\xi =\left ({\displaystyle \frac {\mathrm {VAR}(Z)}{0.05}}\right )^{1/2}\).
To apply this theorem to our problem, we define \(Z=\Vert r(\alpha )\Vert _{\mathrm {max}}-\overline {\Vert r(\alpha )\Vert }_{\mathrm {max}}\), where ∥r(α)∥max is a random variable whose sample value is the output of the following ‘experiment’: run a 500-sample trial, as in the Multi-Level Single Linkage algorithm, and choose the largest result for ∥r(α)∥max. Repeat the experiment for the second sample, and so on. We have already given an example of this, with the result after four trials that {∥r(α)∥max} = {0.2545, 0.2689, 0.2351, 0.265}, from which follow \(\overline {\Vert r(\alpha )\Vert }_{\mathrm {max}}=0.2559\), VAR(Z) = 0.0001716, and \(\xi =\left (0.0001716/0.05\right )^{1/2}=0.0586\) for 95% confidence level.
From the Chebyshev inequality we have, therefore, ∥r(α)∥max = 0.2559 + 0.0586 = 0.3145. This replaces ∥r(α)∥max = 0.2689 in (6.16), so that the 95% upper bound is given by
The new values for the parameters corresponding to the 95% confidence interval are {σ 1 = 1.75, σ 2 = 3.03, σ 3 = 2.97, σ 4 = 2.69}. The confidence intervals for the four variables are, therefore: α 1 : [9.44, 12.94], α 2 : [17.08, 21], α 3 : [12.59, 18.53], α 4 : [3.37, 8.75].
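The arithmetic behind the 95% threshold can be reproduced directly from the four trial maxima quoted above. The sketch below assumes the population form of the variance (division by the number of trials), which is the form that reproduces the quoted VAR(Z) = 0.0001716; the function name is illustrative.

```python
import math

def chebyshev_upper_bound(samples, confidence=0.95):
    """Mean, variance, Chebyshev threshold xi, and mean + xi."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n   # population variance
    xi = math.sqrt(var / (1.0 - confidence))
    return mean, var, xi, mean + xi

r_max_trials = [0.2545, 0.2689, 0.2351, 0.265]
mean, var, xi, bound = chebyshev_upper_bound(r_max_trials)
```

To four decimal places this gives a mean of 0.2559, a threshold of 0.0586, and an upper bound of 0.3145, matching the values used in the text. Because 𝜖 enters the sensitivity bound as a square root, replacing 0.2689 by 0.3145 widens each \(\sigma _i\) only modestly.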
Joint Measurement of Conductivity and Magnetic Permeability
We have taken impedance measurements over the frequency range of 100 Hz–1 kHz of a ferritic heat-exchanger tube, with the intention of jointly determining the conductivity and relative magnetic permeability of the tube. The interpolation table had the following nodal values: σ: \(1.0 \times 10^6\), \(1.2 \times 10^6\), \(1.4 \times 10^6\), \(1.6 \times 10^6\), \(1.8 \times 10^6\); μ: 50, 60, 70, 80, 90. We ran five trials of NLSE with the following results (Table 6.2):
Following the procedure described above with respect to the Chebyshev inequality, we calculate a value of 𝜖 = 137, which yields \(\sigma _{\mathrm {cond}} = 0.272 \times 10^6\) and \(\sigma _\mu = 1.76\). Hence, we can say that the most likely value of the conductivity is \(1.372 \times 10^6\), with a 95% confidence interval of \([1.1 \times 10^6, 1.644 \times 10^6]\). For the permeability we get even tighter results; the most likely value is 68.18, with a 95% confidence interval of [66.42, 69.94].
The fact that the permeability is well defined at these low frequencies has been validated by use of the Cramer-Rao Lower Bound (CRLB), [111, pp. 407–410], where it is also shown that the optimum frequency for estimating conductivity is 6.0 kHz.
Estimation of Width of a Long, Thin Crack
We are given data at 200 kHz for a crack in a bolt-hole. The data were obtained by a split-D probe with ferrite cores, and the crack was 100 mils long and 18 mils deep. The objective was to determine the width of the crack. The problem is described in greater detail in [111, Section 6.6].
The interpolation table for the width has nodes at 0, 0.125 mils, and 0.25 mils. The inverted results after 5 trials are shown in Table 6.3.
These results yield a value of 𝜖 = 2.31 and σ W = 0.045. The most likely value of W is 0.0793 mils, and the 95% confidence interval is [0.0343, 0.1243]. We should note that in these two examples, the confidence interval calculation becomes more precise with an increase in the number of nodes in the interpolation table, as indicated earlier.
4 Summary
We summarize the algorithm and process here.
1. \({\displaystyle \frac {\Vert r(x)\Vert _{\mathrm {max}}-\Vert r(x^*)\Vert }{\Vert r(x^*)\Vert }}=\mathrm {OBJ}(x)\) is a random variable.
2. \(\Vert r(x^*)\Vert \) and the Jacobian, \(J(x^*)\), are determined with probability → 1 (Stochastic Global Optimization via MLSL).
3. The set {x(𝜖)} ∋ OBJ(x) ≤ 𝜖 is the ‘posterior feasible set at level 𝜖’.
4. If OBJ(x) is parabolic (ellipsoidal in N-space), then the set {x(𝜖)} is called the ‘first-order posterior feasible set at level 𝜖’.
5. \(\sigma _v=\epsilon ^{1/2}\left ({\displaystyle \frac {\Vert r(x^*)\Vert }{\Vert J(x^*)\cdot v\Vert }}\right )\) is a mapping from the ‘prior feasible set’ to the ‘first-order posterior feasible set at level 𝜖’.
6. If we choose 𝜖 to be at the 95% confidence level, as with the Chebyshev Inequality, then the measure of \(\{x^* - \sigma \le x \le x^* + \sigma \}\) is at least 95% that of the maximum first-order posterior feasible set, and \(x^*\) is the most likely value of x.
Figure 6.6 illustrates the algorithm.
A 2D Example
The results just given are for the situation in which each parameter is tested separately, while the others are fixed at the solution point. Now, we must consider the general case in which the totality of variables are considered jointly. This means operating in four-dimensional space. The tools that we have already set up allow us to do that with no additional expense, except for a minor enhancement to the NLSE code in VIC-3D®. Equation (6.13) is valid for arbitrary orientations of the unit vector, v, and 𝜖 has already been computed using the entire four-dimensional random parameter space in the MLSL stochastic global optimization algorithm.
Consider the 2D example shown in Fig. 6.7, which is the projection onto the (x 1, x 2)-plane of the four-dimensional hyperellipsoid associated with the complex flaw example described earlier. Using NLSE, we compute the joint sensitivity associated with the unit vector, v = [0.5, 0.5, 0.5, 0.5] to be 0.129. Then, using 𝜖 = 196.8, as before, we compute σ 0.5,0.5,0.5,0.5 = 1.81 from (6.13) for the 95%-confidence region for this combination of variables. It should be understood that NLSE already gives us the information to generate the entire N-dimensional hyperellipsoid for a given problem. This would allow us to analytically calculate such things as the volume of the ellipsoid, or cross-sectional areas, etc.
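The joint bound quoted above can be checked by hand. The following worked step assumes only that the joint sensitivity of 0.129 reported by NLSE is the ratio \(\Vert r(x^*)\Vert /\Vert J(x^*)\cdot v\Vert \) appearing in (6.13), which is how the numbers in the text appear to combine:

```latex
\begin{align*}
\Vert v\Vert^2 &= 4\,(0.5)^2 = 1
  && \text{(so $v$ is a unit vector, as (6.13) requires)}\\
\sigma_v &= \epsilon^{1/2}\,
  \frac{\Vert r(x^*)\Vert}{\Vert J(x^*)\cdot v\Vert}
  = (196.8)^{1/2}\times 0.129\\
  &\approx 14.03 \times 0.129 \approx 1.81 .
\end{align*}
```

The same 𝜖 serves every direction v, so only the joint sensitivity has to be recomputed when a different combination of variables is of interest.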
Notes
1. The Multi-Level Single Linkage method guarantees that the global minimum will be found within a finite number of iterations with probability one, given a sufficiently large sample size of trial points. Numerical experiments with model and laboratory data for a variety of inverse problems over many years [111] suggest that 500 trial points yield a reliable estimate of the global minimum for problems with the number of variables that we are considering.
References
R.H. Byrd, C.L. Dert, A.H.G. Rinnooy Kan, R.B. Schnabel, Concurrent stochastic methods for global optimization. Math. Program. 46, 1–29 (1990)
M. Loève, Probability Theory (D. Van Nostrand, New York, 1955)
J.J. Moré, B.S. Garbow, K.E. Hillstrom, User Guide for Minpack-1, ANL-80-74, Argonne National Laboratory (1980)
M. Nakhkash, Y. Huang, M.T.C. Fang, Application of the multilevel single-linkage method to one-dimensional electromagnetic inverse scattering problem. IEEE Trans. Antennas Propag. 47(11), 1658–1668 (1999)
W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C, 2nd edn. (Cambridge University Press, Cambridge, 1992). Reprinted 1997
A.H.G. Rinnooy Kan, G.T. Timmer, Stochastic global optimization methods part II: Multi level methods. Math. Program. 39, 57–78 (1987)
A.H.G. Rinnooy Kan, G.T. Timmer, Stochastic global optimization methods part I: Clustering methods. Math. Program. 39, 27–56 (1987)
H.A. Sabbagh, R. Kim Murphy, E.H. Sabbagh, J.C. Aldrin, J.S. Knopp, Computational Electromagnetics and Model-Based Inversion: A Modern Paradigm for Eddy-Current Nondestructive Evaluation (Springer, New York, 2013)
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Sabbagh, H.A., Kim Murphy, R., Sabbagh, E.H., Zhou, L., Wincheski, R. (2021). Stochastic Inverse Problems: Models and Metrics. In: Advanced Electromagnetic Models for Materials Characterization and Nondestructive Evaluation. Scientific Computation. Springer, Cham. https://doi.org/10.1007/978-3-030-67956-9_6
DOI: https://doi.org/10.1007/978-3-030-67956-9_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67954-5
Online ISBN: 978-3-030-67956-9
eBook Packages: Physics and Astronomy (R0)