1 Introduction

As an important detection technique, non-destructive testing and evaluation (NDT&E) has been widely implemented in science and industry to evaluate the properties of a material, component or structure, especially in mechanical engineering and the automotive, aerospace, petrochemical and military industries (Cai et al. 2017; Huang et al. 2016; Islam et al. 2017; Lo et al. 2010). Particularly, eddy current pulsed thermography (ECPT), which combines the pulsed eddy current (PEC) technique with the thermographic NDT approach, is considered a promising, high-efficiency NDT&E approach for the non-contact inspection of surface or sub-surface defects in magnetic or non-magnetic conductive materials (Arjun et al. 2015; Gao et al. 2014; Li et al. 2016; Xu et al. 2016). Under the excitation of a transient magnetic field, the pulsed eddy current induced in the test specimen is continuously converted into Joule heat, which is distributed evenly near the surface of a defect-free conductor. Surface defects, however, alter the density distribution of the induced eddy current, leading to a significant difference in Joule heating between defect regions and normal regions. This thermal pattern can be captured as an infrared image sequence by an IR camera (Maldague et al. 2001). To handle the informative raw data in the infrared image sequence of ECPT, several mathematical methods for data acquisition, fusion and self-adaptation have been applied to deal with the instability and nonlinearity in ECPT data processing (Cao et al. 2016, 2017a, b; Yin et al. 2017a, b). Moreover, in the past few years, considerable effort has been devoted to enhancing infrared image contrast and suppressing noise interference by extracting and separating feature information from infrared image sequences (He et al. 2015), such as Fourier-transform-based amplitude and frequency feature extraction (Al-Ayyoube et al. 2017; Wang et al. 2015; Yala et al. 2017), principal component analysis (PCA) based discriminative defect pattern extraction (Bi et al. 2016; Chen et al. 2016; Li et al. 2015; Zuo et al. 2016) and independent component analysis (ICA) based discriminative defect pattern extraction (Luo et al. 2013; Omar et al. 2010; Xu et al. 2016).

However, the Fourier transform method suffers from the deficiency that the useful defect quantification information contained in the transient response is undesirably concealed (Chen et al. 2016). The main problems posed by PCA, meanwhile, are the indefinite physical meaning of each principal component and the absence of a criterion for defect detection, which make further defect information analysis difficult (Avdelidis et al. 2003). ICA is a widely used blind source separation algorithm that decomposes a multivariate signal into independent non-Gaussian signals. Moreover, ICA can automatically extract and highlight abnormal defect feature patterns from infrared image sequences in both the spatial and the time domains (Bousse et al. 2017; Mourad et al. 2017). In ICA, the acquired original information is considered the interactional result of several statistically independent components (ICs), and the object of ICA is to recover the unknown discriminative feature signals from the original mixed data source. Theoretically, the statistically independent ICs can be estimated by the de-mixing matrix, and each independent component (IC) carries a certain physical meaning. In practical image processing, both the whitening pre-processing of the original data and the iterative computation of the de-mixing matrix are necessary pre-procedures of ICA. However, in the absence of any prior information, a time-consuming global search over the whole data domain is inevitable for the iterative computation of the de-mixing matrix. Such an inefficient ICA-based data-processing technique falls short of the needs of practical applications, so it is necessary to design new, efficient data-processing techniques for ECPT (Cheng et al. 2016; Li et al. 2017; Ruhi et al. 2015).

To satisfy the above-mentioned requirement, an adaptive characteristic pick-up algorithm is proposed in this paper to extract discriminative defect pattern information and to improve the processing efficiency of thermal image sequences. The fundamental process of this new algorithm is as follows. Firstly, the algorithm divides the transient thermal responses (TTRs) contained in the thermal images into several parts by a thermal image segmentation technique and finds low-correlation TTRs by a variable interval search approach. A specific criterion is given to calculate the length of the region with the largest temperature variation, and the variable interval search is designed to decrease repetitive computation without losing typical TTRs. Secondly, the correlation distance is calculated to classify the acquired TTRs: each TTR is placed into the cluster whose center point has the smallest correlation distance to it. Thirdly, the largest sum of between-class distances is applied to seek the typical TTRs. For one TTR, the sum of its correlation distances to the other clusters is its between-class distance value, and the TTR with the largest between-class distance in each cluster is regarded as a typical one. Finally, the typical TTRs constitute a matrix that linearly transforms the initial image sequence, so that the discriminative features of the infrared image sequence can be extracted. In contrast to iterative calculation, the selection of known information is much more efficient and time-saving, which is why the typical TTRs are utilized to improve processing speed.

In this paper, both the theoretical illustration of ICA and the mathematical foundation of the new adaptive algorithm are presented. The ultimate goal of the new adaptive algorithm is the automatic identification of discriminative features together with improved detection efficiency. Experimental tests are carried out to demonstrate the advantages of the proposed approach. Based on the similarity between the typical TTRs of the proposed algorithm and the mixing vectors (obtained from the pseudo-inverse of the de-mixing matrix) in ICA, the effectiveness of the proposed approach is verified. By comparing the processing times of the proposed algorithm and the ICA method, the higher image processing efficiency of the proposed algorithm is confirmed as well.

The rest of this paper is organized as follows: Sect. 2 briefly introduces the basic procedure and deficiencies of ICA. Section 3 describes the theoretical considerations and realization of the new algorithm. Section 4 introduces the experimental design and the parameters of the two test specimens. Section 5 presents the experimental results. Section 6 gives the summary and outlook.

2 Illustration of ICA in ECPT

In order to better compare the ICA method with the proposed adaptive algorithm, it is necessary to first briefly introduce the fundamental theory of the ICA method in ECPT. In a practical application, because defects locally increase or decrease the eddy current density, different stimulated regions of the sample have different temperature variation rates, and all these spatial temperature responses are recorded as an infrared image sequence by an IR camera. These thermal responses cannot be separated directly by the infrared sensor, but they can be regarded as several independent feature regions with different typical thermal response characteristics, which helps to extract the different independent signal images (ISIs). On the basis of these considerations, the goal of ICA is to recover several independent signal images, based on independent components (ICs), from the blind source signals of the original infrared image sequence, as shown in Fig. 1 (Bai et al. 2013; Cheng et al. 2016). The number of typical feature regions (i.e. the number of ICs) is set manually by the researchers or operators according to personal experience; for example, four typical feature regions are defined in Fig. 1.

Fig. 1
figure 1

The typical feature areas of one thermal image

The basic mathematical model of ICA in ECPT can be presented by:

$$\begin{aligned} X^T (t) = {\bar{W}}Y'(t), \end{aligned}$$
(1)

in which \(Y'(t)\) represents the preprocessed initial data, \(X^T(t)\) denotes the ICs, and \({\bar{W}}\) is the de-mixing matrix. The pseudo-inverse of \({\bar{W}}\) is the mixing matrix A, which is built from mixing vectors. According to Bai et al. (2013) and Cheng et al. (2016), the mixing matrix A is similar to the typical features, denoted RE; hence, the calculation of A can be approximated and simplified by selecting RE from the initial data. Equation (1) can also be written as \(Y = AX^T\), in which Y, A and X can be further represented by:

$$\begin{aligned} Y= & {} \left[ {\begin{array}{*{20}c} {Y(1,1)} &{} {Y(1,2)} &{} {\cdots } &{} {Y(1,M*N)} \\ {Y(2,1)} &{} {Y(2,2)} &{} {\cdots } &{} {Y(2,M*N)} \\ \begin{array}{l} \vdots \\ \end{array} &{} \begin{array}{l} \vdots \\ \end{array} &{} \begin{array}{l} \ddots \\ \end{array} &{} \begin{array}{l} \vdots \\ \end{array} \\ {Y(Z,1)} &{} {Y(Z,2)} &{} {\cdots } &{} {Y(Z,M*N)} \\ \end{array}} \right] \nonumber \\= & {} {\left[ Y{(:,1)},\quad Y{(:,2)},\quad \cdots ,\quad Y(:,M*N)\right] ,}\end{aligned}$$
(2)
$$\begin{aligned} A= & {} \left[ {\begin{array}{*{20}c} {A(1,1)} &{} {A(1,2)} &{} {\cdots } &{} {A(1,L)} \\ {A(2,1)} &{} {A(2,2)} &{} {\cdots } &{} {A(2,L)} \\ \begin{array}{l} \vdots \\ \end{array} &{} \begin{array}{l} \vdots \\ \end{array} &{} \begin{array}{l} \ddots \\ \end{array} &{} \begin{array}{l} \vdots \\ \end{array} \\ {A(Z,1)} &{} {A(Z,2)} &{} {\cdots } &{} {A(Z,L)} \\ \end{array}}\right] \nonumber \\= & {} {\left[ A(:,1),\quad A(:,2),\quad \cdots ,\quad A(:,L)\right] ,}\end{aligned}$$
(3)
$$\begin{aligned} X= & {} \left[ {\begin{array}{*{20}c} {X(1,1)} &{} {X(1,2)} &{} {\cdots } &{} {X(1,L)} \\ {X(2,1)} &{} {X(2,2)} &{} {\cdots } &{} {X(2,L)} \\ \begin{array}{l} \vdots \\ \end{array} &{} \begin{array}{l} \vdots \\ \end{array} &{} \begin{array}{l} \ddots \\ \end{array} &{} \begin{array}{l} \vdots \\ \end{array} \\ {X(M*N,1)} &{} {X(M*N,2)} &{} {\cdots } &{} {X(M*N,L)} \\ \end{array}}\right] . \end{aligned}$$
(4)

It should be noted that \(Y(i,:)\,(i=1,2,\ldots ,Z)\) denotes the ith row of the image matrix Y, which represents the ith infrared image vectorized by splicing its columns, and Z is the number of thermal images along the t axis. Meanwhile, the jth column of Y can be expressed as \(Y(:,j)\,(j=1,2,\ldots ,M*N)\), where M and N respectively denote the numbers of pixels along the vertical and horizontal axes, determined by the sensor resolution of the infrared camera. Moreover, since the location of the testing sample is stationary, Y( : , j) is exactly the thermal response of the jth pixel. L denotes the total number of ICs (i.e. L represents the number of typical feature regions). Moreover, one can obtain:

$$\begin{aligned} \mathop Y\nolimits ^T =\left[ {\begin{array}{*{20}c} {X(1,1)} &{} {X(1,2)} &{} {\cdots } &{} {X(1,L)} \\ {X(2,1)} &{} {X(2,2)} &{} {\cdots } &{} {X(2,L)} \\ \begin{array}{l} \vdots \\ \end{array} &{} \begin{array}{l} \vdots \\ \end{array} &{} \begin{array}{l} \ddots \\ \end{array} &{} \begin{array}{l} \vdots \\ \end{array} \\ {X(M*N,1)} &{} {X(M*N,2)} &{} {\cdots } &{} {X(M*N,L)} \\ \end{array}} \right] \left[ {\begin{array}{*{20}c} {\mathop A\nolimits ^T (1,:)} \\ {\mathop A\nolimits ^T (2,:)} \\ \begin{array}{l} \vdots \\ \end{array} \\ {\mathop A\nolimits ^T (L,:)} \\ \end{array}} \right] . \end{aligned}$$
(5)

Therefore, the ith thermal response can be expressed by:

$$\begin{aligned} \mathop Y\nolimits ^T (i,:)= & {} X(i,1)\mathop A\nolimits ^T (1,:) + X(i,2)\mathop A\nolimits ^T (2,:) \nonumber \\&+ \cdots + X(i,L)\mathop A\nolimits ^T (L,:). \end{aligned}$$
(6)

Since A is similar to RE (i.e. \(A\approx {RE}\)), Eq. (6) can be rewritten as follows:

$$\begin{aligned} \mathop Y\nolimits ^T (i,:)&\approx X(i,1)\mathop {RE}\nolimits ^T (1,:) + X(i,2)\mathop {RE}\nolimits ^T (2,:) \nonumber \\&\quad + \cdots + X(i,L)\mathop {RE}\nolimits ^T (L,:). \end{aligned}$$
(7)

This means that Y can be expressed linearly through the vectors in RE. In other words, RE includes a maximal linearly independent subset of Y, so the typical features in Y can be preserved as much as possible in actual testing. The correlation degree between thermal responses in Y is determined by the Pearson Correlation Coefficient (PCC), and the thermal responses in Y with the smallest correlation degrees are selected to estimate RE. Hence, the number of linearly independent vectors is decided by the number of mixing vectors in A, since \(A\approx {RE}\).

In general, the mixing vectors in ICA encode the temperature distribution laws of the thermal image sequence, and the physical description of the ICs corresponds to the independent feature regions. However, in order to obtain the ICs, the initial data must be pre-processed by a whitening algorithm (Rao et al. 2005), and the de-mixing matrix \({\bar{W}}\) must be calculated iteratively. Hence, the computational cost of the ICA method is usually high, which slows down defect detection in real ECPT applications.

Since the TTRs reveal the spatial temperature distributions in the sample, the typical TTRs (RE) share with the mixing vectors in ICA a similar physical description of the thermal images. This means that only L typical TTRs, rather than all TTRs, are needed to represent the typical features of one thermal image sequence; extracting the typical TTRs therefore suffices for the feature extraction of thermal images. With this objective, a new algorithm focusing on typical TTR selection is proposed in this paper to avoid the iterative computation of the de-mixing matrix \({\bar{W}}\), which makes it more efficient than global iterative approaches. Meanwhile, by dispensing with the whitening procedure of ICA, the new algorithm reduces the processing time even further.

Remark 2.1

It should be mentioned that the number of typical feature areas in ICA (i.e. L) is set manually. In the following sections, the proposed algorithm will show how to choose the number of typical feature areas automatically, which avoids the negative influence of the human intervention that may exist in ICA for ECPT.

3 The proposed algorithm in ECPT

The following notations are used in the proposed algorithm: S represents the 3D matrix of the initial thermal image sequence. M denotes the number of rows in S, i.e. the number of pixels along the vertical axis. N denotes the number of columns in S, i.e. the number of pixels along the horizontal axis. Z represents the number of thermal images along the t axis. L is the number of classes. PCC stands for the Pearson Correlation Coefficient, calculated as \(\mathop {PCC}\nolimits _{X,Y} = \frac{{C(X,Y)}}{{\sqrt{var(X)var(Y)} }}\), where C(X, Y) denotes the covariance of vectors X and Y, and var(·) denotes the variance.
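The notation above can be made concrete with two short helpers (a minimal Python/NumPy sketch; the function names are ours, not from the paper):

```python
import numpy as np

def pcc(x, y):
    """Pearson Correlation Coefficient PCC_{X,Y} = C(X,Y) / sqrt(var(X)var(Y))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))  # covariance C(X, Y)
    return cov / np.sqrt(x.var() * y.var())

def corr_distance(x, y):
    """Correlation distance used later for clustering: dis(x, y) = 1 - PCC."""
    return 1.0 - pcc(x, y)
```

Two TTRs with the same variation trend give a PCC near 1 and a correlation distance near 0, while opposite trends give a distance near 2.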

The proposed algorithm is arranged as follows: Step 1 gives the basic concept. Steps 2 to 4 realize the thermal image segmentation and the calculation of the column interval and the variable row intervals. Step 5 shows the process of the variable interval search. Step 6 performs the correlation-distance cluster analysis using the K-means technique. Step 7 describes the between-class-distance-based typical TTR selection approach. Step 8 shows the linear transformation for typical feature extraction using the matrix composed of the typical TTRs selected in Step 7.

Step 1: For each thermal image of the initial thermal image sequence, S(i, j,  : ) represents the TTR of the pixel in the ith row and jth column, where the third index runs over time along the t axis. Hence, the vectors S(i, j,  : ) cover every pixel's transient thermal response in the thermal image sequence.

Step 2: Let \(LP = \max \nolimits _{m,n,z} S(m,n,z)\), \((m = 1,\ldots ,M;\; n = 1,\ldots ,N;\; z = 1,\ldots ,Z)\). \(\mathop I\nolimits _{LP}\), \(\mathop J\nolimits _{LP}\) and \(\mathop T\nolimits _{LP}\) denote respectively the vertical coordinate, the horizontal coordinate and the t coordinate of LP. To find the length (i.e. the number of pixels along the horizontal axis) of the area containing the largest temperature variation, the PCCs between \(S(I_{LP} ,J_{LP} ,:)\) and \(S(I_{LP} ,j ,:)\), \((j=1,2,\ldots ,J_{LP}-1,J_{LP}+1,\ldots ,N)\), are computed until the PCC drops below the threshold \(Ref_{CL}\); that is, \(Ref_{CL}\) delimits the area with the largest temperature variation. The number of vectors \(S(I_{LP} ,j ,:)\) whose PCC with \(S(I_{LP} ,J_{LP} ,:)\) is greater than \(Ref_{CL}\) is recorded as the column interval value CL.
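Step 2 can be sketched as follows (an illustrative NumPy implementation under our own naming; constant TTRs are skipped so that the PCC stays defined):

```python
import numpy as np

def column_interval(S, ref_cl=0.95):
    """Step 2 sketch: find the global peak LP of S (an M x N x Z array) and
    count the TTRs in the same row whose PCC with the peak TTR exceeds
    ref_cl; that count is the column interval CL."""
    def pcc(a, b):
        return np.corrcoef(a, b)[0, 1]
    i_lp, j_lp, _ = np.unravel_index(np.argmax(S), S.shape)
    peak_ttr = S[i_lp, j_lp, :]
    cl = 0
    for j in range(S.shape[1]):
        if j == j_lp:
            continue
        other = S[i_lp, j, :]
        # skip constant TTRs, for which the PCC is undefined
        if other.std() > 0 and pcc(other, peak_ttr) > ref_cl:
            cl += 1
    return cl, (i_lp, j_lp)
```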

Step 3: Set \(K\,(K=1,2,3,\ldots )\) time thresholds \(T(k)\,(k=1,2,\ldots ,K)\) in descending order. The peak time of the \(i^{th}\) TTR is recorded as \(t^i_{peak}\,(i=1,2,\ldots ,M*N)\). By comparing \(T(k)\,(k=1,2,\ldots ,K)\) with \(t^i_{peak}\), the TTRs are divided into \(K+1\) data blocks. The TTR of the \(k^{th}\,(k=1,2,\ldots ,K+1)\) data block in the \(m^{th}\) row and \(n^{th}\,(n=1,1+CL,1+2*CL,\ldots ,N)\) column is recorded as \(S^k(m,n,:)\).
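The block assignment of Step 3 can be sketched as below. The paper does not spell out the exact comparison rule, so the boundary handling here is our assumption:

```python
import numpy as np

def split_into_blocks(t_peak, thresholds):
    """Step 3 sketch: compare each TTR's peak time with the descending
    thresholds T(1) > ... > T(K) to form K+1 data blocks.  One plausible
    rule (the boundary handling is an assumption): block 1 holds peaks
    >= T(1), block k holds T(k-1) > peak >= T(k), and block K+1 holds
    peaks below T(K)."""
    th = np.asarray(thresholds, dtype=float)          # descending order
    # number of thresholds lying above each peak time gives block index - 1
    return (np.asarray(t_peak)[..., None] < th).sum(axis=-1) + 1
```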

Step 4: Calculate \(PV^k_{n} = \max \nolimits _{m,z} S^k(m,n,z)\), \((m = 1,\ldots ,M;\; z = 1,\ldots ,Z;\; k=1,2,\ldots ,K+1;\; n=1,1+CL,1+2*CL,\ldots ,N)\). \(I^k_{n}\), \(J^k_{n}\) and \(T^k_{n}\) denote respectively the vertical coordinate, the horizontal coordinate and the t coordinate of \(PV^k_{n}\). Compute the PCC between \(S^k(I^k_{n},J^k_{n},:)\) and \(S^k(i,J^k_{n},:)\,(i=1,2,\ldots ,M)\) until the PCC drops below the threshold \(REFR^k\,(k=1,2,\ldots ,K+1)\). The number of vectors \(S^k(i,J^k_{n},:)\) whose PCC is greater than \(REFR^k\) is denoted by \(RL^k_n\), the row interval value of the \(k^{th}\) data block in the \(n^{th}\) column.

Remark 3.1

From the characteristics of the thermal images, CL and \(RL^k_n\,(k=1,2,\ldots ,K+1;\; n=1,1+CL,1+2*CL,\ldots ,N)\) are chosen to reduce the repeated calculation of PCCs from Step 2 to Step 4. To avoid the loss of significant features, the criterion for setting the interval values is important. For the fixed column interval CL, one appropriate method is to seek the length of the area with the largest temperature variation, since the TTR with the largest peak value usually lies around that area. Hence, in Step 2, the coordinates of LP (i.e. the largest peak value in S) are applied to seek the length of the area with the largest temperature variation. On the other hand, for the variable row intervals \(RL^k_n\), the setting principle is similar to that of CL; the difference is that the PCCs are computed between two TTRs in the same data block and the same column, and the row intervals are set per data block and per column. Hence, the algorithm can find all the typical temperature variations, which are called the typical features of the image sequence.

Step 5: Set the threshold value CC. Compute the PCC between two TTRs separated by the intervals. X( : , 1), the TTR with the largest peak value, is chosen as the starting point of the loop. The specific calculation process is shown in Fig. 2:

(a):

Compute the PCC of \(S^k(i,j,:)\) and X( : , z), where X( : , z) denotes the most recently saved TTR, whose PCC with \(X(:,z-1)\) is less than the threshold CC.

(b):

If \(PCC<CC\), \(S^k(i,j,:)\) is considered a new feature, since its correlation with X( : , z) is low. Then let \(z=z+1\) and \(X(:,z)= S^k(i,j,:)\) (save the new feature). Otherwise (i.e. \(PCC \ge CC\)), let \(i=i+ RL^k_n\), where \(RL^k_n\) is updated whenever k in \(S^k(i,j,:)\) or the horizontal coordinate n changes, and compute the PCC of the next TTR with X( : , z).

(c):

If \(i>M\), i.e. the row number exceeds the total number of rows, let \(i=i-M\) and move to column \(j+CL\).

(d):

If \(j>N\), the specific calculation process is finished.

Fig. 2
figure 2

The specific calculation process in Step 5
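The search loop of Step 5 can be sketched as follows. For brevity this sketch uses a single fixed row stride instead of the per-block, per-column intervals \(RL^k_n\); that simplification, and the function name, are ours:

```python
import numpy as np

def variable_interval_search(S, cl, row_interval, cc=0.6):
    """Step 5 sketch: scan the M x N x Z sequence S with column stride cl,
    saving a TTR as a new feature whenever its PCC with the most recently
    saved feature falls below the threshold cc.  The scan starts from the
    TTR holding the global peak value."""
    def pcc(a, b):
        return np.corrcoef(a, b)[0, 1]
    i_lp, j_lp, _ = np.unravel_index(np.argmax(S), S.shape)
    features = [S[i_lp, j_lp, :]]                  # X(:, 1): largest TTR
    for j in range(0, S.shape[1], cl):             # column stride CL
        for i in range(0, S.shape[0], row_interval):
            ttr = S[i, j, :]
            # PCC < CC: low correlation with the last feature -> keep it
            if ttr.std() > 0 and pcc(ttr, features[-1]) < cc:
                features.append(ttr)
    return np.column_stack(features)               # one typical TTR per column
```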

Remark 3.2

\(Ref_{CL}\) in Step 2 and \(REFR^k\) in Step 4 are thresholds that help to find the length of the area with the largest temperature variation in one data block. To preserve the important TTRs, \(Ref_{CL}\) and \(REFR^k\) are always chosen to be larger than 0.9. Moreover, T(k) is applied to split the TTRs into several parts. The threshold CC in Step 5, by contrast, is defined to be smaller than 0.9: if the PCC of two TTRs is larger than CC, the two TTRs are considered similar and only one of them is reserved.

Step 6: The TTRs in \(X(:,z)\,(z=1,2,\ldots ,G)\) are classified through an adaptive cluster algorithm based on K-means (the K-means method is widely used for cluster analysis in data mining; Chan et al. 2016; He et al. 2016). X( : , z) stores the values of the TTRs, and the total number of TTRs in X is denoted by G. \(\mathop {sum}\nolimits ^i\) and \(\mathop {max}\nolimits ^i\) are the sum of the inner-class distances and the maximum between-class distance when \(L=i\), respectively. The clustering number L is determined by the variation speed of \(\mathop {sum}\nolimits ^i\) and \(\mathop {max}\nolimits ^i\). \(\mathop {IP}\nolimits ^i\,(i = 1,2, \ldots ,L)\) is the \(i^{th}\) initial point. dis(x, y) is the correlation distance of vector x and vector y, that is, \(dis(x,y)=1-PCC\). \(n(m)\,(m=1, 2,\ldots , L)\) is the total number of TTRs in cluster m. As shown in Fig. 3, the adaptive cluster selection is implemented as follows:

  • (a) At first, initialize the cluster number \(L=1\) and the cycle index \(k=1\).

  • (b) Define the initial conditions \(\mathop {IP}\nolimits ^1 = X(:,1)\), \(\mathop {Cen}\nolimits _k^1= \mathop { IP}\nolimits ^1\), \(\mathop {sum}\nolimits ^1= \sum \nolimits _{z = 1,\ldots ,G} {dis(X(:,z),\mathop {Cen}\nolimits ^1 )}\) and \(\mathop {max}\nolimits ^1= 0\), in which \(\mathop {Cen}\nolimits ^1 = \frac{1}{G}\sum \nolimits _{z = 1,\ldots ,G} X (:,z)\). Moreover, set \(\mathop {IP}\nolimits ^2\) as the TTR X( : , y) that maximizes \(dis(\mathop {Cen}\nolimits _k^1 ,X(:,y))\) over \(y = 2,\ldots ,G\), and let \(\mathop {Cen}\nolimits _k^2 = \mathop {IP}\nolimits ^2\).

  • (c) According to the K-means algorithm, the following substeps (c1)–(c3) are used to cluster X( : , z):

    • (c1) Let \(n(m)=0\,(m=1, 2,\ldots , L)\). If \(\mathop D\nolimits _z^m =\mathop {\min }\nolimits _{i = 1,\ldots ,L} dis(X(:,z),\mathop {Cen}\nolimits _k^i )\,(z=1,2,\ldots ,G)\), then X( : , z) belongs to the \(m^{th}\) cluster. Next, let \(n(m)=n(m)+1\) and \(\mathop {TX}\nolimits ^m (:,n(m))= X(:,z)\).

    • (c2) Moreover, let \(k=k+1\) and calculate \(\mathop {Cen}\nolimits _k^m = \frac{1}{{n(m)}}\sum \nolimits _{j = 1,\ldots ,n(m)} {\mathop {TX}\nolimits ^m (:,j)}\), \((m=1,2,\ldots ,L)\). If \(\mathop {Cen}\nolimits _k^m = \mathop {Cen}\nolimits _{k - 1}^m\), the clustering of \(X(:,z)\,(z=1,2,\ldots ,G)\) has converged, so go to (c3); else, go to (c1).

    • (c3) Then record the clustering results: (1) \(\mathop X\nolimits ^m (:,z)\) denotes that X( : , z) belongs to the \(m^{th}\) cluster \((m=1,2,\ldots , L)\); (2) k(m) is the total number of TTRs in the \(m^{th}\) cluster; (3) \(\mathop {Cen}\nolimits ^m\) is the final center point of the \(m^{th}\) cluster, where \(m=1,2,\ldots ,L\).

  • (d) Compute \(\mathop {\max }\nolimits ^L = \mathop {\max }\nolimits _{z = 1,\ldots ,G}\, \mathop {\max }\nolimits _{j = 1,\ldots ,m - 1,m + 1,\ldots ,L} dis(\mathop X\nolimits ^m (:,z),\mathop {Cen}\nolimits ^j )\) and \(\mathop {sum}\nolimits ^L = \sum \nolimits _{m = 1,\ldots ,L} \sum \nolimits _{z = 1,\ldots ,k(m)} dis(\mathop X\nolimits ^m (:,z),\mathop {Cen}\nolimits ^m )\). Furthermore, record \(\mathop {\max }\nolimits ^L\) and \(\mathop {sum}\nolimits ^L\).

  • (e) If \(L>3\), go to (f); else, go to (g).

  • (f) If \(\displaystyle \left| \frac{{|\mathop {sum}\nolimits ^{i - 1} - \mathop {sum}\nolimits ^i | - |\mathop {sum}\nolimits ^i - \mathop {sum}\nolimits ^{i + 1} |}}{{|\mathop {sum}\nolimits ^i - \mathop {sum}\nolimits ^{i + 1} | - |\mathop {sum}\nolimits ^{i + 1} - \mathop {sum}\nolimits ^{i + 2} |}}\right|> 2\,(i=2,\ldots ,L-2)\), go to (h); else, go to (g).

  • (g) Let \(L=L+1\) and set \(\mathop {IP}\nolimits ^L\) as the TTR \(\mathop X\nolimits ^m (:,y)\) that maximizes \(dis(\mathop {Cen}\nolimits ^m ,\mathop X\nolimits ^m (:,y))\) over \(m = 1,\ldots ,L\) and \(y = 2,\ldots ,k(m)\). Initialize \(k=1\); then go to (c).

  • (h) If \(\mathop {\max }\nolimits ^{i + 1} - \mathop {\max }\nolimits ^i \le 0.05\), the final cluster number of TTRs is set to \(i-1\) and the clustering stops; else, let \(i=i+1\) and go to (k).

  • (k) If \(i<L\), go to (h); else, go to (g).

Fig. 3
figure 3

The adaptive cluster number selection in Step 6
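A condensed sketch of the clustering core of Step 6 is given below. It keeps the correlation distance and a farthest-point variant of the paper's initialization, but omits the adaptive stopping rule on \(sum^i\) and \(max^i\) (L is passed in directly); all names are ours:

```python
import numpy as np

def corr_dist(x, y):
    """dis(x, y) = 1 - PCC, as defined in Step 6."""
    return 1.0 - np.corrcoef(x, y)[0, 1]

def kmeans_corr(X, L, max_iter=50):
    """K-means on the TTR columns of X (Z x G) with correlation distance.
    The first centre is X(:, 1); each further initial point is the TTR
    farthest from the existing centres."""
    G = X.shape[1]
    centers = [X[:, 0]]
    while len(centers) < L:
        d = [min(corr_dist(X[:, z], c) for c in centers) for z in range(G)]
        centers.append(X[:, int(np.argmax(d))])
    labels = np.zeros(G, dtype=int)
    for _ in range(max_iter):
        labels = np.array([int(np.argmin([corr_dist(X[:, z], c)
                                          for c in centers]))
                           for z in range(G)])
        new_centers = [X[:, labels == m].mean(axis=1) for m in range(L)]
        if all(np.allclose(a, b) for a, b in zip(centers, new_centers)):
            break                       # centres unchanged: converged
        centers = new_centers
    # sum of inner-class distances, used by the adaptive stopping rule
    sum_inner = sum(corr_dist(X[:, z], centers[labels[z]]) for z in range(G))
    return labels, centers, sum_inner
```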

Remark 3.3

The first initial point is X( : , 1), the TTR with the largest peak value; define \(\mathop {IP}\nolimits ^1 = X(:,1)\). The second initial point \(\mathop {IP}\nolimits ^2\) is the TTR attaining \(\mathop {\max }\nolimits _{y = 2,\ldots ,G} dis(\mathop {IP}\nolimits ^1 ,X(:,y))\). With Step 6, all TTRs can then be divided into two clusters. The TTR with the largest inner-class distance is chosen as the third initial point \(\mathop {IP}\nolimits ^3\), and Step 6 is repeated to classify the TTRs into three clusters; the TTR with the largest inner-class distance then gives the fourth initial point \(\mathop {IP}\nolimits ^4\), and more initial points can be found by repeating this procedure. This initialization does not increase the computational complexity of the method, is more stable than random initial points, and allows all TTRs to be classified suitably. In fact, L is set through the adaptive cluster algorithm based on K-means in Step 6. Unlike ICA in Gao et al. (2014) and Bai et al. (2013), where the number of typical feature regions (i.e. the number of ICs) is set manually by researchers or operators according to personal experience, the proposed algorithm finds the number of typical feature areas automatically.

Step 7: Select the L final representative transient thermal responses from the L clusters.

First, the between-class distance of \(TTR_j\) in the \(t^{th}\) class is defined as \(\mathop {MP}\nolimits _{\mathop j\nolimits ^t } = \sum \nolimits _{m = 1,\ldots ,t - 1,t + 1,\ldots ,L} {\mathop {MPCC}\nolimits _{\mathop j\nolimits ^t }^m }\), in which \(\mathop {MPCC}\nolimits _{\mathop j\nolimits ^t }^m \,(j=1, 2,\ldots , K(t))\) denotes the correlation distance between \(\mathop {Cen}\nolimits ^m\) and \(\mathop {CLU}\nolimits _j^t \,(m=1,2,\ldots ,t-1, t+1,\ldots ,L;\ t=1,2,\ldots ,L)\). Here \(\mathop j\nolimits ^t\) indexes the \(j^{th}\) TTR in the \(t^{th}\) class, K(t) is the total number of TTRs in the \(t^{th}\) class, and \(\mathop {CLU}\nolimits _j^t\) denotes the \(j^{th}\) TTR of the \(t^{th}\) cluster.

Next, calculate \(\mathop {RE}\nolimits ^t = \mathop {\max }\nolimits _{j = 1,\ldots ,K(t)} \mathop {MP}\nolimits _{\mathop j\nolimits ^t }\) in the \(t^{th}\) \((t=1,2,\ldots ,L)\) class, and define the TTR attaining \(\mathop {RE}\nolimits ^t\) as the final representative TTR of the \(t^{th}\) class. The TTR with \(\mathop {RE}\nolimits ^t\) \((t=1,2,\ldots ,L)\) is stored in \(Y(:,t)\,(t=1, 2,\ldots , L)\).
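Step 7 can be sketched as follows (illustrative only; `clusters` is a list of per-cluster TTR lists and `centers` the list of final cluster centres from Step 6):

```python
import numpy as np

def select_typical_ttrs(clusters, centers):
    """Step 7 sketch: in every cluster t, compute each member's
    between-class distance MP (sum of correlation distances to the other
    clusters' centres) and keep the member with the largest MP."""
    def corr_dist(x, y):
        return 1.0 - np.corrcoef(x, y)[0, 1]
    typical = []
    for t, members in enumerate(clusters):
        mp = [sum(corr_dist(ttr, c) for m, c in enumerate(centers) if m != t)
              for ttr in members]
        typical.append(members[int(np.argmax(mp))])
    return np.column_stack(typical)    # matrix Y of final representative TTRs
```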

Remark 3.4

The final purpose of the proposed algorithm is to select the typical thermal responses, which have small correlation values with each other in ECPT. The larger the between-class distance of a response, the greater its difference from the others.

Step 8: Transform the 3D initial image sequence matrix S into a 2D matrix O, where the elements in one row of O are taken columnwise from \(S(:,:,p),\, p=1,2,\ldots ,Z\). Then calculate \({\hat{X}}\) and apply the linear transformation:

$$\begin{aligned} R={\hat{X}} * O, \end{aligned}$$
(8)

in which \({\hat{X}}\) denotes the pseudo-inverse matrix of X, and R represents the result of the proposed method, containing the features of the initial image sequence extracted by the proposed algorithm.
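A minimal sketch of Step 8, assuming S is stored as an M x N x Z NumPy array and X stacks the L typical TTRs as its columns:

```python
import numpy as np

def extract_features(S, X):
    """Step 8 sketch: each row of O is one frame S(:, :, p) taken
    columnwise; R = pinv(X) @ O projects the sequence onto the typical
    TTRs stacked as the columns of X (Z x L)."""
    Z = S.shape[2]
    O = np.stack([S[:, :, p].flatten(order="F") for p in range(Z)])  # Z x MN
    return np.linalg.pinv(X) @ O           # R: one feature image per row
```

Each row of R can be reshaped back to M x N (columnwise) to display one extracted feature image.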

Remark 3.5

In ICA, the number of main features (L) must be given manually (Bai et al. 2013; Gao et al. 2014), which affects the efficiency and accuracy of the feature extraction results. That is, the artificial parameter decision in ICA (i.e. choosing the number of typical feature regions) is time-consuming, and the accuracy and consistency of the method are hard to guarantee. In contrast, this problem is well addressed in the proposed algorithm by clustering the transient thermal responses. The proposed method aims at selecting RE, which is estimated from the initial thermal responses by their degree of non-correlation. That is, L equals the proper cluster number once the initial thermal responses have been clustered appropriately. The K-means algorithm is used to cluster the transient thermal responses, and the maximum inner-cluster distance is recorded for different values of L (\(L>1\)). The value of L is determined by the variation speed of the sum of the inner-class distances and the maximum between-class distance.

Moreover, compared with ICA in ECPT, the proposed algorithm reduces the processing time because: (1) the data whitening procedure is omitted in the proposed algorithm (the whitening pre-procedure in ICA is time-consuming); (2) the nonlinear update \(w_p(k+1)=E(Zg(w^T_p(k)Z))-E(g'(w^T_p(k)Z))w_p(k)\) must be performed repeatedly in ICA, where \(g(\cdot )\) is a nonlinear function and \(g'(\cdot )\) is its derivative. Such time-consuming nonlinear calculation is avoided in the proposed algorithm, which helps to speed up the search.
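For comparison, the repeated nonlinear update above is what a one-unit FastICA iteration performs on whitened data (an illustrative NumPy sketch with \(g=\tanh\); not the authors' code):

```python
import numpy as np

def fastica_unit(Z, n_iter=200, tol=1e-6, seed=0):
    """One-unit FastICA iteration on whitened data Z (dims x samples):
    w <- E[Z g(w^T Z)] - E[g'(w^T Z)] w  with g = tanh, renormalized each
    step.  This repeated nonlinear update (plus the whitening of Z itself)
    is the cost the proposed TTR-selection algorithm avoids."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(Z.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        wz = w @ Z                                   # w^T Z for every sample
        g, g_prime = np.tanh(wz), 1.0 - np.tanh(wz) ** 2
        w_new = (Z * g).mean(axis=1) - g_prime.mean() * w
        w_new /= np.linalg.norm(w_new)
        if abs(abs(w_new @ w) - 1.0) < tol:          # converged up to sign
            return w_new
        w = w_new
    return w
```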

4 Experiment setup

ECPT utilizes the eddy current induced in a material for defect detection. In our experiments, the eddy current is induced and the surface temperature of the sample is recorded at the same time. The experimental schematic diagram is displayed in Fig. 4 and consists of five functional units. The induction heater produces a high-frequency alternating current, which is applied for coil excitation. A rectangular coil is located at the back of the sample and heats the sample by applying directional excitation. The IR camera records the thermal distribution of the sample surface. The heating time is set to 0.1 s per inspection, which is long enough to produce a usable temperature distribution pattern.

Fig. 4
figure 4

The experimental schematic diagram

A steel sample (sample 1) with a slot of 10 mm length and 2 mm width is used in the experiment, as displayed in Fig. 5, which also shows the thermal image sequence of sample 1 with the slot. The thermal image sequence continuously records the surface transient thermal responses of the sample. The image at the end of heating (0.1 s) is marked with four positions, which correspond to four independent typical thermal response areas, respectively. Figure 6a shows another steel sample (sample 2) with a hole of 3 mm diameter, and Fig. 6b presents a thermal image of sample 2, which has two independent typical positions.

Fig. 5
figure 5

The infrared thermal image sequence of the sample 1 with the defect

Fig. 6
figure 6

a The sample 2 with a hole; b one image of the sequence of the sample 2

5 Experimental results

Example 1

The TTRs with different variation trends have been recorded to characterize the discriminative information of the defects. For sample 1, the time range of the thermal image sequence is 0.53 s. Two time thresholds are set as \(T(1)=0.03s\) and \(T(2)=0.06s\). Let \(Ref_{CL}=0.95\); the number of PCCs greater than \(Ref_{CL}\) is 12. The transient thermal responses are divided into three parts, with \(\mathop {REFR}\nolimits ^1=0.94\), \(\mathop {REFR}\nolimits ^2=0.94\) and \(\mathop {REFR}\nolimits ^3=0.94\). \(RL^k_n\) \((k=1,2,3;\ n=1,13,25,\ldots ,313)\) equals the number of TTRs whose temperature variations are similar within one data block and one column. Thirteen thermal responses are selected by Step 3 with \(CC=0.6\).
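As a rough illustration of how a correlation-threshold screening of this kind could operate, the sketch below encodes one hypothetical reading of Step 3; the greedy rule, the function name and the use of the raw (signed) PCC are our assumptions, not the paper's exact procedure.

```python
import numpy as np

def select_diverse_ttrs(ttrs, cc=0.6):
    """Greedy correlation-threshold screening of transient thermal responses.

    Hypothetical reading of Step 3: keep a candidate TTR only if its Pearson
    correlation coefficient with every already-selected TTR stays below the
    threshold `cc`, so that the kept responses represent distinct temperature
    variation trends. `ttrs` is (n_candidates, n_frames).
    """
    selected = [0]                      # always keep the first candidate
    for z in range(1, ttrs.shape[0]):
        pccs = [np.corrcoef(ttrs[z], ttrs[s])[0, 1] for s in selected]
        if max(pccs) < cc:
            selected.append(z)
    return selected
```

With this rule, a response nearly identical to an already-kept one is discarded, while a response with a clearly different trend (low or negative correlation) is retained.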

Next, the 13 TTRs should be classified by Step 6. Specifically, the following details show the realization of the adaptive cluster-number selection. Firstly, set the cluster number \(L=1\) and take \(TTR_1\) as the first initial center point \(\mathop {IP}\nolimits ^1\). Moreover, one can derive \(\mathop {sum}\nolimits ^1 = \sum \nolimits _{m = 1} {\sum \nolimits _{z = 1,\ldots ,13} {dis(\mathop X\nolimits ^1 (:,z),\mathop {Cen}\nolimits ^1 )} }=2.1262\) and \(\mathop {\max }\nolimits ^1=0\).

Secondly, set \(L=2\). The second initial center point \(\mathop {IP}\nolimits ^2\) is \(TTR_4\), which has the largest inner-class distance. Then, let \(\mathop {Cen}\nolimits _1^1 \mathop { = IP}\nolimits ^1\), \(\mathop {Cen}\nolimits _1^2 \mathop { = IP}\nolimits ^2\), and compute \(\mathop D\nolimits _z^m=\mathop {\min }\nolimits _{i = 1,2} dis(X(:,z),\mathop {Cen}\nolimits _k^i )\), \(z=1,2,\ldots ,13\). According to \(\mathop D\nolimits _z^m\), put \(TTR_z\) into the \(m^{th}\) cluster, whose center point \(\mathop {Cen}\nolimits _1^m\) has the smallest distance to X( : , z). The 13 TTRs are thus classified into 2 clusters. Record \(\mathop {sum}\nolimits ^2 = \sum \nolimits _{m = 1,2} {\sum \nolimits _{z = 1,\ldots ,k(m)} {dis(\mathop X\nolimits ^m (:,z),\mathop {Cen}\nolimits ^m )} }=0.8515\) and \(\mathop {\max }\nolimits ^2=1.0289\).

Thirdly, set \(L=3\). The third initial center point \(\mathop {IP}\nolimits ^3\) is \(TTR_6\), which has the largest inner-class distance. Then, let \(\mathop {Cen}\nolimits _1^1 \mathop { = IP}\nolimits ^1\), \(\mathop {Cen}\nolimits _1^2 \mathop { = IP}\nolimits ^2\) and \(\mathop {Cen}\nolimits _1^3 \mathop { = IP}\nolimits ^3\). Furthermore, compute \(\mathop D\nolimits _z^m=\mathop {\min }\nolimits _{i = 1,2,3} dis(X(:,z),\mathop {Cen}\nolimits _k^i )\), \(z=1,2,\ldots ,13\). According to \(\mathop D\nolimits _z^m\), put \(TTR_z\) into the \(m^{th}\) cluster, whose center point \(\mathop {Cen}\nolimits _1^m\) has the smallest distance to X( : , z). The 13 TTRs are thus classified into 3 clusters. Record \(\mathop {sum}\nolimits ^3 = \sum \nolimits _{m = 1,2,3} {\sum \nolimits _{z = 1,\ldots ,k(m)} {dis(\mathop X\nolimits ^m (:,z),\mathop {Cen}\nolimits ^m)} }=0.2790\) and \(\mathop {\max }\nolimits ^3 =1.3051\).

Fourthly, set \(L=4\). The fourth initial center point \(IP^4\) is \(TTR_2\), which has the largest inner-class distance. Then, let \(\mathop {Cen}\nolimits _1^1 \mathop { = IP}\nolimits ^1\), \(\mathop {Cen}\nolimits _1^2 \mathop { = IP}\nolimits ^2\), \(\mathop {Cen}\nolimits _1^3 \mathop { = IP}\nolimits ^3\) and \(\mathop {Cen}\nolimits _1^4 \mathop { = IP}\nolimits ^4\). Compute \(\mathop D\nolimits _z^m=\mathop {\min }\nolimits _{i = 1,2,3,4} dis(X(:,z),\mathop {Cen}\nolimits _k^i )\), \(z=1,2,\ldots ,13\). According to \(\mathop D\nolimits _z^m\), put \(TTR_z\) into the \(m^{th}\) cluster, whose center point \(\mathop {Cen}\nolimits _1^m\) has the smallest distance to X( : , z). The 13 TTRs are thus classified into 4 clusters. Record \(\mathop {sum}\nolimits ^4 = \sum \nolimits _{m = 1,2,3,4} {\sum \nolimits _{z = 1,\ldots ,k(m)} {dis(\mathop X\nolimits ^m (:,z),\mathop {Cen}\nolimits ^m )} } =0.0970\) and \(\mathop {\max }\nolimits ^4 =1.3894\). Then, \(\displaystyle \left| \frac{{|\mathop {sum}\nolimits ^1 - \mathop {sum}\nolimits ^2 | - |\mathop {sum}\nolimits ^2 - \mathop {sum}\nolimits ^3 |}}{{|\mathop {sum}\nolimits ^2 - \mathop {sum}\nolimits ^3 | - |\mathop {sum}\nolimits ^3 - \mathop {sum}\nolimits ^4 |}}\right| =1.7982<2\) (i.e. \(i=2\)). Hence, continue to cluster the TTRs. Set \(L=5\); the fifth initial center point \(IP^5\) is \(TTR_{13}\), which has the largest inner-class distance. Then, let \(\mathop {Cen}\nolimits _1^1 \mathop { = IP}\nolimits ^1\), \(\mathop {Cen}\nolimits _1^2 \mathop { = IP}\nolimits ^2\), \(\mathop {Cen}\nolimits _1^3 \mathop { = IP}\nolimits ^3\), \(\mathop {Cen}\nolimits _1^4 \mathop { = IP}\nolimits ^4\) and \(\mathop {Cen}\nolimits _1^5 \mathop { = IP}\nolimits ^5\). Moreover, calculate \(\mathop D\nolimits _z^m=\mathop {\min }\nolimits _{i = 1,2,3,4,5} dis(X(:,z), \mathop {Cen}\nolimits _k^i )\), \(z=1,2,\ldots ,13\). According to \(\mathop D\nolimits _z^m\), put \(TTR_z\) into the \(m^{th}\) cluster, whose center point \(\mathop {Cen}\nolimits _1^m\) has the smallest distance to X( : , z). The 13 TTRs are thus classified into 5 clusters. Then, record \(\mathop {sum}\nolimits ^5 = \sum \nolimits _{m = 1,2,3,4,5} {\sum \nolimits _{z = 1,\ldots ,k(m)} {dis(\mathop X\nolimits ^m (:,z),\mathop {Cen}\nolimits ^m )} }=0.0749\) and \(\mathop {\max }\nolimits ^5 =1.3894\). Then, \(\displaystyle \left| \frac{{|\mathop {sum}\nolimits ^2 - \mathop {sum}\nolimits ^3 | - |\mathop {sum}\nolimits ^3 - \mathop {sum}\nolimits ^4 |}}{{|\mathop {sum}\nolimits ^3 - \mathop {sum}\nolimits ^4 | - |\mathop {sum}\nolimits ^4 - \mathop {sum}\nolimits ^5 |}}\right| =2.4422>2\) (\(i=3\)). The supplementary condition is then checked: \(\mathop {\max }\nolimits ^4-\mathop {\max }\nolimits ^3=0.0843\) (\(i=3\)) is not less than 0.05. Then, set \(i=4\): \(\mathop {\max }\nolimits ^5-\mathop {\max }\nolimits ^4=0<0.01\). Hence, the cluster number L is selected as 4. The variation trends of \(sum^i\) and \(\mathop {\max }\nolimits ^i\) with different cluster numbers L are shown in Figs. 7 and 8, respectively. Therefore, the 13 TTRs are divided into 4 clusters. According to the correlation distance, every TTR is put into the cluster whose center point has the smallest correlation distance to it. Finally, \(TTR_1\) is in the first cluster; \(TTR_4\), \(TTR_6\) and \(TTR_{12}\) are in the second cluster; \(TTR_3\), \(TTR_5\), \(TTR_7\), \(TTR_9\) and \(TTR_{13}\) are in the third cluster; \(TTR_2\), \(TTR_8\) and \(TTR_{10}\) are in the fourth cluster. The total procedure of this algorithm is given in Fig. 9.
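The incremental procedure above can be sketched compactly. The code below is our interpretation under stated assumptions: new centers are seeded greedily from the TTR with the largest inner-class distance, the distance is the correlation distance \(dis = 1 - PCC\), and the stopping test mirrors the ratio check with only the first supplementary \(\max\) increment condition modeled; it is a sketch, not the authors' implementation.

```python
import numpy as np

def corr_dist(a, b):
    """Correlation distance dis = 1 - PCC between two curves."""
    return 1.0 - np.corrcoef(a, b)[0, 1]

def adaptive_cluster_count(ttrs, ratio_thresh=2.0, max_gap=0.05):
    """Adaptively choose the cluster number L for a set of TTRs.

    At each L: assign every TTR to its nearest center, record sum^L (total
    inner-class distance) and max^L (largest distance between center points),
    then seed the next center from the TTR farthest from its center. Stop
    when the ratio of successive decreases of sum^L exceeds `ratio_thresh`
    while max^L grew by less than `max_gap`, keeping the earlier L.
    """
    n = ttrs.shape[0]
    centers = [0]                       # first center: TTR_1
    sums, maxes = [], []
    for L in range(1, n + 1):
        d = np.array([[corr_dist(ttrs[z], ttrs[c]) for c in centers]
                      for z in range(n)])
        inner = d.min(axis=1)           # distance to the nearest center
        sums.append(inner.sum())
        maxes.append(max(corr_dist(ttrs[a], ttrs[b])
                         for a in centers for b in centers))
        if L >= 4:
            num = abs(abs(sums[-4] - sums[-3]) - abs(sums[-3] - sums[-2]))
            den = abs(abs(sums[-3] - sums[-2]) - abs(sums[-2] - sums[-1]))
            if den > 0 and num / den > ratio_thresh \
                    and maxes[-2] - maxes[-3] < max_gap:
                return L - 2            # cluster number before the jump
        centers.append(int(inner.argmax()))
    return n
```

On data with a few clearly distinct trend families, the total inner-class distance collapses once each family has its own center, so the ratio test fires and the earlier cluster number is kept.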

Fig. 7
figure 7

The sum of inner-class distance with different cluster numbers of 13 TTRs

Fig. 8
figure 8

The maximum-between-class distance with different cluster numbers of 13 TTRs

Fig. 9
figure 9

The total procedure of this algorithm in Step 6

Next, the four typical TTRs with the largest sums of between-class distances are selected as the final typical TTRs of the above 4 clusters. The specific process is as follows: record the final center point \(\mathop {Cen}\nolimits ^m\) of each cluster when \(L=4\). In the first cluster, \(TTR_1\), as the only element, is chosen as the typical TTR of the first cluster. In the second cluster, the between-cluster correlation distances of \(TTR_4\) to the other clusters' center points are \(\mathop {MPCC}\nolimits _{\mathop 1\nolimits ^2 }^1=1.3894\), \(\mathop {MPCC}\nolimits _{\mathop 1\nolimits ^2 }^3=0.4012\) and \(\mathop {MPCC}\nolimits _{\mathop 1\nolimits ^2 }^4=0.9059\), respectively. Then, \(\mathop {MP}\nolimits _{\mathop 1\nolimits ^2 } = \sum \nolimits _{m = 1,3,4} {\mathop {MPCC}\nolimits _{\mathop 1\nolimits ^2 }^m }=2.6965\) (for \(TTR_4\)). The between-cluster correlation distances of \(TTR_6\) are \(\mathop {MPCC}\nolimits _{\mathop 2\nolimits ^2 }^1 = 1.2915\), \(\mathop {MPCC}\nolimits _{\mathop 2\nolimits ^2 }^3 = 0.2561\) and \(\mathop {MPCC}\nolimits _{\mathop 2\nolimits ^2 }^4 = 0.7427\), respectively. Then, \(\mathop {MP}\nolimits _{\mathop 2\nolimits ^2 } = \sum \nolimits _{i = 1,3,4} {\mathop {MPCC}\nolimits _{\mathop 2\nolimits ^2 }^i } =2.2903\) (for \(TTR_6\)). The between-cluster correlation distances of \(TTR_{12}\) are \(\mathop {MPCC}\nolimits _{\mathop 3\nolimits ^2 }^1 = 1.2270\), \(\mathop {MPCC}\nolimits _{\mathop 3\nolimits ^2 }^3 =0.1982\) and \(\mathop {MPCC}\nolimits _{\mathop 3\nolimits ^2 }^4=0.6610\). Hence, \(\mathop {MP}\nolimits _{\mathop 3\nolimits ^2 } = \sum \nolimits _{i = 1,3,4} {\mathop {MPCC}\nolimits _{\mathop 3\nolimits ^2 }^i }=2.0862\) (for \(TTR_{12}\)). Thus, \(\mathop {RE}\nolimits ^2=\mathop {\max }\nolimits _{j = 1,2,3} \mathop {MP}\nolimits _{\mathop j\nolimits ^2 } =2.6965\); that is, \(TTR_4\) is selected as the typical TTR of the second cluster. \(\mathop {MPCC}\nolimits _{\mathop j\nolimits ^t }^m\) and \(\mathop {MP}\nolimits _{\mathop j\nolimits ^t }\) for the third and fourth clusters are listed in Tables 1 and 2. In the third cluster, \(\mathop {RE}\nolimits ^3 = \mathop {\max }\nolimits _{j = 1,2,3,4,5} \mathop {MP}\nolimits _{\mathop j\nolimits ^3 } =\mathop {MP}\nolimits _{\mathop 2\nolimits ^3 }=1.3486\); hence, \(TTR_5\) is selected as the typical TTR of the third cluster. In the fourth cluster, \(\mathop {RE}\nolimits ^4 = \mathop {\max }\limits _{j = 1,2,3} \mathop {MP}\nolimits _{\mathop j\nolimits ^4 } =\mathop {MP}\nolimits _{\mathop 2\nolimits ^4 }=1.3103\); hence, \(TTR_8\) is chosen as the typical TTR of the fourth cluster.
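The typical-TTR rule just applied can be written as a short routine. This is an illustrative sketch: the data layout (a label array and one center curve per cluster) and the function name are our assumptions.

```python
import numpy as np

def pick_typical_ttrs(ttrs, labels, centers):
    """Pick one typical TTR per cluster by the largest between-class distance.

    For every member j of a cluster, MP_j sums the correlation distances
    between TTR_j and the center points of all OTHER clusters; the member
    with the largest MP_j becomes the cluster's typical TTR (a cluster with
    a single member keeps that member, as in the first cluster above).
    """
    def corr_dist(a, b):
        return 1.0 - np.corrcoef(a, b)[0, 1]

    typical = {}
    for c in np.unique(labels):
        members = np.where(labels == c)[0]
        others = [m for m in range(len(centers)) if m != c]
        # MP_j = sum over other clusters m of dis(TTR_j, Cen^m)
        mp = [sum(corr_dist(ttrs[j], centers[m]) for m in others)
              for j in members]
        typical[int(c)] = int(members[int(np.argmax(mp))])
    return typical
```

The member most dissimilar to the other clusters' centers is thus preferred, which matches the "largest sum of between-class distances" criterion.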

Table 1 The sum of between-class correlation distance of center point with each TTR in the third cluster
Table 2 The sum of between-class correlation distance of center point with each TTR in the fourth cluster
Fig. 10
figure 10

a The corresponding image of \(TTR_1\) by using the proposed algorithm; b \(TTR_1\); c the corresponding image of \(TTR_8\) by using the proposed algorithm; d \(TTR_8\); e the corresponding image of \(TTR_5\) by using the proposed algorithm; f \(TTR_5\); g the corresponding image of \(TTR_4\) by using the proposed algorithm; h \(TTR_4\)

Fig. 11
figure 11

a \(ICA_1\); b the corresponding mixing vector 1; c \(ICA_2\); d the corresponding mixing vector 2; e \(ICA_3\); f the corresponding mixing vector 3; g \(ICA_4\); h the corresponding mixing vector 4

Fig. 12
figure 12

a The normalized mixing vector 1 and \(TTR_1\); b the normalized mixing vector 2 and \(TTR_8\); c the normalized mixing vector 3 and \(TTR_5\); d the normalized mixing vector 4 and \(TTR_4\)

The discriminative features can be extracted from \(TTR_1\), \(TTR_8\), \(TTR_5\) and \(TTR_4\) by Step 8. The extraction results of the proposed algorithm and of the ICA method are exhibited in Figs. 10, 11 and 12, respectively. The first column of Fig. 10 depicts the four typical areas highlighted by the proposed algorithm, and the second column shows the features extracted by the proposed method. Figure 11 plots the defect detection under ICA (the four ICs separately highlight the four typical areas, as shown in Fig. 11). The PCCs between the mixing vectors 1, 2, 3, 4 and \(TTR_1\), \(TTR_8\), \(TTR_5\), \(TTR_4\) are 0.9990, 0.9965, 0.9816 and 0.9917, respectively. Meanwhile, Fig. 12 shows that the features extracted by the proposed method are similar to the mixing vectors 1, 2, 3, 4 under ICA. Hence, the proposed algorithm can select the typical thermal responses and extract discriminative features successfully. Moreover, the processing efficiency of the proposed method is substantially higher than that of ICA. Figure 13 compares the processing times of ICA and the proposed method; the proposed algorithm clearly spends less time completing the feature extraction process.

Figure 13 also shows the relation curves between the processing time and the number of sequence frames, from which the processing efficiency of the two algorithms can be compared. Figure 13a shows the processing times of ICA, which are 1.02, 1.39, 1.83, 2.45 and 3.15 s for 200, 300, 400, 500 and 600 frames, respectively. Obviously, the total processing time of ICA increases rapidly with the number of image frames. Figure 13b shows the processing time of the proposed algorithm: in contrast to ICA, it consumes only 0.50, 0.63, 0.68, 0.72 and 0.79 s for 200, 300, 400, 500 and 600 frames, respectively. The ratios of the ICA processing time to that of the proposed algorithm are drawn in Fig. 13c and are 2.04, 2.21, 2.69, 3.40 and 3.99, respectively. Hence, the proposed algorithm is more efficient than ICA, and its advantage grows with the data volume.

Fig. 13
figure 13

a The processing time of ICA of the sample 1; b the processing time of the proposed algorithm of the sample 1; c the ratios of time under ICA and the proposed algorithm

Example 2

For sample 2, the time range of the thermal image sequence is 2 s. Two time thresholds are set as \(T(1)=0.75\) s and \(T(2)=1.5\) s. The transient thermal responses are separated into three parts. Let \(\mathop {REFR}\nolimits ^1=0.96\), \(\mathop {REFR}\nolimits ^2=0.96\), \(\mathop {REFR}\nolimits ^3=0.96\) and \(CL=10\). Seven thermal responses are selected by Step 3 with \(CC=0.7\).

Next, the 7 TTRs should be classified by Step 6. Firstly, set the cluster number \(L=1\). The first initial center point \(\mathop {IP}\nolimits ^1\) is \(TTR_1\). Compute \(\mathop {sum}\nolimits ^1=0.8374\) and \(\mathop {\max }\nolimits ^1=0\). Secondly, set \(L=2\). The second initial center point \(\mathop {IP}\nolimits ^2\) is \(TTR_4\), which has the largest inner-class distance. Let \(\mathop {Cen}\nolimits _1^1 \mathop { = IP}\nolimits ^1\) and \(\mathop {Cen}\nolimits _1^2 \mathop { = IP}\nolimits ^2\). Compute \(\mathop D\nolimits _z^m = \mathop {\min }\nolimits _{i = 1,2} dis(X(:,z),\mathop {Cen}\nolimits _k^i )\), \(z=1,2,\ldots ,7\). According to \(\mathop D\nolimits _z^m\), put \(TTR_z\) into the \(m^{th}\) cluster, whose center point \(\mathop {Cen}\nolimits _1^m\) has the smallest distance to X( : , z). The 7 TTRs are thus classified into 2 clusters. Record \(\mathop {sum}\nolimits ^2=0.0626\) and \(\mathop {\max }\nolimits ^2=0.4621\). Thirdly, set \(L=3\); the third initial center point \(\mathop {IP}\nolimits ^3\) is \(TTR_7\), which has the largest inner-class distance. Then, let \(\mathop {Cen}\nolimits _1^1 \mathop { = IP}\nolimits ^1\), \(\mathop {Cen}\nolimits _1^2 \mathop { = IP}\nolimits ^2\) and \(\mathop {Cen}\nolimits _1^3 \mathop { = IP}\nolimits ^3\). According to \(\mathop D\nolimits _z^m\), the 7 TTRs are classified into 3 clusters. Record \(\mathop {sum}\nolimits ^3=0.0340\) and \(\mathop {\max }\nolimits ^3=0.4751\). Fourthly, set \(L=4\). The fourth initial center point \(\mathop {IP}\nolimits ^4\) is \(TTR_2\), which has the largest inner-class distance. Moreover, let \(\mathop {Cen}\nolimits _1^1 \mathop { = IP}\nolimits ^1\), \(\mathop {Cen}\nolimits _1^2 \mathop { = IP}\nolimits ^2\), \(\mathop {Cen}\nolimits _1^3 \mathop { = IP}\nolimits ^3\) and \(\mathop {Cen}\nolimits _1^4 \mathop { = IP}\nolimits ^4\). According to \(\mathop D\nolimits _z^m\), the 7 TTRs are classified into 4 clusters. Record \(\mathop {sum}\nolimits ^4=0.0118\) and \(\mathop {\max }\nolimits ^4=0.4751\). At this time, \(\displaystyle \left| \frac{{|\mathop {sum}\nolimits ^1 - \mathop {sum}\nolimits ^2 | - |\mathop {sum}\nolimits ^2 - \mathop {sum}\nolimits ^3 |}}{{|\mathop {sum}\nolimits ^2 - \mathop {sum}\nolimits ^3 | - |\mathop {sum}\nolimits ^3 - \mathop {sum}\nolimits ^4 |}}\right| =116.59>2\) (\(i=2\)), and the supplementary condition \(\mathop {\max }\nolimits ^3-\mathop {\max }\nolimits ^2=0.013<0.05\) holds. The variation trends of \(sum^i\) and \(\mathop {\max }\nolimits ^i\) with different cluster numbers L are shown in Figs. 14 and 15, respectively. Hence, one can obtain \(L=2\); that is, the 7 TTRs should be classified into 2 clusters. Furthermore, one can derive that \(TTR_1\), \(TTR_3\), \(TTR_5\) and \(TTR_7\) are in the first cluster, while \(TTR_2\), \(TTR_4\) and \(TTR_6\) are in the second cluster.

Fig. 14
figure 14

The sum of inner-class distance with different cluster numbers of 7 TTRs

Fig. 15
figure 15

The maximum-between-class distance with different cluster numbers of 7 TTRs

\(\mathop {MPCC}\nolimits _{\mathop j\nolimits ^t }^m\) and \(\mathop {MP}\nolimits _{\mathop j\nolimits ^t }\) for the first and second clusters are listed in Tables 3 and 4. According to the criterion of the largest between-class distance, \(\mathop {RE}\nolimits ^1 = \mathop {\max }\nolimits _{j = 1,2,3,4} \mathop {MP}\nolimits _{\mathop j\nolimits ^1 } =\mathop {MP}\nolimits _{\mathop 3\nolimits ^1 }=0.4621\) and \(\mathop {RE}\nolimits ^2 = \mathop {\max }\nolimits _{j = 1,2,3} \mathop {MP}\nolimits _{\mathop j\nolimits ^2 } =\mathop {MP}\nolimits _{\mathop 2\nolimits ^2 }=0.4294\). Hence, \(TTR_5\) and \(TTR_4\) are selected as the typical TTRs of the first and second clusters, respectively.

Table 3 The sum of between-class correlation distance of each center point with each TTR in the first cluster
Table 4 The sum of between-class correlation distance of each center point with each TTR in the second cluster

\(TTR_5\) and \(TTR_4\) are used to extract the discriminative features by Step 8. The first column of Fig. 16 shows the two typical areas highlighted by the proposed algorithm, and the extraction results are depicted in the second column of Fig. 16. The result of the proposed algorithm is similar to that of ICA, which is displayed in Fig. 17. Moreover, the PCC between the mixing vector 1 and \(TTR_5\) is 0.9901, and that between the mixing vector 2 and \(TTR_4\) is 0.9931, as shown in Fig. 18. The experimental results reveal that the proposed algorithm can select the typical thermal responses and extract discriminative features successfully.

Fig. 16
figure 16

a The corresponding image of \(TTR_5\) by using the proposed algorithm; b \(TTR_5\); c the corresponding image of \(TTR_4\) by using the proposed algorithm; d \(TTR_4\)

Fig. 17
figure 17

a \(ICA_1\); b the corresponding mixing vector 1; c \(ICA_2\); d the corresponding mixing vector 2

Fig. 18
figure 18

a The normalized mixing vector 1 and \(TTR_5\); b the normalized mixing vector 2 and \(TTR_4\)

Fig. 19
figure 19

a The processing time of ICA of sample 2; b the processing time of the proposed algorithm of sample 2; c the ratios of time under ICA and the proposed algorithm

Figure 19a shows the processing times of ICA, which are 1.47, 1.65, 1.91 and 2.20 s for 175, 200, 250 and 305 frames, respectively. Meanwhile, the processing times of the proposed algorithm are 0.34, 0.43, 0.47 and 0.54 s for the same numbers of frames, as shown in Fig. 19b. As shown in Fig. 19c, the proposed algorithm consistently requires less processing time than ICA for sample 2.

The experimental results for sample 1 and sample 2 confirm the high efficiency of the proposed method in extracting the discriminative information of thermal image sequences.

6 Conclusions and future work

In this research, an adaptive feature extraction algorithm is developed for defect identification in eddy current pulsed thermography, which utilizes both the K-means algorithm and an automatic segmentation method to realize thermal image segmentation and variable-interval search. The experimental results confirm the validity and efficiency of the proposed method. The main advantages of this new approach can be summarized as follows:

  1.

    Combining the exact physical meaning of the feature regions in ECPT with the analogous mathematical expression in ICA, the proposed algorithm provides good identification capability and detection quality for extracting the discriminative features of thermal image sequences.

  2.

    Without the whitening pre-processing of the original data and the iterative computation of the de-mixing matrix required by ICA, the image-processing efficiency of the proposed approach is improved appreciably. Especially when the number of frames in the thermal image sequence increases, the superiority of the new approach over ICA becomes more significant.

  3.

    The number of typical feature areas (typical TTRs) is set automatically in the proposed algorithm, which avoids the negative influence of human intervention and increases the automation level of defect identification in ECPT.

In the future, research efforts will focus on further enhancing the measurement accuracy and processing speed of the proposed method. Considering the effect of threshold selection on the precision of defect detection in practical applications, diminishing the negative influence of the threshold value will be another important research topic.