1 Introduction  

The laser beam powder bed fusion (L-PBF) additive manufacturing (AM) process uses a laser beam as the energy source to melt and fuse powder particles, layer upon layer, into a pre-designed shape in an enclosed chamber filled with inert gas [1,2,3,4,5]. Volumetric defects generated in the L-PBF process can severely impair the reliability and durability of L-PBF-manufactured parts, particularly in fatigue-critical applications [6]. X-ray computed tomography (XCT) scanning is commonly used for non-destructive defect inspection: it measures the differences in X-ray absorption through the L-PBF parts and reconstructs three-dimensional (3D) models of the parts that reveal the internal volumetric defects [7,8,9,10]. Because low-resolution XCT (LR-XCT) is efficient and inexpensive, we focus on developing a data-driven framework that augments LR-XCT to achieve accurate defect inspection and classification, promoting the non-destructive inspection of L-PBF parts and paving the way to understanding the impact of defects on their performance [11,12,13].

Due to different fabrication conditions, there are three major types of defects occurring in the L-PBF process: keyholes (KHs), lack of fusions (LoFs), and gas-entrapped pores (GEPs) [13,14,15,16,17,18,19]. They can initiate cracks, leading to impaired mechanical properties and reduced fatigue lives [20,21,22,23]. To identify them, high-resolution XCT (HR-XCT) scanning can provide precise feature values for the defects [9, 16, 24], resulting in a high defect classification accuracy, yet it can be prohibitively costly and/or time-consuming. For instance, in the authors’ previous work [25], defect features extracted from HR-XCT scans (voxel size: 1 μm) led to a high defect classification accuracy (above 98%) at the cost of long scanning time (up to 12 h for each scan). Meanwhile, the availability of HR-XCT scanning for large-sized parts is limited due to its small scanned volume [8, 26]. On the other hand, LR-XCT scanning can be significantly less expensive (e.g., in our case, an LR-XCT scanning takes only 25% or less of the scanning time of an HR-XCT one); however, it typically captures much fewer features. Hence, in this research effort, we consider the question of how accurate defect classification can be performed with LR-XCT.

It is broadly agreed that KHs are large, near-spherical (more precisely, keyhole-shaped) defects that occur when excessive energy input vaporizes the powder near the bottom of the melt pool [13, 17, 27, 28]. LoFs are irregularly shaped, elongated, crack-like defects that occur when insufficient energy input fails to fully melt the powder between adjacent scan tracks of the laser beam or between layers [13, 17, 18, 29,30,31]. The occurrence of either KHs or LoFs therefore indicates inappropriate energy input and points to the corresponding correction of the L-PBF process. For instance, KHs can be prevented by reducing the energy input [13, 15, 25, 32], while LoFs can be mitigated or prevented by increasing the energy input [13, 15, 16, 25]. GEPs are the smallest and most spherical defects among all three types [13, 15], caused by a small amount of gas (either originally present in the powder or generated during printing) trapped in the parts. GEPs may not be completely prevented even with appropriate energy input [15]; they can be mitigated by reducing the gas entrapment in the powder [33, 34] and optimizing the inert gas flow velocity [35].

Moreover, classifying the defects in L-PBF parts can also assist in a better understanding of their structural integrity [11, 12, 36]. Among the three types of defects, LoFs were found to be the most detrimental to L-PBF parts because the sharp edges of these irregularly shaped, large-sized defects can induce high stress concentrations in tensile tests or under cyclic loading [23, 29, 37]. For this reason, L-PBF fabricated Ti-6Al-4V (Ti64) parts with large-sized and non-spherical LoFs exhibited lower ductility (3–10%) than those with other types of defects (10–15%) [38]. Stainless steel 316L parts with irregularly shaped LoFs had, on average, a 20% lower fatigue limit and a 12% lower fatigue life than parts with small, spherical GEPs [39, 40]. In contrast, GEPs were found to be the least detrimental defects owing to their small sizes and spherical shapes [37]. It is reported that GEPs were harmless to the tensile, fatigue, and hardness properties of L-PBF fabricated Ti64 parts when present in amounts up to 1 volume percent [31].

A practical approach for defect classification is to utilize the sizes and morphologies of different types of defects obtained by XCT scanning. Some 3D features, such as dimensions, volume, and surface area of the defects, can be directly measured from the XCT scans to quantify the defects’ sizes. Furthermore, these 3D features can be used to derive other features that describe the defects’ morphologies. The morphological features (as a combination of some directly measured and derived features) extracted from the XCT scans can effectively assist in distinguishing different defect types.

We propose a data-driven framework to augment LR-XCT with machine learning (ML) to classify the defects in L-PBF parts with high efficiency and accuracy based on the morphological features of defects extracted from XCT scans. Our proposed framework incorporates (1) morphological feature extraction from XCT scans using computer vision–based feature derivation, (2) morphological feature augmentation by regression-based feature augmentation models to enhance LR scans, and (3) defect classification through ML-based classifiers. We posit that, with appropriately trained augmentation models, it may be possible to use LR-XCT scanning, specifically for larger and more influential defects, to replace the time-consuming HR-XCT scanning without significantly reducing classification accuracy (assuming some HR-XCT scans are also available during training).

The rest of the paper is organized as follows. A review of the literature on defect classification methods, defect inspection by XCT scanning, and applications of ML in AM processes is presented in Sect. 2. Then, the proposed framework for defect classification is presented in Sect. 3, followed by a case study using defects in fabricated L-PBF parts to validate the proposed framework in Sect. 4. Finally, Sect. 5 provides conclusions and a discussion of future work and study limitations.

2 Literature review

The review of relevant literature is organized into three parts: (1) commonly used methods in defect classification, mainly by process map and defect length; (2) advantages and usage of XCT scanning in defect inspection; and (3) a summary of ML applications in AM processes. Based on the reviewed literature, two primary research gaps in developing an efficient and accurate defect classification for the L-PBF process are identified for this study.

2.1 Defect classification by process map and defect length

The process map is a commonly used method to classify different types of defects based on the effect of process parameters in the L-PBF process [14, 15, 41,42,43]. As mentioned above, the generation of KHs and LoFs is significantly affected by the energy input, which in turn is controlled by four main process parameters of the L-PBF process. A volumetric energy density \({E}_{V}\) (J/mm3) [13, 43, 44] is defined to quantify the magnitude of the energy input as a function of these four process parameters:

$${E}_{V}=\frac{p}{v\cdot h\cdot t}$$
(1)

where \(p\) is the laser power (W), \(v\) is the scanning velocity (mm/s), \(h\) is the hatch distancing (mm), and \(t\) is the layer thickness (mm). Excessive energy input leading to KHs can be caused by high laser power, low scanning velocity, small hatch distancing, or small layer thickness. Conversely, insufficient energy input leading to LoFs can be caused by low laser power, high scanning velocity, large hatch distancing, or large layer thickness. For example, a combination of recommended laser power (280 W) and low scanning velocity (400 mm/s) gave rise to large (max equivalent diameter of 133 μm) KHs in L-PBF fabricated Ti64 parts [15, 16]. On the other hand, increasing the hatch distancing from 60 to 140 μm led to more irregularly shaped (volume percentage of irregularly shaped defects increasing from 0.005 to 0.015) and larger (max equivalent diameter increasing from 36 to 54 μm) LoFs in the Ti64 parts [15, 16]. Reducing the laser power from 380 to 200 W at a constant scanning velocity of 300 mm/s produced more non-spherical LoFs (average sphericity values decreasing from 0.61 to 0.56) in 316L stainless steel parts [45].
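As a minimal worked example of Eq. (1), the sketch below evaluates the volumetric energy density for the recommended Ti64 parameters quoted in Sect. 4.1 (280 W, 1300 mm/s, 120 µm hatch distance, 40 µm layer thickness). The function name and argument names are illustrative choices, not part of the original study.

```python
def volumetric_energy_density(power_w, velocity_mm_s, hatch_mm, layer_mm):
    """Eq. (1): E_V = p / (v * h * t), returned in J/mm^3."""
    return power_w / (velocity_mm_s * hatch_mm * layer_mm)


# Recommended parameters from Sect. 4.1, with micrometres converted to millimetres.
e_v = volumetric_energy_density(280.0, 1300.0, 0.120, 0.040)
print(round(e_v, 2))  # 44.87, matching the value quoted in the Table 1 caption
```

Lowering the scanning velocity or the hatch distancing in this call raises \({E}_{V}\), which is the direction associated with KHs in the text; raising them lowers \({E}_{V}\), the direction associated with LoFs.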

Based on the relationship between defect types and process parameters, Zhu et al. [14] developed process maps to fabricate a nearly full-density (relative density above 99%) L-PBF nitinol part by identifying proper combinations of the four process parameters. Gordon et al. [15] generated a process map to identify the boundaries of KHs and LoFs in the laser power–scanning velocity (p–v) space of L-PBF Ti64 parts. Tapia et al. [41] and Meng et al. [42] distinguished keyhole-mode and conduction-mode regions in the p–v space of L-PBF parts fabricated with stainless steel 316L to identify KHs. However, with the process map approach, all the defects in one L-PBF part are classified by the process parameters instead of being individually inspected, which can lead to a high misclassification rate, especially for GEPs, which can co-occur with KHs and LoFs.

Other studies [15, 44, 46, 47] classified different types of defects primarily depending on defect length. Gordon et al. [15] stated that the lengths of KHs and LoFs were larger than 40 μm, while the lengths of GEPs were smaller than 20 μm. To further distinguish KHs and LoFs, they stated that KHs were spherical and LoFs were non-spherical. Their findings were based on the L-PBF Ti64 parts fabricated with laser power varying from 100 to 370 W and scanning velocity from 400 to 1500 mm/s. Kasperovich et al. [44] summarized that the lengths of LoFs ranged from 10 to more than 200 μm, and the lengths of KHs were larger than 100 μm in the L-PBF Ti64 parts fabricated with laser power varying from 100 to 200 W and scanning velocity varying from 200 to 1100 mm/s. Zhang et al. [46] observed that both LoFs and KHs had lengths ranging from 10 to more than 100 μm, and GEPs had lengths shorter than 10 μm from the defects in the L-PBF parts fabricated with stainless steel 316L. Snell et al. [13] concluded that the lengths of LoFs are larger than 31 μm, and the lengths of KHs are roughly two times larger than LoFs to distinguish the LoFs and KHs in the L-PBF parts fabricated with Inconel 718. However, the inconsistency of the defect lengths used for defect classification in various studies due to different materials and process parameters might lead to a discrepancy in the results. Therefore, it is beneficial to include more features (e.g., morphological features) of defects for more consistent defect classification. Furthermore, it would be useful to have a technique that does not depend on fixed thresholds for classification.
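To illustrate what a fixed-threshold classifier of this kind looks like, the sketch below encodes the length limits reported by Gordon et al. [15] (GEPs shorter than 20 μm; KHs and LoFs longer than 40 μm, separated by sphericity). The sphericity cut-off `sphericity_cut` and its default value are hypothetical, since the text gives no numerical value for it, and the function name is likewise an illustrative choice.

```python
def classify_by_length(length_um, sphericity, sphericity_cut=0.7):
    """Rule-based labelling using the length limits attributed to [15].

    sphericity_cut is a hypothetical threshold; the source only states that
    KHs are spherical and LoFs are non-spherical.
    """
    if length_um < 20.0:
        return "GEP"
    if length_um > 40.0:
        return "KH" if sphericity >= sphericity_cut else "LoF"
    # Lengths between 20 and 40 um are not covered by the reported limits.
    return "unclassified"


print(classify_by_length(10.0, 0.95))   # GEP
print(classify_by_length(120.0, 0.40))  # LoF
print(classify_by_length(30.0, 0.80))   # unclassified
```

The uncovered 20 to 40 μm band and the need to pick a sphericity cut-off by hand are concrete instances of the limitation discussed above: the thresholds shift with material and process parameters, which motivates a classifier that learns from several morphological features instead.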

2.2 Defect inspection by XCT scanning

Using XCT scanning for defect inspection has four advantages over conventional cross-sectioning (e.g., scanning electron microscopy): (1) keeping the L-PBF parts intact for future processes (e.g., heat treatments, shot peening, fatigue testing) [7, 9]; (2) eliminating the part preparation procedures (e.g., grinding and polishing, which may change the morphologies and sizes of defects owing to metal smearing) [24]; (3) examining the entire 3D volume of the L-PBF parts (compared to fractions of the parts with 2D planes); and (4) describing 3D features of the defects (e.g., spatial distribution, volume).

Given the advantages of XCT scanning, many studies [24, 48, 49] used it to inspect the defects in L-PBF parts. Maskery et al. [24] utilized XCT to obtain the morphologies and sizes of defects in the L-PBF AlSi10Mg parts and refined the process parameters to mitigate the defects with large volumes and irregular shapes. du Plessis et al. [49] utilized XCT to investigate the effect of hot isostatic pressing (HIP) on L-PBF parts non-destructively. They observed that the HIP reduced the average volumes and lengths of the defects in the Ti64 parts. In another study to investigate the effects of shot peening (SP), Damon et al. [48] used XCT to obtain the spatial distribution and volumes of the defects in the L-PBF AlSi10Mg parts before and after SP and concluded that most of the near-surface defects were healed with their volumes decreased. Other studies [16, 23, 47, 50] obtained the volumes and positions of all defects in the entire L-PBF parts through XCT scanning and used them as ground truth to verify their respective defect prediction models. Generally, the HR-XCT scanning with a small voxel size (0.65 ~ 2.1 μm) was used to obtain precise morphologies and sizes of the defects [15, 16, 24, 51, 52]. However, the HR-XCT needs a long scanning time and may only allow for a relatively small scanned volume. For instance, in [24], it took approximately 32 h for an HR-XCT to finish scanning a region of 125 mm3. As a result, HR-XCT could be prohibitively costly and inefficient for analyzing defects in many practical applications.

2.3 ML applications for defect analysis

Several pioneering studies have applied ML to model the relationship between defect features and types of defects in the L-PBF parts for defect classification. Snell et al. [13] used k-means clustering models to classify large amounts (over 20,000) of defects into KHs, LoFs, and GEPs based on three 2D morphological features (i.e., length, sphericity, and aspect ratio). However, these three morphological features cannot fully distinguish all the defects with unsupervised learning; roughly half of the defects were unable to be classified into any defect types. Poudel et al. [25] applied an artificial neural network (ANN) model to establish the relationships between the defect types and some 3D morphological features (e.g., elongation, aspect ratio, sphericity) extracted from the HR-XCT scans for defect classification. Their ANN model achieved an overall accuracy above 98% to classify approximately 2000 defects into their types. Cui et al. [53] trained convolutional neural network models to classify defects by 2D image features (e.g., edges, shapes) of the defects directly and achieved an accuracy of 92.3% from 4140 images.

Other studies utilized ML to predict occurrences of defects from the in situ monitoring data. Bartlett et al. [54] trained naïve-Bayes classifiers to identify the nonoptimal energy input–induced KHs or LoFs by the irregular surface topology of each powder layer detected by an optical camera and achieved an average accuracy of 72%. Khanzadeh et al. [50] applied a self-organizing map to distinguish the defects in the L-PBF process by their abnormal melt pool signatures, and their model achieved an accuracy of 63% in predicting the occurrences and positions of defects.

Besides, many studies [55,56,57,58,59,60] used ML to predict porosity (the ratio between the total volume of all defects and the volume of an L-PBF part [61]), primarily depending on the process parameters. For instance, Read et al. [56] used a polynomial regression model to build the relationship between the porosity and process parameters (i.e., laser power, scanning velocity, and hatch distancing) and fabricated low-porosity (0.29%) L-PBF parts with the optimal process parameters found by their model. Tapia et al. [57] built a Gaussian process regression (GPR) model to predict part porosity based on laser power and scanning velocity and achieved a low mean absolute square error (below 20%) between the predicted porosity values and actual observations. Ye et al. [59] conducted an iterative Bayesian optimization established on a GPR model to search for the optimal process parameters in the p–v space and reduced the porosity of L-PBF parts by 0.6%. Liu et al. [60] developed a Gaussian process–based layer-wise porosity modeling to quantify the spatial distribution of porosity in previous layers and predict the positions, sizes, and numbers of the pores in consecutive layers. They achieved an F-score of 0.86 to identify the porosity in 30 consecutive layers based on the 6 previous ones. More ML applications to predict or mitigate the porosity of L-PBF parts were reviewed in [62].

2.4 Research gaps

Based on the reviewed literature, two primary research gaps in the defect classification of the L-PBF process can be identified. First, commonly used HR-XCT scanning is prohibitively costly and inefficient for many practical applications. Second, only a few studies take full advantage of the characteristics of defects (i.e., distinct morphologies and sizes of different types of defects) for defect classification. We aim to bridge these research gaps by proposing a data-driven framework that utilizes the morphologies and sizes of defects obtained by the XCT scanning to develop a general defect classification framework for the L-PBF process. The proposed framework uses time-efficient LR-XCT and leverages ML to augment morphological features to achieve defect classification with improved accuracy and efficiency.

3 Proposed methodology

The overall structure of the proposed framework is depicted in Fig. 1. It consists of three key elements: morphological features extraction from XCT scans, morphological features augmentation, and ML-driven defect classification, and comprises the following four steps.

Fig. 1 The overall structure of the proposed framework, consisting of morphological features extraction, morphological features augmentation, and ML models to classify the types (i.e., KH, LoF, and GEP) of the defects in the L-PBF parts from XCT scans

Step 1 (Sect. 3.1): Morphological features describing the morphologies and sizes of the defects are extracted and derived from the HR and LR-XCT scans of the same L-PBF parts.

Step 2 (Sect. 3.2.1): An algorithmic defect matching model is developed to correlate the HR-XCT and LR-XCT morphological features of the same defects in HR and LR-XCT scans, enabling feature augmentation in Step 3.

Step 3 (Sect. 3.2.2): Regression-based features augmentation models are built to improve the LR-XCT morphological features based on the corresponding HR-XCT morphological features.

Step 4 (Sect. 3.3): ML models are employed to classify the defects into their types (i.e., KH, LoF, and GEP) using the augmented LR-XCT morphological features obtained in Step 3.

3.1 Morphological features extraction

Figure 2 depicts typical KHs, LoFs, and GEPs observed in our experiments (see design details in Sect. 4). As discussed in Sect. 1, the three defect types exhibit distinct morphologies and sizes. KHs and GEPs are relatively spherical and regular, while KHs have larger sizes than GEPs; LoFs are elongated and irregularly shaped.

Fig. 2 Typical GEPs, LoFs, and KHs, showing the distinct morphology and size of each defect type

To distinguish different types of defects, we employed nine morphological features: solidity, sparseness, extent, sphericity, roundness, aspect ratio, elongation, flatness, and major axis [13, 24, 63,64,65]. Definitions of the features are given in Fig. 3. Here, the convex hull refers to the smallest convex polyhedron that contains the defect; the fit ellipsoid is the ellipsoid with the same normalized second central moments as the defect; and the bounding box is the smallest right rectangular prism that fully contains the defect. Solidity, sparseness, and extent measure the irregularity of the defect by comparing the volumes of the convex hull, fit ellipsoid, and bounding box, respectively, to the volume of the defect [63, 65, 66]. The aspect ratio, elongation, and flatness each measure the difference between two of the three axes of the fit ellipsoid around a defect [13, 24, 63,64,65]. The roundness and sphericity measure how closely a defect resembles a perfect sphere [13, 65, 67]: the former uses the ratio of the equivalent diameter to the major axis of the defect, while the latter relates the volume to the surface area. Lastly, the major axis, which quantifies the length of a defect, is also included in the morphological features to distinguish the sizes of different defects.

Fig. 3 Illustrations of the eight morphological features derived from the directly measured features (i.e., volume, surface area, convex hull volume, bounding box volume, and major, median, and minor axes of the fit ellipsoid)

Based on characteristics and observations in the literature [13, 16,17,18], we surmise that these morphological features can effectively distinguish different types of defects in L-PBF parts. First, the irregularly shaped LoFs are expected to have lower solidity, sparseness, and extent values than KHs and GEPs. Second, the elongated LoFs have larger differences among their three axes and are thus expected to have lower aspect ratio, elongation, and flatness values. Moreover, KHs and GEPs are expected to have higher roundness and sphericity. Lastly, GEPs and KHs can be distinguished by their major axis lengths owing to their difference in size. In total, these nine selected morphological features are extracted from both the HR- and LR-XCT scans.
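The following sketch shows how several of these features could be derived from the directly measured quantities. Because the definitions in Fig. 3 are not reproduced in the text, every formula here is an assumption: solidity and extent are taken as defect volume over convex hull and bounding box volume, sphericity uses the common Wadell form, roundness follows the verbal definition above (equivalent diameter over major axis), and the three axis ratios are assumed to be smaller-over-larger so that elongated defects score lower. Sparseness is omitted because its exact form cannot be inferred from the text.

```python
import math


def morphological_features(volume, surface_area, convex_volume, bbox_volume,
                           major, median, minor):
    """Derive shape descriptors from directly measured quantities.

    All formulas are assumed definitions for illustration only; the paper's
    own definitions are given in its Fig. 3.
    """
    # Diameter of the sphere that has the same volume as the defect.
    equiv_diameter = (6.0 * volume / math.pi) ** (1.0 / 3.0)
    return {
        "solidity": volume / convex_volume,
        "extent": volume / bbox_volume,
        "sphericity": math.pi ** (1.0 / 3.0) * (6.0 * volume) ** (2.0 / 3.0) / surface_area,
        "roundness": equiv_diameter / major,
        "aspect_ratio": minor / major,   # assumed axis pairing
        "elongation": median / major,    # assumed axis pairing
        "flatness": minor / median,      # assumed axis pairing
        "major_axis": major,
    }


# Sanity check on an ideal sphere of radius 10 um: sphericity and roundness are 1.
r = 10.0
sphere = morphological_features(
    volume=4.0 / 3.0 * math.pi * r ** 3,
    surface_area=4.0 * math.pi * r ** 2,
    convex_volume=4.0 / 3.0 * math.pi * r ** 3,
    bbox_volume=(2.0 * r) ** 3,
    major=2.0 * r, median=2.0 * r, minor=2.0 * r,
)
print(sphere)
```

Under these assumed definitions, an irregular, elongated LoF would yield values well below 1 for most descriptors, consistent with the expectations stated above.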

3.2 Morphological feature augmentation

In this section, we describe the proposed approach to augmenting the LR-XCT morphological features to achieve a higher accuracy of defect classification by using the discovered relationship between the HR-XCT and LR-XCT morphological features. We first develop an algorithmic defect matching model to pair the defects in LR-XCT and HR-XCT scans and then use the regression-based features augmentation models to find the relationship between them.

3.2.1 Algorithmic defect matching model

To find the relationship between defect features in LR-XCT and HR-XCT scans, it is necessary to develop an automatic way to match the defects from the respective scans. Figure 4a and b provide a snapshot of HR and LR scans of the same scanned area, with three defects manually matched by an expert based on their positions. Note that this is a non-trivial task owing to the limited accuracy of LR scans, mismatch in the scanned volumes, and the presence of noise.

Fig. 4 Defects in (a) HR- and (b) LR-XCT scans of the same scanned area with a size of 3.14 mm3. The morphologies, sizes, and numbers of the defects change considerably with the reduced resolution of XCT scanning. For instance, defects 1, 2, and 3 show different morphologies and sizes in the HR- and LR-XCT scans

The proposed matching algorithm is based on the following assumptions: (1) the positions of the same defects, as measured by the coordinates of the defect centroids, are similar in the HR- and LR-XCT scans, and (2) the volumes of the same defects in the HR- and LR-XCT scans are similar. Note that neither assumption necessarily holds, since LR-XCT scanning, in particular, can distort the shape of a defect (and hence its centroid position and volume), as seen in Fig. 4.

The following process is then used for matching. For each large defect (major axis length \(\ge\) 20 μm) in the LR-XCT scans used as the target defects (TDs), all the large defects in the HR-XCT scans are considered to be the candidates for matching. The match is then selected as the HR defect that minimizes the following expression, which gives the weighted sum of distance and volume ratio between TD and a candidate:

$$\min \left[\lambda \left(\frac{d-{d}_{min}}{{d}_{max}-{d}_{min}}\right)+\left(1-\lambda \right)\left(\frac{vr-{vr}_{min}}{{vr}_{max}-{vr}_{min}}\right)\right]$$
(2)

where \(d\) is the distance between the evaluated candidate and the TD; \({d}_{min}\) and \({d}_{max}\) are the smallest and largest distances among all the defect pairs to be matched in the HR- and LR-XCT scans; \(vr\) is the LR/HR defect volume ratio; \({vr}_{min}\) and \({vr}_{max}\) are the smallest and largest volume ratios among all the defect pairs to be matched in the HR- and LR-XCT scans; and \(\lambda\) is the weighting parameter that can be selected to prioritize either position proximity (larger \(\lambda\)) or size similarity (smaller \(\lambda\)).
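A minimal sketch of the matching rule in Eq. (2) is given below, using only the Python standard library. One interpretation is assumed and should be read as such: the text defines \(vr\) as the LR/HR volume ratio, whereas the sketch folds the ratio to larger-over-smaller volume (always at least 1) so that a smaller value means more similar volumes, in line with assumption (2). The function name, argument names, and toy data are hypothetical.

```python
import math


def match_defects(lr_centroids, lr_volumes, hr_centroids, hr_volumes, lam=0.5):
    """For each LR target defect, return the index of the HR candidate that
    minimises the weighted, min-max-normalised score of Eq. (2)."""
    # Centroid distance for every target/candidate pair.
    d = [[math.dist(t, c) for c in hr_centroids] for t in lr_centroids]
    # ASSUMPTION: folded volume ratio (>= 1), not the raw LR/HR ratio.
    vr = [[max(vt, vc) / min(vt, vc) for vc in hr_volumes] for vt in lr_volumes]

    # Normalisation bounds are taken over all pairs, as described in the text.
    d_min, d_max = min(map(min, d)), max(map(max, d))
    vr_min, vr_max = min(map(min, vr)), max(map(max, vr))
    d_span = (d_max - d_min) or 1.0    # guard against a zero range
    vr_span = (vr_max - vr_min) or 1.0

    matches = []
    for i in range(len(lr_centroids)):
        scores = [
            lam * (d[i][j] - d_min) / d_span
            + (1.0 - lam) * (vr[i][j] - vr_min) / vr_span
            for j in range(len(hr_centroids))
        ]
        matches.append(min(range(len(scores)), key=scores.__getitem__))
    return matches


# Toy data (hypothetical): two LR target defects, three HR candidates.
lr_c = [(0.0, 0.0, 0.0), (10.0, 10.0, 10.0)]
lr_v = [100.0, 200.0]
hr_c = [(0.5, 0.0, 0.0), (10.0, 10.0, 9.5), (50.0, 50.0, 50.0)]
hr_v = [110.0, 190.0, 100.0]
print(match_defects(lr_c, lr_v, hr_c, hr_v, lam=0.5))  # [0, 1]
```

Setting `lam=1.0` reduces the rule to nearest-centroid matching, and `lam=0.0` to matching on volume similarity alone; intermediate values trade the two off.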

3.2.2 Regression-based features augmentation models

The features augmentation models using both linear and non-linear regression algorithms are trained to find relationships between the LR-XCT and HR-XCT morphological features. We denote the values of nine LR-XCT morphological features of a defect by \({X}_{i}\), \(i=\mathrm{1,2},\dots , 9\). Correspondingly, the HR-XCT morphological feature values of the defect are denoted by \({Y}_{j}\), \(j=\mathrm{1,2},\dots , 9\). The relationships found by the features augmentation models are then used to augment the LR-XCT morphological features of a new defect with LR-XCT morphological feature values \({X}_{i}^{\prime},\) \(i=\mathrm{1,2},3,\dots , 9\), by the predicted values \({\widehat{Y}}_{j}\), \(j=\mathrm{1,2},3,\dots , 9\).

A linear relationship between the values of HR-XCT morphological feature j and nine LR-XCT morphological features can be built by the linear regression-based features augmentation model (multiple linear regression (MLR)) [68, 69] as follows:

$${Y}_{j}={\beta }_{j}^{0}+{\beta }_{j}^{1}{X}_{1}+{\beta }_{j}^{2}{X}_{2}+\dots +{\beta }_{j}^{9}{X}_{9}+{\epsilon }_{j}, j=\mathrm{1,2},\dots , 9$$
(3)

where \({\beta }_{j}^{0},{\beta }_{j}^{1},{\beta }_{j}^{2},\dots , {\beta }_{j}^{9}\) are the model coefficients estimated as \({\widehat{\beta }}_{j}^{0},{\widehat{\beta }}_{j}^{1}, {\widehat{\beta }}_{j}^{2},\dots , {\widehat{\beta }}_{j}^{9}\), and \({\epsilon }_{j}\) are the error terms with \({\epsilon }_{j}{\sim } N(0, {\sigma }^{2})\), for \(j=\mathrm{1,2},\dots , 9\). The augmented LR-XCT morphological feature values of the new defect can be obtained as the predicted values \({\widehat{Y}}_{j}\) by the linear relationship:

$${\widehat{Y}}_{j}={\widehat{\beta }}_{j}^{0}+{\widehat{\beta }}_{j}^{1}{X}_{1}^{\prime}+{\widehat{\beta }}_{j}^{2}{X}_{2}^{\prime}+\dots +{\widehat{\beta }}_{j}^{9}{X}_{9}^{\prime}, j=\mathrm{1,2},\dots , 9$$
(4)
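The fitting and prediction steps of Eqs. (3) and (4) can be sketched as follows for a single HR-XCT feature \(j\). This is an illustrative ordinary least-squares solver written with the standard library only (normal equations solved by Gauss-Jordan elimination); it is not the implementation used in the study, and the function names and the two-feature toy data are hypothetical. The same code applies unchanged to nine LR-XCT predictors.

```python
def fit_mlr(X, y):
    """Estimate [beta_0, beta_1, ..., beta_p] of Eq. (3) by least squares."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented normal equations [X^T X | X^T y].
    A = [
        [sum(r[i] * r[k] for r in rows) for k in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting (assumes X^T X is invertible).
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]


def predict_mlr(beta, x_new):
    """Eq. (4): augmented feature value for a new defect."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], x_new))


# Toy data (hypothetical): HR feature generated exactly as 0.1 + 0.8*X1 + 0.2*X2.
X_lr = [(0.5, 0.3), (0.7, 0.9), (0.2, 0.4), (0.9, 0.1)]
y_hr = [0.1 + 0.8 * a + 0.2 * b for a, b in X_lr]
beta_hat = fit_mlr(X_lr, y_hr)
y_aug = predict_mlr(beta_hat, (0.6, 0.5))
print([round(b, 6) for b in beta_hat], round(y_aug, 6))
```

In the framework, one such model would be fitted per HR-XCT feature \(j\), each using all nine LR-XCT features of the matched defects as predictors.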

Similarly, the non-linear relationship can be found by the non-linear regression-based features augmentation model as follows:

$${Y}_{j}={F}_{j}\left({X}_{1}, {X}_{2},\dots ,{X}_{9}\right)+{\epsilon }_{j}, j=\mathrm{1,2},\dots ,9$$
(5)

The function \({F}_{j}(\bullet )\) depicts the non-linear relationship between the HR-XCT and LR-XCT morphological features and is used to augment the LR-XCT morphological features of the new defect by the predicted values \({\widehat{Y}}_{j}\) as follows:

$${\widehat{Y}}_{j}={F}_{j}\left({X}_{1}^{\prime}, {X}_{2}^{\prime}, \dots {X}_{9}^{\prime} \right), j=\mathrm{1,2},\dots ,9$$
(6)

The non-linear regression algorithms used in this study include the random forest (RF) regression [70,71,72] and Gaussian process regression (GPR) [57, 60, 73,74,75].

The mean absolute percentage error (MAPE) is used to evaluate the average percent deviation of the augmented LR-XCT morphological feature values (i.e., the values predicted by the features augmentation models) from the actual HR-XCT morphological feature values. The augmented LR-XCT morphological feature values from the model with the lowest MAPE are used as predictors for the defect classification in Sect. 3.3.
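For reference, a short sketch of the MAPE computation is given below, assuming the standard definition (mean of absolute errors relative to the actual values, expressed in percent); the text does not state the formula explicitly. The function name and sample values are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    terms = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical HR-XCT values and their augmented LR-XCT predictions.
hr_values = [1.0, 2.0, 4.0]
augmented = [1.1, 1.8, 4.0]
print(round(mape(hr_values, augmented), 2))  # 6.67
```

Computing this quantity per feature and per augmentation model (MLR, RF, GPR) gives the comparison used to select the predictors for classification.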

3.3 ML-driven defect classification

In this section, we propose a data-driven framework to use augmented LR-XCT morphological features for efficient defect classification. ML-based defect classifiers with different classification algorithms are utilized to classify the defects into different types (i.e., KH, LoF, and GEP) based on the augmented LR-XCT morphological features. The defects with similar augmented LR-XCT morphological features are more likely to be classified into the same defect types. This observation is in line with our prior knowledge of the defects in the L-PBF parts: the same type of defects has similar morphology and size.

To conduct defect classification, we use the augmented LR-XCT morphological feature values \({\widehat{Y}}_{j}\), \(j=\mathrm{1,2},3,\dots , 9\), and the type (denoted as \(c\)) of a defect as the predictors and response, respectively. A function \(G(\bullet )\) is used to classify the defect into its type based on the predictors as follows:

$$c=G({\widehat{Y}}_{1},{\widehat{Y}}_{2},\dots ,{\widehat{Y}}_{9})$$
(7)

The ML-based classifiers \(G\left(\bullet \right)\) used in this study include decision tree [76, 77], random forest (RF) [70,71,72], naïve Bayes [54], k-nearest neighbor (k-NN) [70], and linear support vector machine (SVM) [78, 79]. They will be evaluated by classification accuracy.
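As one concrete instance of \(G(\bullet )\) in Eq. (7), the sketch below implements a plain k-NN classifier with majority voting using the standard library. It is an illustration rather than the study's implementation: only two features are used instead of nine, and the training values (sphericity, and major axis length divided by 100 μm) are hypothetical numbers chosen to mimic the qualitative trends described in Sect. 3.1.

```python
import math
from collections import Counter


def knn_classify(train_features, train_labels, query, k=3):
    """Return the majority label among the k nearest training defects."""
    order = sorted(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: (sphericity, major axis / 100 um) per defect.
features = [
    (0.90, 0.80), (0.85, 0.95), (0.88, 0.70),   # KH: spherical, large
    (0.40, 0.90), (0.35, 1.20), (0.45, 0.60),   # LoF: irregular, elongated
    (0.95, 0.10), (0.92, 0.15), (0.97, 0.08),   # GEP: spherical, small
]
labels = ["KH"] * 3 + ["LoF"] * 3 + ["GEP"] * 3

print(knn_classify(features, labels, (0.93, 0.12)))  # GEP
print(knn_classify(features, labels, (0.42, 0.85)))  # LoF
```

Because k-NN relies on distances, features with very different ranges (e.g., major axis in μm versus ratios between 0 and 1) would normally be rescaled first, which is why the major axis is divided by 100 μm in this toy example.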

4 Case study

4.1 Experiment setup

The L-PBF parts are fabricated by an EOS M290 machine with plasma atomized Ti64 Grade 5 powder (particle size range of 15 to 53 µm) supplied by AP&C — a GE Additive company. The recommended process parameters for Ti64 on this machine are 280 W laser power, 1300 mm/s scanning velocity, 40 µm layer thickness, 120 µm hatch distance, 67° layer rotation, and 10 mm stripe width. To induce the defects (especially KHs and LoFs), we fabricate two parts labeled “K” and “L” with excessive and insufficient energy input by adjusting the laser power and scanning velocity, as summarized in Table 1. The geometry of the fabricated L-PBF parts is shown in Fig. 5, with the upper cylindrical portion machined and scanned with XCT.

Table 1 The process parameters used to fabricate the L-PBF parts K and L and their deviations (in parentheses) from the recommended process parameters (i.e., laser power: 280 W, scanning velocity: 1300 mm/s, and energy density: 44.87 J/mm3)

Fig. 5 The geometry of the L-PBF part and the scanned areas of (a) the HR-XCT scans and (b) the LR-XCT scans. The scanned area of the LR-XCT scan is larger than that of the HR-XCT scan owing to its larger voxel size. Accordingly, the middle 200 slices are selected from the 1000 slices of the LR-XCT scan so that the HR- and LR-XCT scanned areas are similar in size and position. The defects (black spots) are isolated from the HR- and LR-XCT scans through binarization

The HR-XCT scanning is performed by a ZEISS Xradia 620 Versa machine with an X-ray source of 160 kV and 25 W power passing through a ZEISS “HE1” filter, while the LR-XCT scanning is performed by the same machine with an X-ray source of 100 kV voltage and 14 W power through a ZEISS “LE1” filter [25]. For both HR- and LR-XCT scans, 1601 2D projections are collected over a full 360 degrees rotation of the scanned area in each scan. The isotropic voxel sizes of the HR- and LR-XCT scans are 1 µm and 5 µm, respectively. It takes approximately 12 h to complete each HR scan and only 3 h for each LR-XCT scan, even though the LR scans cover a volume 125 times larger than the HR scans. Only a portion of the LR scanned area, matching the HR scanned area, is selected.

The defects are isolated [80] from the XCT scans, and their morphological features are extracted. The volumetric tomography data of the XCT scans are reconstructed by the ZEISS Reconstruction software. As shown in Fig. 5, the defects (black spots) are isolated from these reconstructed images of the HR- and LR-XCT scans through a binary function in ImageJ [81]. As summarized in Table 2, the numbers of the isolated defects significantly decrease with the reduced resolution of the XCT scanning, where only 87 and 164 defects are detected in the LR-XCT scans of parts K and L, respectively, compared to 129 and 911 in the HR-XCT scans.

Table 2 The total number of defects in the HR- and LR-XCT scans of the L-PBF parts K and L. The defects with a large major axis length (\(\ge\) 20 µm) in the LR-XCT scans are used as the target defects (TDs) for defect matching

4.2 Evaluation of the proposed framework

4.2.1 Evaluation of algorithmic defect matching model

The matching algorithm described above is applied to the defects in both parts K and L. Note that only LR defects with major axis lengths larger than 20 µm are selected for matching, i.e., a total of 64 and 79 defects are tested in parts K and L, respectively. A manual inspection, which verifies each pair of matched defects, is performed by AM experts to evaluate the accuracy of the algorithm. Figure 6 illustrates an example of the manual inspection to verify a pair of matched defects in part K. The TD occurs from the 12th to 20th slice of the LR-XCT scans, and its match identified by the algorithm is observed from the 53rd to 78th slice of the HR-XCT scans. The slice numbers of these two defects indicate their similar positions along the Z-axis, since the height of each slice in the LR-XCT scans (5 µm voxels) approximately equals the height of five slices in the HR-XCT scans (1 µm voxels). Furthermore, the relative positions of reference defect 1 (RD 1), reference defect 2 (RD 2), and the TD (or its match) are similar in both the LR- and HR-XCT scans. Therefore, it is concluded that these two defects are correctly matched.

Fig. 6

An example of manually verifying a pair of matched defects identified by the algorithmic defect matching model. The target defect (TD) and its match have similar positions on the Z-axis, and similar relative positions of reference defect 1 (RD 1) and reference defect 2 (RD 2) to the TD (or its match) are observed

Figure 7 depicts the accuracy of the proposed matching algorithm for different values of the weighting parameter λ. The best accuracies, 98.73% for part L and 90.63% for part K, are achieved simultaneously for λ ranging from 0.48 to 0.5, indicating roughly equal importance of defect position and volume in defect matching. For the rest of the study, we select λ equal to 0.5. The reasons for the incorrectly matched defects produced by the algorithmic defect matching model are discussed in the Appendix (see Appendix Fig. 8).

Fig. 7

Accuracy of the algorithmic defect matching model to match the defects in the L-PBF parts L and K with weighting parameter λ varying from 0 to 1. The best defect matching accuracy can be simultaneously achieved as 98.73% (78 out of 79 pairs are correctly matched) and 90.63% (58 out of 64 pairs are correctly matched) for the defects in parts L and K, respectively, when λ equals 0.5
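
To illustrate how a single weight λ can trade off position against volume, the sketch below scores each HR candidate with a convex combination of a normalized position distance and a normalized volume difference. This is only an illustration and not the matching algorithm defined earlier in the paper: the cost form, which term λ multiplies, the normalization scales `pos_scale` and `vol_scale`, and the toy defects are all assumptions.

```python
import math


def match_cost(td, cand, lam, pos_scale, vol_scale):
    """Hypothetical cost: lam weights position, (1 - lam) weights volume."""
    pos_term = math.dist(td["pos"], cand["pos"]) / pos_scale
    vol_term = abs(td["vol"] - cand["vol"]) / vol_scale
    return lam * pos_term + (1.0 - lam) * vol_term


def best_match(td, candidates, lam=0.5, pos_scale=1.0, vol_scale=1.0):
    """Return the candidate with the lowest combined cost."""
    return min(
        candidates,
        key=lambda c: match_cost(td, c, lam, pos_scale, vol_scale),
    )


# Toy data (made up): positions in um, volumes in um^3.
td = {"id": "TD", "pos": (100.0, 100.0, 50.0), "vol": 4000.0}
candidates = [
    {"id": "A", "pos": (102.0, 101.0, 50.0), "vol": 9000.0},  # near, wrong size
    {"id": "B", "pos": (110.0, 100.0, 50.0), "vol": 4200.0},  # farther, right size
]

for lam in (0.0, 0.5, 1.0):
    chosen = best_match(td, candidates, lam, pos_scale=10.0, vol_scale=1000.0)
    print(f"lambda={lam}: matched to {chosen['id']}")
```

In this toy case, position alone (λ = 1) selects the nearby but much larger candidate A, whereas including volume (λ = 0.5 or 0) selects candidate B, which shows why an intermediate weight can outperform either extreme.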

Fig. 8

Illustrations of four mismatched pairs of defects computed by the algorithmic defect matching model

4.2.2 Evaluation of features augmentation models

A separate prediction model is constructed to augment each of the features. For each target feature to be augmented, its corresponding HR-XCT morphological feature is used as the response variable, and the LR-XCT morphological features are the predictors. For each of the nine response variables, we construct multiple linear regression (MLR), random forest (RF), and Gaussian process regression (GPR) models. Only those LR-XCT morphological features that are statistically significant to the responses are used as predictors in the MLR-based models to improve the prediction accuracy. In contrast, all the LR-XCT morphological features are used as predictors in the RF- and GPR-based feature prediction models. Five-fold cross-validation is employed in all cases. Mean absolute percentage error (MAPE) is used to evaluate model accuracy.
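
The evaluation protocol (MAPE under five-fold cross-validation) can be sketched in plain Python as follows. The single-predictor least-squares fit stands in for the MLR, RF, and GPR models, whose implementations are not given in the text, and the data are synthetic. The helper names `mape`, `kfold_indices`, `fit_line`, and `cv_mape` are illustrative.

```python
import random
from statistics import mean


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (y_true must be non-zero)."""
    return 100.0 * mean(abs((t - p) / t) for t, p in zip(y_true, y_pred))


def kfold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (stand-in model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def cv_mape(x, y, k=5, seed=0):
    """Average held-out MAPE over k folds."""
    scores = []
    for test_idx in kfold_indices(len(x), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        b0, b1 = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [b0 + b1 * x[i] for i in test_idx]
        scores.append(mape([y[i] for i in test_idx], preds))
    return mean(scores)


# Synthetic LR/HR feature pairs with a 10% alternating perturbation.
x = [20.0 + 3.0 * i for i in range(10)]
y = [(4.0 + 1.1 * xi) * (1.1 if i % 2 else 0.9) for i, xi in enumerate(x)]
print(f"5-fold CV MAPE: {cv_mape(x, y):.1f}%")
```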

Table 3 summarizes the results. Overall, the average MAPE values of these features augmentation models indicate that most augmented LR-XCT morphological features deviate by approximately 20% on average from the HR-XCT morphological features (assumed to be the ground truth), with the exceptions of extent (42%), flatness (14%), and sphericity (12%). The non-linear regression-based models outperform MLR, with smaller MAPE values, in predicting all eight derived morphological features, that is, every feature except the major axis. Since these eight morphological features are derived through non-linear calculations (see Fig. 3), it is reasonable to expect that a non-linear ML model may be more suitable.

Table 3 The average MAPE values and standard deviations (in parentheses) of the linear regression and non-linear regression-based features augmentation models using MLR, RF, and GPR algorithms on nine morphological features

On the other hand, the linear regression model slightly outperforms the non-linear models for the major axis, a directly measured feature. This model has a single predictor, the LR-XCT major axis, which is the only feature that is statistically significant to the response and has a relatively strong linear correlation with it. The model takes the form:

$${\widehat{Y}}_{\text{major axis}}=4.032+1.076\,{X}_{\text{major axis}}$$
(8)

The estimated model coefficients indicate that the major axis lengths of defects are shrunk in the LR-XCT scans. It is worth noting that all models’ performance is very close in all cases.
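
As a worked example of Eq. (8), the snippet below applies the reported coefficients to two LR-XCT major axis lengths. The input values are arbitrary illustrations, and the unit (µm) is assumed from the way major axis lengths are reported elsewhere in the text.

```python
# Coefficients reported in Eq. (8); unit of the intercept assumed to be um.
INTERCEPT_UM = 4.032
SLOPE = 1.076


def augment_major_axis(lr_major_axis_um: float) -> float:
    """Predict the HR-XCT major axis length from the LR-XCT measurement."""
    return INTERCEPT_UM + SLOPE * lr_major_axis_um


for lr_um in (20.0, 50.0):  # illustrative inputs, not measured values
    hr_um = augment_major_axis(lr_um)
    print(f"LR {lr_um:.1f} um -> predicted HR {hr_um:.3f} um")
```

Because the intercept is positive and the slope exceeds one, the predicted HR length is larger than the LR input for any positive length, which is consistent with the statement that defects appear shrunk in the LR-XCT scans.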

Although the obtained average accuracy (about 20% MAPE) can be viewed as relatively low, our goal is defect classification, and we argue (and experimentally demonstrate in the next section) that even low-accuracy feature predictions can still improve the classification accuracy.

4.2.3 Evaluation of defect classification

For the correctly matched pairs of defects, we label the types (i.e., KH, LoF, and GEP) of defects based on their morphologies and sizes in the HR-XCT scans. Five AM experts experienced in defect classification individually labeled these defects in the HR-XCT scans, and only the defects labeled with a consensus (at least four out of five experts) are included in this study. As a result, a dataset of 131 defects (out of 136 correctly matched pairs of defects), including 31 KHs (23.7%), 73 LoFs (55.7%), and 27 GEPs (20.6%), along with their morphological features, is assembled for defect classification. The dataset is randomly divided into training (70%) and testing (30%) sets, each with the same proportion of defect types as the whole dataset. Five popular ML models, namely decision tree, random forest (RF), naïve Bayes, k-nearest neighbor (k-NN), and linear support vector machine (SVM) classifiers, are used for classification and compared according to their accuracy.
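
A stratified 70/30 split of this kind can be sketched as below, using only the class counts given above. Rounding the test size per class is an assumption of this sketch, and the function name `stratified_split` is illustrative. The actual assignment of defects to the two sets in the study is not reproduced here.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split indices so each class keeps roughly the same proportion."""
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)
    rng = random.Random(seed)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)  # per-class rounding (assumed)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test


# Class counts from the text: 31 KHs, 73 LoFs, 27 GEPs (131 defects).
labels = ["KH"] * 31 + ["LoF"] * 73 + ["GEP"] * 27
train_idx, test_idx = stratified_split(labels)
test_counts = Counter(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), dict(test_counts))
```

With these counts, per-class rounding gives a test set of 9 KHs, 22 LoFs, and 8 GEPs (39 defects in total).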

Model performance is summarized in Table 4, where the rows correspond to ML models, and the three columns represent the predictors used: LR-XCT morphological features, augmented LR-XCT morphological features, and HR-XCT morphological features. Naturally, classifiers trained and tested on HR-XCT features show the best performance (94–97%, depending on the model) since the HR-XCT scans provide the most accurate representation of the actual shape and size of the defects. On the other hand, if only LR-XCT data are available, the defect classification accuracy drops to 77–82%. Most importantly, for 4 out of 5 tested models (all except the decision tree classifier), the introduction of augmented LR-XCT morphological features significantly improves the accuracy. For the best model (i.e., k-NN), the accuracy improves from 82.9 to 90.9%. While augmentation cannot completely overcome the limitations of the LR images (the accuracy is still lower than what is possible with HR scans, i.e., 94% for k-NN), the proposed framework can substantially reduce the scanning time required for defect classification, from the time-consuming HR-XCT scanning (12 h for each scan) to the LR-XCT scanning (3 h for each scan).

Table 4 Average defect classification accuracy and standard deviation (in the parenthesis) of the classifiers using the LR-XCT, augmented LR-XCT (ALR-XCT), and HR-XCT morphological features 

Since the k-NN classifiers using LR-XCT and augmented LR-XCT morphological features have the highest average accuracies among all the classifiers using the same predictors, they will be used as the classifiers in the proposed framework. The k-NN classifiers measure the distances among the defects in the multi-dimensional augmented LR-XCT morphological features space and classify them by the majority of defect types among their k nearest neighbors. In this study, ten nearest neighbors (i.e., defects in the training datasets) measured by Euclidean distance are used in the vote to classify the new defects (i.e., defects in the test dataset). Each neighbor is inverse distance weighted (i.e., weight = 1/distance), where the defects closer to the new defects have more impact on the classification result.
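
The voting rule just described (Euclidean distance, k nearest training defects, weight = 1/distance) can be written compactly. The following is a minimal sketch in plain Python with toy two-dimensional features. The function name `knn_predict`, the toy data, and the handling of an exact zero-distance match are assumptions of this sketch rather than details given in the text.

```python
import math
from collections import defaultdict


def knn_predict(train_X, train_y, query, k=10):
    """Inverse-distance-weighted k-NN vote using Euclidean distance."""
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    votes = defaultdict(float)
    for dist, label in neighbors:
        if dist == 0.0:
            return label  # exact match: take its label directly (assumed)
        votes[label] += 1.0 / dist  # closer neighbors count more
    return max(votes, key=votes.get)


# Toy feature vectors (made up), k reduced to 3 for readability.
train_X = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
train_y = ["KH", "LoF", "LoF"]
print(knn_predict(train_X, train_y, (0.5, 0.0), k=3))
```

Here an unweighted majority vote would return LoF (two of three neighbors), but the single very close KH neighbor outweighs them under inverse-distance weighting. In scikit-learn, a comparable configuration would be `KNeighborsClassifier(n_neighbors=10, weights="distance")`, although the text does not state which library was used.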

To further validate the improvement in defect classification accuracy resulting from the proposed framework, we present the confusion matrices of the k-NN-Augmented LR (ALR) and k-NN-LR classifiers on the test dataset shown in Table 5. Compared to the k-NN-LR classifier, the k-NN-ALR classifier correctly distinguishes all the LoFs from the other two types of defects and identifies more KHs. These improvements indicate that the proposed framework can augment the LR-XCT morphological features to identify more detrimental LoFs, which are more likely to initiate cracks due to their irregular shape and sharp edges [17, 23, 30], and KHs, which negatively impact the fatigue lives of the L-PBF parts more than GEPs with smaller sizes [82, 83].

Table 5 Confusion matrices of classification performance of the k-NN-ALR classifier (a) and k-NN-LR classifier (b) on the test dataset consisting of 8 GEPs, 22 LoFs, and 9 KHs
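
For readers unfamiliar with confusion matrices, the sketch below shows how one is tallied from true and predicted labels and how overall accuracy follows from its diagonal. The six labels are a made-up toy example and do not reproduce the entries of Table 5.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


classes = ["GEP", "LoF", "KH"]
# Toy labels only: one GEP is misclassified as a KH.
y_true = ["GEP", "GEP", "LoF", "LoF", "LoF", "KH"]
y_pred = ["GEP", "KH", "LoF", "LoF", "LoF", "KH"]

matrix = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, matrix):
    print(name, row)
print(f"accuracy: {accuracy(matrix):.3f}")
```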

Lastly, to examine the effect of the proposed framework on classifying the largest defects, which are most important to the structural integrity of the L-PBF parts among all the defects [36, 37], we conduct another defect classification based on the defects with major axes larger than 50 µm using the k-NN classifier. Note that this is a binary classification since only 17 labeled LoFs and 11 labeled KHs are longer than 50 µm among all the 131 defects used in this study. The average classification accuracy summarized in Table 6 shows that the k-NN-LR classifier has a high accuracy of 96.3%, which indicates that even though the LR-XCT morphological features are less accurate in representing the actual morphologies and sizes of defects, these features can still be used to classify most of the largest defects accurately. Moreover, the k-NN-ALR classifier shows no improvement in accuracy over the k-NN-LR one in classifying the largest defects, which indicates that the proposed framework is more effective in augmenting the LR-XCT morphological features of defects between 20 and 50 µm for a higher defect classification accuracy.

Table 6 Average defect classification accuracy and standard deviation (in the parenthesis) of the k-NN classifiers using the LR-XCT, ALR-XCT, and HR-XCT morphological features to classify the largest defects (major axis \(\ge\) 50 µm)

5 Conclusion and future works

In this study, we propose a data-driven framework to augment LR-XCT defect inspection and classification using limited HR-XCT data. This can greatly improve the efficiency and accuracy of LR-XCT inspection and classification, promote the nondestructive inspection of L-PBF parts, and pave the way to understand the impacts of defect on fatigue performance of L-PBF parts. It centers on time-efficient LR-XCT scanning to improve its efficiency and utilizes ML to augment the morphological features extracted from the LR-XCT scans for improved accuracy of defect classification. Specifically, nine morphological features (solidity, sparseness, extent, sphericity, roundness, aspect ratio, elongation, flatness, and major axis length), which describe the morphologies and sizes of defects, are derived to distinguish different defect types and extracted from the LR- and HR-XCT scans. An algorithm for matching the same defects observed in LR-XCT and HR-XCT images is developed, which uses the defect positions and volumes to match with an accuracy exceeding 95%. The LR-XCT morphological features are augmented with regression-based features augmentation models, which build linear (using the MLR algorithm) or non-linear (using RF and GPR algorithms) relationships between the LR-XCT and HR-XCT measurements. It is then observed that for a collection of classification models, using augmentation does indeed significantly improve the classification accuracy. We conclude that the k-NN classifier exhibits the best performance, with 82.9% accuracy for LR features only, which can be improved to 90.6% with augmentation. Furthermore, for the largest defects that are more important for the structural integrity, specifically the fatigue performance, the k-NN classifier can classify them with high accuracy of 96.3% using LR features with or without augmentation.

The classification results showing the types of defects can be used as an effective method to identify and improve the quality of fabricated parts by the L-PBF process. Since large-scale HR-XCT scanning is not feasible for most practical applications due to inherent limitations (long scanning time, limited scanned area, and high cost), the proposed framework can be an efficient but sufficiently accurate way for defect classification with LR-XCT scans.

It should be noted that KH, LoF, and GEP are the only three types of internal defects considered in this research. Some authors distinguish other possible types, such as unmelted powder particles or internal cracking [19, 53, 84]. For the samples manufactured in this study, such less often identified types were not present (as analyzed by the experts involved in labeling the defects). Hence, we believe that this choice does not significantly limit our conclusions. At the same time, it may pose a challenge for application of the proposed methodology to other datasets with other defect types present. Specifically, as defined, the algorithms proposed here are not restricted to the three defect types considered. Indeed, as long as the defect types can be distinguished based on morphological features, the proposed ML methods can be expected to retain some level of the predictive power, and augmenting LR-XCT with HR-XCT can be expected to improve accuracy. For example, internal cracking is reported to be irregularly shaped, and both longer and more elongated than other types of defects with a larger major axis and lower aspect ratio [85,86,87]. Consequently, if such cracking is present in the training dataset (and appropriately labeled), then a new 4-class (KH, LoF, GEP, and internal cracking) k-NN classifier, which can distinguish the internal crack from other types of defects by a combination of morphological features, can be trained according to the framework described here. However, it is difficult to establish a priori whether such a classification problem may be more difficult than the 3-class classification considered so far, and hence, whether the proposed classification method will retain the high accuracy observed here. Given the reported distinct morphological features of cracking, we can surmise that in principle, the approach may be expected to perform well, but ultimately, further experiments are needed to reveal the accuracy in cases where other types of defects are of interest.

In addition to expanding the types of defects considered, a number of other promising future research directions can be proposed, including:

  1. Evaluate the proposed framework by the defect classification accuracy with a large number of defects in the newly fabricated L-PBF parts.

  2. Enhance the accuracy of the algorithmic defect matching model by filtering the target defects in the LR-XCT scans and the features augmentation models by using more features extracted from the XCT scans.

  3. Apply the proposed framework to search for the optimal fabrication conditions of L-PBF parts through the defect classification results.