1 Introduction and motivation

Metal additive manufacturing (AM) is an evolving technology for producing metal parts for machinery, aerospace, and medical applications. Metal AM is advantageous for manufacturing complex parts with intricate geometrical designs, reduced weight and material waste, and increased production volume [1]. Owing to these advantages, AM is revolutionizing the biomedical, aerospace, and automation industries [2, 3]. Metal AM can be achieved using powder bed fusion [4], directed energy deposition (DED) [5, 6], and sheet lamination [7]. Variations of the DED system include different energy sources, deposition materials, and feedstock forms, such as alloy powder and wire [8, 9]. This paper focuses on a directed energy deposition process for Ti-6Al-4V material. A laser power source is concentrated on the substrate to melt the heated Ti-6Al-4V wire feedstock. A development goal in the AM industry is in situ quality control that yields defect-free printed parts with accurate build geometry. Deviations in the build geometric parameters can cause defects in the final deposition, and such manufacturing defects are detrimental to the overall build quality.

Uncontrolled geometry and dimensional accuracy of the printed bead can cause accumulated residual stress and part distortion, which may further develop into cracks and porosity defects during the AM process. In situ quality control has therefore drawn much attention as a way to avoid manufacturing defects and improve part quality. Work has been done to improve geometry quality by monitoring the molten pool sensing data, and research is still ongoing in this field. [10] developed a two-input single-output hybrid control system to control height growth and molten pool temperature at each deposition layer. [11] developed a proportional-integral-derivative (PID) and fuzzy logic-based model to control the height of the deposit using measurements from an in-process monitoring camera, with laser power as the dominant control variable. [12] developed a control architecture for temperature and build height using a pyrometer and camera. Laser power and stand-off distance were used as process control parameters, and the results were analyzed by comparing the build performance with and without control. [13] developed a feedforward clad height controller for the laser solid freeform fabrication process, and the results were evaluated experimentally. [14] designed a multi-variable control model for controlling the layer height and molten pool temperature using laser power and scanning speed.

The literature on controlling the final part's build geometry is based on the assumption that bead geometry is related to the molten pool dimensional and temperature information, which in turn can be controlled by the process parameters. Under this assumption, existing controllers aim to achieve a steady molten pool by adjusting the process parameters. However, such control is indirect: it keeps the bead geometry properties stable but cannot estimate their specific values. To achieve specific bead geometry properties, direct modeling between molten pool sensing data and bead geometry is crucial for real-time control of the bead deposition quality. Considering the complexity of the laser wire-feed AM process, it would be ideal to achieve direct in situ bead geometry control using the molten pool sensing data. If the fluctuations in bead geometry can be monitored in real time, the overall bead quality can be controlled more effectively. Thus, in-process geometry monitoring is crucial for efficient in situ quality assurance.

Bead geometry modeling and prediction have mainly been performed with thermal history, energy input conversion, or experimental data using empirical techniques. In the past few years, some researchers have estimated the printed bead's geometric properties using numerical, analytical, and finite element modeling (FEM) [15, 16]. However, these traditional methods have two major limitations. First, when a developed simulation model is applied to real-time sensing data, its output might not agree with the final bead geometry, because the actual process deviates from the simulated one and the sensing data are noisy and carry measurement uncertainties. Using actual system data as input to the simulation model therefore results in uncertainty or inaccuracy in the final property prediction. Second, the high computational cost of traditional methods limits their real-time prediction capability.

Machine learning (ML) techniques such as deep neural networks, multilayer perceptrons, and regression modeling have recently been adopted in the AM field for monitoring and control. Convolutional neural networks (CNN) have proven effective at extracting and processing meaningful feature information from molten pool image data for bead geometry modeling and prediction. For example, [17] compared the performance of linear and non-linear regression models for predicting the build's length, width, and thickness. The input to the model includes part orientation, STL properties, and part placement. [18] presented a passive two-camera vision system for real-time prediction of bead height and width. Image processing and filtering techniques were applied to the camera images, and experiments were performed to validate the method for a gas metal arc welding (GMAW) setup. [19] predicted the reinforcement and penetration depth of the weld by using image data and an optimized ResNet34 as the network structure. [20] developed six different CNN architectures to analyze molten pool images and predict clad bead height and width. [21] presented a welding case study to predict the backside bead width from images using a CNN and a recurrent neural network (RNN). [22] developed a multilayer neural network (NN) and a second-order regression model to predict the bead height and width using process parameters (PP) as input for the GMAW system. In comparison, [23] developed a regression model for predicting the bead height, width, and depth by using process parameters as input. Current work on geometric parameter prediction thus applies ML to either a single type of sensing data or the process parameters.

One key to training a good ML model is the quality and size of the training dataset. However, conducting the printing work and measuring the post-process characterization data are costly and time-consuming. Measuring the geometric and microstructural properties of the deposited build requires sophisticated high-end machinery together with tedious manual labor. In addition, bead geometry measurement requires destructive analysis that involves cross-sectioning the printed bead. Because of this limitation, the current ML-based geometry estimation models were mainly developed with simplifications. The modeling and control architectures for bead geometry properties are mainly based on a single-input system. The process is simplified to a single sensing parameter and is focused on a limited number of geometry properties such as bead height, width, and/or penetration depth. Bead width and height are relatively easy to estimate because they can be measured without bead incision, whereas measuring the fusion zone depth and area is relatively costly due to sample preparation and characterization. Because of these simplifications, current ML-based geometry estimation cannot provide comprehensive geometry monitoring. Table 1 summarizes the reviewed literature on build geometry parameter estimation in the AM domain.

Table 1 Overview of machine learning-based approach techniques for build geometry estimation

To solve these issues, a multi-modality model is developed for the comprehensive prediction of four geometric parameters: bead height (H), bead width (W), fusion zone depth (D), and fusion zone area (A). The contributions of the current work are as follows: (1) Experimental data are collected for different settings of process parameters on a wire-feed laser additive manufacturing (WLAM) system for single-bead deposition. (2) Post-process characterization is performed on the deposited bead to measure its geometric properties. (3) A multi-modality CNN model is designed to predict the bead geometric characteristics using input features extracted from real-time molten pool images and temperature data. (4) Analysis is performed to characterize the effect of different thermal profiles on the prediction performance of the CNN model. (5) The optimized CNN model is compared with a traditional support vector regression (SVR) model in terms of prediction accuracy, using cross-validation.

2 Experiment and instrumentation

2.1 Sensor integration and data acquisition system setup

The WLAM DED system has been developed and installed at Oak Ridge National Laboratory in Knoxville, Tennessee. A 6 kW laser is delivered to the end effector of the robot arm in an argon-filled environment. The feedstock is 1.5875 mm Ti-6Al-4V welding wire per the AMS 4954K specification. The laser WLAM DED robot setup, along with the mounted sensors, is shown in Fig. 1. The sensors were selected to capture as much data as possible from the process during operation. Five categories of data were collected: (1) visual, (2) thermal, (3) positional, (4) chemical, and (5) acoustic. Two Prosilica GT1930C cameras were mounted to the robot head, one directly coaxial with the process and the other at a 90° oblique angle to the primary direction of travel. The CMOS camera is connected through an Ethernet interface and records 1936 × 1216 pixel images at 25 frames per second (fps) using the NI PXIe-8234 vision module.

Fig. 1 Integrated laser hot wire-feed DED system

The main issue in the laser-based AM process is that the molten pool is too bright for the camera to capture its surface morphology directly. Hence, bandpass filters are mounted in front of the camera to reduce the intensity. Three pyrometers with temperature ranges of 50–400 °C, 200–1500 °C, and 1000–2000 °C measure the leading (Optris CTlaser 3M), trailing (Optris CT XL 3M), and molten pool (Optris CTlaser 05M) temperatures, respectively. The pyrometers are calibrated for emissivity using a heated plate and physical contact measurements with thermocouples. Note that the molten pool (MP) pyrometer could not be easily calibrated using a similar method, so the presented data are considered relative and not absolute. The leading edge (LE) and trailing edge (TE) pyrometers are pointed approximately 25 mm in front of and behind the molten pool, respectively. Temperature data are collected at 100 Hz using an NI PXIe-4302 analog input module. An acoustic sensor operating at a frequency of 1 kHz is mounted on the laser head to capture variations during the build. Analysis of the sensor signals relative to the process during stable and unstable operations will determine how each sensor can be used in the control logic. The National Instruments (NI) industrial controller NI PXIe-8880, along with the vision development module and analog/digital I/O module, is used for monitoring and controlling the laser DED system through LabVIEW. This paper focuses on studying the in situ sensing data from the coaxial camera and the three pyrometers.

2.2 Data preparation and processing conditions

In this study, thirteen experiments were conducted with different values of the process parameters, i.e., laser power (LP), travel speed (TS), wire feed rate (WFR), and hot wire power (HWP), for a 100-mm single-bead deposition. Table 2 shows the combinations of process parameters used for data collection of WLAM printing and the corresponding geometric properties. The collected characterization data for single-bead deposition include four geometric parameters: bead height (H), bead width (W), fusion zone depth (D), and fusion zone area (A). The process parameters vary in the range of 4000–6000 W, 3.5–10 mm/s, and 40–71.3 mm/s for laser power, travel speed, and wire feed rate, respectively. The molten pool dynamics during material deposition consist of both steady and transient states. The full-frame images from the coaxial camera are large, and much of each frame covers regions that are not of interest. The molten pool image is therefore preprocessed by selecting a region of interest (ROI) and cropping to it, which reduces the image size without losing information and keeps the data within the hardware processing capability. The coaxial camera images are cropped to 481 × 566 pixels to be processed and trained in MATLAB. The dataset consists of 6500 images from 13 builds, i.e., 500 stable-state molten pool images from the coaxial camera and the corresponding temperature data from each build. Of the 13 builds, 11 are used for training the network, one for testing, and the remaining one for validation.

Table 2 Process parameter setting for the collected sensing data and geometric properties

The bead geometry of the printed single-bead depositions from the WLAM setup is quantified. The bead characterization process used in this study involves no heat treatment of the printed samples. The printed bead is cross-sectioned to expose its geometrical properties. The specimens were first polished with SiC papers on a Struers LaboForce 100 machine. The bead geometrical properties, i.e., bead height, bead width, fusion zone depth, and fusion zone area, are then measured on a Keyence VHX-5000 optical microscope. The dimensional and geometric information for a single-bead deposition from a cross-section is shown in Fig. 2.

Fig. 2 Experimental data collection: a WLAM system operating under a set of controlled process parameters, b printed beads for different settings of process parameters, c characterization analysis for bead geometry measurement

2.3 Thermal data selection strategy

Increasing the number of features through feature engineering raises the problem's dimensionality, which leads to the “curse of dimensionality” [24, 25]. It is therefore advisable to identify and discard insignificant thermal data obtained from the experiments, which reduces the network's complexity during training. Real-time sensing data are noisy and subject to disturbances from the environment and the sensing system. Hence, thermal feature selection is crucial for identifying the features most relevant to mapping the input data to the output geometric properties. Feature selection also helps improve the CNN model's performance by analyzing the relationship between thermal data and geometric properties. Feature correlation and redundancy were evaluated using Pearson correlation [26]. The scikit-learn Python implementation of these algorithms [27] was used. The Pearson correlation between feature pairs \({r}_{{x}_{ij}}\) or between a feature and a property \({r}_{xy}\) uses the standard definition,

$${r}_{xy}=\frac{{\sum }_{i=1}^{n}\left({x}_{i}-\overline{x }\right)\left({y}_{i}-\overline{y }\right)}{\sqrt{{\sum }_{i=1}^{n}{\left({x}_{i}-\overline{x }\right)}^{2} }\sqrt{{\sum }_{i=1}^{n}{\left({y}_{i}-\overline{y }\right)}^{2}}}$$
(1)

where n is the sample size, \({x}_{i}\) and \({y}_{i}\) are the individual sample points, and \(\overline{x }\) and \(\overline{y }\) are the sample means.
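As a minimal sketch of Eq. (1), the coefficient can be computed directly from its definition. The function name `pearson_r` and the sample values below are hypothetical and serve only as an illustration; they are not measured data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences (Eq. 1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator of Eq. (1): sum of products of deviations from the means.
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator terms: sums of squared deviations.
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    var_y = sum((yi - y_bar) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (math.sqrt(var_x) * math.sqrt(var_y))


# Hypothetical feature/property values, for illustration only.
mp_temperature = [1510.0, 1540.0, 1575.0, 1600.0, 1640.0]
bead_width = [7.1, 7.4, 7.9, 8.0, 8.6]
print(round(pearson_r(mp_temperature, bead_width), 3))
```

A value near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates only that no linear relationship is present.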

2.4 Modeling methodology

2.4.1 Convolutional neural network

Figure 3 shows the Sensing-Geometry (S-G) CNN model, which takes molten pool sensing data from the pyrometers and camera as inputs and outputs the corresponding build geometric properties. In situ sensing data are collected from a WLAM system equipped with sensors that record the molten pool evolution during printing. Next, the image data collected during the 13 builds are synchronized with the corresponding molten pool temperature data points, as shown in the input block (Fig. 3). The effects of molten pool features on quality characteristics and their correlations are analyzed. The CNN architecture performs automatic feature extraction via convolution and pooling layers to obtain the most meaningful features from the molten pool image data. The fully connected layer takes the extracted image features in conjunction with the molten pool temperature for geometric property estimation. The network is trained on the steady-state molten pool sensing data to estimate the geometry parameters. The trained S-G network can then predict the quality properties directly from unseen real-time sensing data collected during the build.
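The synchronization step can be illustrated with a short sketch. The text does not state how frames and temperature samples were aligned, so the nearest-in-time pairing below, the shared start time of both streams, and the function names are all assumptions; only the 25 fps and 100 Hz rates come from Section 2.1.

```python
def nearest_temperature_index(frame_index, frame_rate_hz=25.0, temp_rate_hz=100.0):
    """Index of the temperature sample closest in time to a camera frame.

    Assumes both streams start at t = 0 (an assumption, not stated in the text).
    """
    frame_time_s = frame_index / frame_rate_hz
    return int(round(frame_time_s * temp_rate_hz))


def synchronize(num_frames, temperatures, frame_rate_hz=25.0, temp_rate_hz=100.0):
    """Pair each frame index with the nearest-in-time temperature reading."""
    pairs = []
    for frame_index in range(num_frames):
        k = nearest_temperature_index(frame_index, frame_rate_hz, temp_rate_hz)
        if k >= len(temperatures):
            break  # no temperature sample was recorded at this frame time
        pairs.append((frame_index, temperatures[k]))
    return pairs


# Hypothetical temperature trace: 20 samples at 100 Hz (0.2 s of data).
temperatures = [1500.0 + i for i in range(20)]
print(synchronize(5, temperatures))
```

With these rates, every fourth temperature sample coincides with a camera frame, so each image is paired with exactly one reading.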

Fig. 3 Sensing-geometry relations modeling using m-CNN for geometric parameter prediction using molten pool images and temperature data

CNNs are well suited to multi-dimensional image data because they learn the intricate details and features of the input that are responsible for the output task, whether classification or regression [28,29,30,31]. A CNN architecture is composed of different layers, such as convolution, normalization, activation, pooling, and fully connected layers, for output property classification or prediction. The image data are first processed using a filter, generally known as the kernel. The convolutional layer performs feature extraction using a user-specified filter and stride value. The kernels are weights that are updated continuously as the network learning process progresses. The generated feature map is represented in Eq. (2) as,

$$F\left(i,j,k\right)=\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}\sum_{l=0}^{L-1}E\left(i+m,j+n,l\right){K}_{k}\left(m,n,l\right)+{b}_{k}$$
(2)

where \(E\) is the extracted features from the previous layer, \({K}_{k}\) is the applied filter, \({b}_{k}\) is the bias, \((M, N, L)\) is the size of the filter, and \(L\) is the 3rd dimension of the previous feature map layer.
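Eq. (2) can be sketched in a few lines for a single filter. This is an illustrative, unoptimized implementation using nested lists; the function names, the absence of padding, and the toy input are assumptions made for the example and do not reflect the MATLAB implementation used in this work.

```python
def conv_feature(E, K, b, i, j):
    """Evaluate F(i, j, k) of Eq. (2) for one filter K with bias b.

    E is indexed E[row][col][channel]; K is indexed K[m][n][l].
    """
    M, N, L = len(K), len(K[0]), len(K[0][0])
    total = 0.0
    for m in range(M):
        for n in range(N):
            for l in range(L):
                total += E[i + m][j + n][l] * K[m][n][l]
    return total + b


def conv_layer(E, K, b, stride=1):
    """Slide one filter over E (no padding) and return a 2-D feature map."""
    M, N = len(K), len(K[0])
    rows, cols = len(E), len(E[0])
    return [
        [conv_feature(E, K, b, i, j) for j in range(0, cols - N + 1, stride)]
        for i in range(0, rows - M + 1, stride)
    ]


# Toy 3 x 3 single-channel input and a 2 x 2 filter of ones (hypothetical values).
E = [[[1.0], [2.0], [3.0]],
     [[4.0], [5.0], [6.0]],
     [[7.0], [8.0], [9.0]]]
K = [[[1.0], [1.0]],
     [[1.0], [1.0]]]
print(conv_layer(E, K, b=0.5))
```

A layer with several filters repeats this computation once per filter index k, stacking the resulting maps along the third dimension.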

The output feature map is the response of the input image data to the features specified by the kernel. Adding more convolutional layers along the hierarchy allows the network to identify more detailed, abstract features responsible for the output prediction. A normalization layer is applied to deal with non-uniform scaling of the image data and covariate shift in the layers. The addition of a normalization layer has also been shown to speed up the training process [32] and to reduce training problems caused by the learning rate and overfitting. The most common activation functions are sigmoid, hyperbolic tangent, ReLU, and clipped ReLU. In the current work, a clipped ReLU activation function is used to introduce non-linearity. Finally, a pooling layer reduces the dimensionality added by the different layers and makes the generated feature map less sensitive to feature location by introducing local invariance. The pooling layer downsamples the data to decrease the computational load and improve the robustness of the algorithm. The standard pooling layers are max pooling, average pooling, and global pooling, chosen according to the application at hand. The output from the last pooling layer is a feature vector used as input to the first fully connected layer. The output from the fully connected layer is passed to a fully connected network for final data prediction. The activation function, pooling layer type, and number of layers in the CNN are user-defined parameters selected for the best network performance.
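The two operations named above that are used in this work, clipped ReLU activation and global average pooling, can be sketched as follows. The ceiling value of 10 is a placeholder assumption, since the clipping threshold is not reported here, and the function names are illustrative.

```python
def clipped_relu(x, ceiling=10.0):
    """Clipped ReLU: 0 for negative inputs, identity up to `ceiling`, flat above it.

    The default ceiling is an assumed placeholder, not a value from the paper.
    """
    return min(max(0.0, x), ceiling)


def global_average_pool(feature_maps):
    """Reduce each 2-D feature map to its mean, giving one value per channel."""
    pooled = []
    for fmap in feature_maps:
        values = [v for row in fmap for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Two hypothetical 2 x 2 feature maps.
maps = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [0.0, 8.0]]]
print([clipped_relu(v) for v in (-3.0, 4.0, 25.0)])
print(global_average_pool(maps))
```

Because global average pooling collapses each map to a single number, its output no longer depends on where in the image a feature occurred, which is the location invariance described above.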

The layers are connected in a user-defined manner to suit the specific application and dataset. The designed and optimized network is trained using gradient descent with back-propagation. The hyperparameters used to train the architecture are tuned so that the network accurately predicts the output class or quantitative values. The network's objective is to minimize the mean square error, which serves as the training loss, as specified in Eq. (3),

$${Loss}_{MSE}=\sum_{n=1}^{N}l({y}^{n},{\widehat{y}}^{n})/N$$
(3)

where \(l\) is the loss function, \({y}^{n}\) is the target value, \({\widehat{y}}^{n}\) is the predicted value, and N is the size of the training data.

2.4.2 Support vector regression

Another widely adopted ML technique for modeling complex data is regression. Regression modeling is a predictive technique for mapping the relationship between the independent input data and the dependent output target [33]. The most common regression methods are linear, polynomial, logistic, support vector, and Gaussian process regression. Support vector machines are popular supervised machine learning models for classification and regression tasks [34]. In a performance comparison of different traditional regression models, SVR gave the best performance. To quantify the improvement of automatic feature extraction over the traditional manual approach, this regression model is therefore selected as the benchmark for comparison with the proposed CNN architecture.

SVR achieves generalized performance by minimizing the generalization error bound rather than the observed training error. The generalization error bound combines the training error with a regularization term that controls the complexity of the hypothesis space. The goal is to search for a function that predicts the output based upon the prediction error between the actual and predicted values [26]. The kernel function is used to fit the input features, while the cost function searches for the optimal weight parameters that minimize the total accumulated error. With no prior knowledge about the complex relationship between the molten pool data and geometric properties, a non-linear SVR with a linear kernel function is used for mapping the input to the output. MATLAB version 2020b is used to implement the CNN and SVR models.

2.5 Prediction evaluation metric

Accurate prediction of geometric properties requires the network to be optimized in terms of structure, input, output, and generalization capability. However, adding more layers to the CNN does not always yield higher-quality features, because the model can degrade as the network deepens and the gradient explodes or vanishes. Training the CNN updates the network's weights using gradients calculated by back-propagation. A poorly structured CNN architecture causes the gradient to vanish or explode, which lowers the model's accuracy and slows learning. Thus, the CNN structure is optimized in terms of layer architecture and hyperparameters to suit the training dataset for the input–output combination. The m-CNN using the image and thermal profile as input is trained to predict the build geometric properties of the single-bead deposition.

The designed CNN architecture may return high accuracy for a specific training and testing dataset. However, the performance can deteriorate on the test samples if the training dataset does not capture the system dynamics. Cross-validation is a popular technique for dealing with accuracy variations that result from an uneven distribution of data between the training and testing sets. The current work uses sixfold cross-validation, where the overall 12-build dataset is divided into six parts, with five parts used for training and the remaining part for testing. The process is repeated six times so that each part is used for testing once, and the generalization capability of the model is evaluated from these six runs. A total of 5500 image and temperature data from 11 builds are used for training the network and tested on the 500 unseen samples from one build.
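The mechanics of a build-wise k-fold split can be sketched as below. How builds were assigned to folds is not stated in the text, so the contiguous grouping, the build numbering, and the function names are assumptions for illustration only; the sketch shows a generic six-way partition of 12 build IDs.

```python
def build_folds(build_ids, num_folds=6):
    """Split build IDs into `num_folds` contiguous, near-equal groups (assumed grouping)."""
    if num_folds < 2 or num_folds > len(build_ids):
        raise ValueError("num_folds must be between 2 and the number of builds")
    base, extra = divmod(len(build_ids), num_folds)
    folds, start = [], 0
    for f in range(num_folds):
        size = base + (1 if f < extra else 0)
        folds.append(build_ids[start:start + size])
        start += size
    return folds


def cross_validation_splits(build_ids, num_folds=6):
    """Yield (train_builds, test_builds) so that each fold is held out exactly once."""
    folds = build_folds(build_ids, num_folds)
    for held_out in range(num_folds):
        test = folds[held_out]
        train = [b for f, fold in enumerate(folds) if f != held_out for b in fold]
        yield train, test


# Hypothetical build numbering 1..12.
for train, test in cross_validation_splits(list(range(1, 13))):
    print("test builds:", test, "| train builds:", train)
```

Splitting by build rather than by individual image keeps all frames of a held-out build unseen during training, which matches the intent of testing on unseen samples.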

The prediction performance is compared using the following evaluation metrics: root mean square error (RMSE) and relative percentage error (RE) between the actual and predicted properties, given in Eqs. (4) and (5), respectively. \({y}_{i}\) is the measured property value, and \({{y}_{i}}^{\mathrm{^{\prime}}}\) is the model-predicted property value. Normalized RMSE (NRMSE) is another error metric generally used for comparison between properties with different scaling, as given in Eq. (6). The RMSE is normalized by 1.6483 mm for bead height, 3.6976 mm for bead width, 0.7970 mm for fusion zone depth, and 26.7311 mm² for fusion zone area. The standard deviation (SD) across the sixfold cross-validation is also evaluated for comparison.

$$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({{y}_{i}}^{\mathrm{^{\prime}}}-{y}_{i}\right)}^{2}}$$
(4)
$$RE=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{{{y}_{i}}^{\mathrm{^{\prime}}}-{y}_{i}}{{y}_{i}}\right|\times 100\mathrm{\%}$$
(5)
$$NRMSE=\frac{RMSE}{\left({y}_{max}-{y}_{min}\right) }\times 100\mathrm{\%}$$
(6)
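Eqs. (4) to (6) translate directly into code. The following is a minimal sketch with hypothetical function names and sample values; the optional `value_range` argument lets a fixed normalizer be supplied in place of the range computed from the data.

```python
import math


def rmse(measured, predicted):
    """Root mean square error, Eq. (4)."""
    n = len(measured)
    return math.sqrt(sum((p - y) ** 2 for y, p in zip(measured, predicted)) / n)


def relative_error(measured, predicted):
    """Mean relative percentage error, Eq. (5). Measured values must be non-zero."""
    n = len(measured)
    return sum(abs((p - y) / y) for y, p in zip(measured, predicted)) / n * 100.0


def nrmse(measured, predicted, value_range=None):
    """RMSE normalized by the range of the measured values, Eq. (6), in percent."""
    if value_range is None:
        value_range = max(measured) - min(measured)
    return rmse(measured, predicted) / value_range * 100.0


# Hypothetical measured and predicted values in mm, for illustration only.
measured = [2.0, 3.0, 4.0]
predicted = [2.5, 3.0, 3.5]
print(rmse(measured, predicted))
print(relative_error(measured, predicted))
print(nrmse(measured, predicted))
```

Because NRMSE divides by the spread of each property, it allows errors in millimeters and square millimeters to be compared on one percentage scale.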

The m-CNN for geometric property prediction is trained using mean square error as the loss function between the actual and predicted geometric properties. The quantitative metric for validating the testing performance is the prediction accuracy comparison for the trained network. Another evaluation criterion is the convergence achieved by the gradient of the loss function. The network’s weight is altered to minimize the loss function by lowering the mean square error. The hyperparameters of the architecture affecting the convergence and back-propagation are the momentum, epoch, batch size, learning rate, and velocity. Table 3 shows the values for optimized hyperparameters used for training the geometric CNN model.

Table 3 Optimized hyperparameter values for the CNN-based S-G model architecture

3 Results and discussion

3.1 Feature extraction and selection

Feature analysis is crucial for determining which sensing data are relevant, so that the network can be trained efficiently for geometric property prediction. The real-time sensing data used for Pearson correlation are the MP dimensional parameters, i.e., width and length, and the three temperature measurements. For the SVR model, the MP dimensional information is extracted from the video by image-by-image analysis. The flowchart for the width and length measurement is presented in Fig. 4a. The images from the camera are accessed frame by frame, and the red plane is extracted as a reference. Thresholding is applied to the extracted red plane to remove noisy data. The next step is edge detection, in which the thresholded image is filtered to detect the edges. There are five basic types of filters, i.e., Roberts, Prewitt, Sobel, differentiation, and gradient, each suitable for a specific application [35]. The Roberts filter was applied because it gave the most accurate representation of the molten pool. Once the edges are extracted, the region of interest is defined for width and length measurement. Figure 4b shows the extracted width and length in millimeters for one single-bead deposition experiment. The extracted width and length are used along with the MP temperature to predict the characterization data using the SVR model, which serves as a baseline.
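The measurement steps described above (red plane extraction, thresholding, Roberts edge detection, extent measurement) can be sketched on a toy image. The threshold level, the pixel-to-millimeter scale, the choice of rows as the length direction, and all function names are assumptions for illustration; the actual processing in this work was done with the NI vision tools.

```python
def red_plane(rgb_image):
    """Keep only the red channel of an image stored as rows of (r, g, b) tuples."""
    return [[pixel[0] for pixel in row] for row in rgb_image]


def threshold(gray, level):
    """Binarize: 1 where intensity >= level, else 0."""
    return [[1 if v >= level else 0 for v in row] for row in gray]


def roberts_edges(binary):
    """Roberts cross operator: mark a pixel when either diagonal difference is non-zero."""
    rows, cols = len(binary), len(binary[0])
    edges = [[0] * cols for _ in range(rows)]
    for i in range(rows - 1):
        for j in range(cols - 1):
            gx = binary[i][j] - binary[i + 1][j + 1]
            gy = binary[i][j + 1] - binary[i + 1][j]
            if gx != 0 or gy != 0:
                edges[i][j] = 1
    return edges


def pool_extent(edges, mm_per_pixel=1.0):
    """Return (width, length) of the outlined region from its edge map.

    The operator marks the top-left pixel of each 2 x 2 window that straddles the
    boundary, so for a region not touching the image border, max - min of the edge
    coordinates equals the region's extent in pixels. Columns are taken as width
    and rows as length (an assumed orientation).
    """
    coords = [(i, j) for i, row in enumerate(edges) for j, v in enumerate(row) if v]
    if not coords:
        return 0.0, 0.0
    rows = [i for i, _ in coords]
    cols = [j for _, j in coords]
    return (max(cols) - min(cols)) * mm_per_pixel, (max(rows) - min(rows)) * mm_per_pixel


# Toy 6 x 6 frame: a bright 2-row by 3-column blob on a dark background.
frame = [[(10, 0, 0)] * 6 for _ in range(6)]
for i in (2, 3):
    for j in (1, 2, 3):
        frame[i][j] = (200, 40, 20)

edges = roberts_edges(threshold(red_plane(frame), level=128))
print(pool_extent(edges, mm_per_pixel=0.5))
```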

Fig. 4 Image processing for molten pool dimension extraction: a flowchart for dimension measurement, b molten pool width and length plot for single-bead deposition

Molten pool dimensional parameters are extracted from the image data as described above. The thermal data include the leading edge (LE), molten pool (MP), and trailing edge (TE) temperatures. The geometric properties used for correlation analysis are bead height, bead width, fusion zone depth, and fusion zone area. Table 4 presents the Pearson correlation matrix between the sensing data and the geometric properties. As seen from the correlation matrix, both the MP and LE temperatures are highly correlated with the geometric properties. Pearson correlation analysis identifies only linear relationships between the sensing data and the characterization properties. In our previous work [36], MP temperature was shown to improve prediction accuracy compared to LE temperature, probably because of a non-linear relationship with the process parameters and quality properties that Pearson correlation does not capture. Hence, for the CNN architecture design, MP temperature is used in conjunction with MP image data for geometric quality property prediction. The CNN directly uses camera images as input to the network with minimal data preprocessing. For the regression modeling, the MP width and length extracted from the image data are used as input to the model along with the MP temperature. The output properties for model prediction are the four geometric bead parameters.

Table 4 Pearson correlation matrix between the sensing data and characterization parameters

3.2 Accuracy of CNN model and comparison with the regression model

The m-CNN architecture that predicts the geometric parameters with the best accuracy from the image and MP temperature data consists of 24 layers in total, of which the first 21 are used for image feature extraction. The remaining three layers predict the final geometric properties from the image features together with the temperature feature. The network uses batch normalization, a clipped ReLU activation function, and a dropout of 50%. The m-CNN consists of four convolution layers interlaced with two global average pooling layers. Following the second global average pooling, the resulting output is unrolled into a vector and fed into a fully connected layer of dimension 100, which is reduced to a size of 3 before concatenation with the temperature feature. The final layer consists of 4 nodes, one for each of the four predicted geometric parameters. Figure 5 shows the CNN architecture, where the image features are extracted and concatenated with the temperature data for geometric property prediction. The S-G model is trained for 10 epochs using 5500 training samples and 500 unseen test samples of image and temperature data collected during 13 builds for the molten pool condition. The m-CNN-based S-G model uses sixfold cross-validation, and the results discussed below are the average errors on the unseen test datasets of the six folds.
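The fusion step at the end of the network, where three image features are concatenated with one temperature feature and mapped to four outputs, can be illustrated with a small sketch. The weights, biases, input values, and function names below are hypothetical placeholders, not trained parameters, and any scaling of the temperature input is left out.

```python
def fully_connected(inputs, weights, biases):
    """Dense layer: out[k] = sum_i weights[k][i] * inputs[i] + biases[k]."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def fuse_and_predict(image_features, mp_temperature, weights, biases):
    """Concatenate 3 image features with 1 temperature feature, then map to 4 outputs."""
    fused = list(image_features) + [mp_temperature]
    return fully_connected(fused, weights, biases)


# Hypothetical values: 3 image features, 1 (scaled) temperature, 4 x 4 output layer.
image_features = [1.0, 2.0, 3.0]
mp_temperature = 0.5
weights = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 2.0]]
biases = [0.0, 0.0, 0.0, 1.0]

# Outputs are ordered here as (H, W, D, A); this ordering is an assumption.
print(fuse_and_predict(image_features, mp_temperature, weights, biases))
```

Reducing the image branch to three features before concatenation keeps the single temperature value from being outweighed by a much larger image feature vector.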

Fig. 5 Multi-modality convolutional neural network architecture for geometric property prediction using molten pool images and temperature data. The different layers in the network are Convolutional Layer (CL), Batch Normalization Layer (BNL), Clipped ReLU Activation Layer (CRAL), Global Average Pooling Layer (GAPL), Dropout Layer (DL), Fully Connected Layer (FCL), Concatenation Layer (Con.L), and the final output layer is a FCL with four output geometric properties

The m-CNN model's performance is compared to a regression modeling technique used as the baseline. Figure 6 shows the regression modeling framework for build geometric property prediction. First, the molten pool dimensional information, i.e., width and length, is extracted from the MP image as described in Section 3.1. Then, the extracted dimensional information is used in conjunction with the temperature data as input to the support vector regression model. The outputs of the regression model are the same four build geometric properties used for CNN modeling. Note that both the CNN structure and the regression model are optimized so that their best performance outcomes are compared.

Fig. 6 Sensing-geometry regression model for geometry property prediction using molten pool dimensional information and MP temperature data

This section compares the features extracted from the image data by the CNN with those from the traditional feature extraction method. Table 5 compares the prediction performance of the CNN and the regression modeling technique. For the m-CNN model, the RE is relatively higher for bead height and fusion zone area; this is caused by their lower correlations with the MP geometric and thermal profiles, as indicated by Table 4.

Table 5 Comparison between m-CNN and regression model for predicting the geometry properties using RMSE, RE, NRMSE, and SD

The bead width error is lower for the regression model since molten pool width is used as model input, which is strongly correlated to the printed bead width. The prediction error of fusion zone depth is lower for the regression model. It may result from a relatively stronger coupling effect of fusion zone depth with the molten pool surface length and width than bead height and fusion zone area. Prediction error for fusion zone area is lower for the CNN model as the molten pool image data provide more information than just molten pool width and length for area prediction. The SD for all the geometrical parameter predictions is lower for the CNN model showing that the prediction is clustered around the mean, making the model reliable. Geometric property prediction shows that the performance of the CNN model is overall better than the regression model. Automatically extracted features from CNN provide a more accurate representation of the geometric properties.
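The four metrics used in the comparison can be computed as in the following sketch. The exact definitions are assumptions here, since they are not restated in this section: RE is taken as the mean absolute relative error, NRMSE as the RMSE divided by the mean measured value (normalizing by the range is another common choice), and SD as the population standard deviation of the prediction errors. The sample values are hypothetical and are not taken from Table 5.

```python
import math
import statistics

def error_metrics(measured, predicted):
    # Signed prediction errors, one per sample.
    errors = [p - m for m, p in zip(measured, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # Assumed definition: mean absolute error relative to the measured value.
    re = sum(abs(e) / abs(m) for e, m in zip(errors, measured)) / len(errors)
    # Assumed normalization: RMSE divided by the mean of the measured values.
    nrmse = rmse / statistics.fmean(measured)
    # Assumed definition: spread of the errors around their mean.
    sd = statistics.pstdev(errors)
    return {"RMSE": rmse, "RE": re, "NRMSE": nrmse, "SD": sd}

# Hypothetical bead heights in mm, for illustration only.
measured = [2.0, 4.0]
predicted = [2.2, 3.8]
metrics = error_metrics(measured, predicted)
print(metrics)
```

Under these definitions, a low SD alongside a similar RMSE indicates that the errors are consistent across samples rather than dominated by a few large misses.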

With the trained ML models’ prediction, the comprehensive geometry prediction framework can estimate the bead geometric parameters using the molten pool condition. Figure 7 shows the CNN model’s predictions for the 12-build dataset specified in Table 2. The four geometric parameters are estimated and compared against the measured values. As seen from Fig. 7, the predictions largely fall within the 5% and 10% tolerance bands (TB) for all the geometric parameters. The bead height prediction is worse than that of the remaining three geometric parameters, which is in accordance with the NRMSE values in Table 5. The main reason for the worse bead height performance is that one data point sits outside the tolerance band, with a measured value of 2.93 mm and a predicted value of 3.74 mm. The corresponding experiment is #6 in Table 2, with the process parameter combination LP = 4500 W, TS = 10 mm/s, WFR = 50.8 mm/s, and HWP = 300 W. Its travel speed of 10 mm/s is much higher than that of the other training experiments. The considerable error likely occurred because this point lies in a region of the process space with fewer nearby experimental data points.

Fig. 7

Summary of m-CNN geometric model measured vs. predicted value for bead height, bead width, fusion zone depth, and fusion zone area. Uncertainty in prediction is presented using a 5% tolerance band (orange) and a 10% tolerance band (yellow) for the measured value
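The tolerance band check for the bead height outlier of experiment #6 can be reproduced with a short calculation. The sketch assumes that each band is symmetric around the measured value, i.e. a prediction is inside the band when its absolute deviation is at most the stated fraction of the measured value; the function names are hypothetical. Only the two bead height values (2.93 mm measured, 3.74 mm predicted) come from the text.

```python
def relative_deviation(measured, predicted):
    # Absolute deviation of the prediction as a fraction of the measured value.
    return abs(predicted - measured) / abs(measured)

def within_band(measured, predicted, tolerance):
    # Assumed band definition: symmetric, +/- tolerance * measured value.
    return relative_deviation(measured, predicted) <= tolerance

measured_height_mm = 2.93   # experiment #6, measured bead height
predicted_height_mm = 3.74  # experiment #6, predicted bead height

deviation = relative_deviation(measured_height_mm, predicted_height_mm)
in_5_percent = within_band(measured_height_mm, predicted_height_mm, 0.05)
in_10_percent = within_band(measured_height_mm, predicted_height_mm, 0.10)
print(f"deviation = {deviation:.1%}, 5% band: {in_5_percent}, 10% band: {in_10_percent}")
```

With these numbers the deviation is roughly 28% of the measured height, well outside both bands, which is why this single point weighs heavily on the bead height error.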

The CNN-based S-G model can be used in real-time for in-process, comprehensive bead geometry property estimation. The model can take in the sensing data from the camera and pyrometer to predict the four post-process geometry properties in real-time. The CNN learns intricate molten pool dimensional features directly from image data apart from just the MP width and length used as input to the SVR model. Thus, the convolutional network’s capability to automatically map input image features to the output geometric property prediction is advantageous.

3.3 CNN model validation using experimental data

To verify the utility of the trained CNN model, the model is validated using one set of experimental data. The process parameter combination for validation is LP = 4500 W, TS = 5 mm/s, WFR = 50.8 mm/s, and HWP = 300 W. The experimental characterization of the four geometric parameters is shown in Fig. 8, along with the deposited bead. The quantitative m-CNN model validation results are summarized in Table 6.

Fig. 8

Experimental geometric properties for the process parameter combination: (a) the single-bead deposition; (b) the cross-section of the printed bead used for geometry parameter measurement

Table 6 The experimental and CNN model summary for validation of the geometric parameters

Table 6 shows the combination of process parameters and the corresponding bead height, bead width, fusion zone depth, and fusion zone area obtained from experimental characterization. The molten pool image and temperature data for the corresponding process parameters are used as input to the CNN model for geometric parameter prediction. The predicted geometric parameters are in close agreement with the experimental data, considering the corresponding standard deviation values. The percentage error in prediction is 14% for bead height, 14% for bead width, 8% for fusion zone depth, and 9% for fusion zone area. The prediction error is consistent with the sixfold cross-validation results detailed in Table 5, where the worst performance is also seen for the bead height parameter. In addition, the Pearson correlation matrix in Table 4 shows that bead height has the lowest correlation values, 0.24, 0.16, and 0.08, with the MP width, length, and temperature, respectively.

3.4 Property variations against molten pool shape and temperature

The effect of process parameter changes on the thermal profile governs the final build characterization properties [37, 38]. The effect of temperature data is studied by comparing the prediction performance of models with different thermal data as input. For all the models, molten pool image data is used as input to the m-CNN model along with one of three individual temperature measurements. Figure 9 depicts the performance comparison of the three CNN models using RMSE, SD, and NRMSE. The Pearson correlation analysis presented in the “3.1” section identifies the highest correlation of the MP and LE temperature with the geometric properties. The SD for the model developed using image and MP temperature is lower than for the models implemented using leading and trailing edge temperature. A lower SD is preferred in terms of model robustness and reliability. This shows that the temperature where the laser meets the metal substrate contains relevant information useful for geometric property prediction, and that both linear and non-linear correlations between the MP temperature and the bead geometry exist. Pearson correlation only shows the linear correlation, so it cannot be used as the sole criterion for feature selection. In contrast, the CNN automatically captures the non-linear correlation, which can improve prediction performance.

Fig. 9

RMSE, NRMSE, and SD comparison for m-CNN S-G models with different temperature measurements as input along with image data
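The limitation of Pearson correlation noted above can be shown with a small synthetic example. The data below are invented purely for illustration and have no relation to the measured temperatures or geometries: one output depends linearly on the input and one depends on it quadratically. The coefficient is close to 1 in the first case and close to 0 in the second, even though the second output is fully determined by the input.

```python
import math

def pearson(x, y):
    # Pearson correlation coefficient: covariance scaled by both spreads.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Synthetic, symmetric input values (not experimental data).
x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]   # purely linear dependence
y_quadratic = [v * v for v in x]    # purely non-linear dependence

r_linear = pearson(x, y_linear)
r_quadratic = pearson(x, y_quadratic)
print(r_linear, r_quadratic)
```

A feature with a near-zero Pearson coefficient can therefore still carry predictive information, which is consistent with using domain knowledge alongside the correlation matrix when selecting model inputs.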

4 Conclusion

The current work presents a multi-modality CNN architecture for establishing the correlation of the real-time sensing data to the final bead geometric properties. A sensing and data acquisition system is designed for the WLAM system for experimental data collection under controlled process parameters. The data is collected for single-bead deposition, which is characterized to calculate the bead geometric properties. The geometric properties used in this study include bead height, bead width, fusion zone depth, and fusion zone area. Molten pool sensing data and geometric properties are analyzed using Pearson correlation to identify relevant and meaningful features for modeling. Based upon the correlation matrix and domain knowledge, MP image and MP temperature are used as input for the CNN modeling. The model is trained to comprehensively predict the four build geometric properties, which are indicators of final build quality. The S-G model’s prediction error is 8.04%, 4.53%, 4.14%, and 8.74% for bead height, bead width, fusion zone depth, and fusion zone area, respectively. The performance of the m-CNN model is compared with a regression modeling approach. The regression model uses the MP dimensional length and width features instead of the raw MP image data for geometric property prediction. It is observed that the model trained using raw image data provides more information for geometry estimation than the MP dimensional features extracted with traditional methods. The trained S-G model enables real-time prediction of the bead geometric properties using the sensing data from the system. The designed CNN model can be used as an in situ quality control framework for monitoring and controlling the bead geometry in real-time.

This paper has been focused on validating the proposed CNN framework that can perform in-process bead geometry property prediction for a laser wire-feed DED system. We have shown that incorporating temperature profiles as an external feature to the molten pool image improves prediction performance compared to the image only CNN approach. Going forward, follow-up research would be to test and evaluate the generalizability of the built models in a larger machine process operational range. This is feasible and captured in the machine knowledge transfer through Bayesian networks for the new machines and new process setting range in the authors’ other work [39]. In [39], a machine learning model is built with selected machines and tested in the untrained process parameter domain on different machines for hardness and density prediction. More broadly, multi-objective optimization, active experiment design, and in-process quality controlling strategies are active areas of study, such as acceleration of process optimization and multi-property design quality assurance [8, 39, 40] by leveraging the prediction power of the ML models. This paper serves as a starting point that can be expanded in subsequent active experiments designed to improve prediction accuracy and accelerate process optimization.

In addition, this paper has validated the feasibility of an in-process CNN property prediction framework under a single-bead deposition scenario. The authors will extend the model to predict the bead properties in multilayer building scenarios, such as angled wall and tee geometry structures, for broader applications in real practice. Accurate single-bead geometry prediction is given top priority in this paper because it underpins the more complex properties desired in high-quality printed parts, such as surface smoothness, part shape distortion, thermal/stress accumulation, defect formation, and associated mechanical property variations. The multi-modality CNN method is expected to outperform the single modality CNN method when handling more complex geometries. Once enough data is available from the ongoing work on multilayer printing, we will expand the current work to make part geometry predictions. The multilayer printing will also consider additional geometric and mechanical properties such as geometry distortion and dislocation stress–strain evaluation arising from the previously deposited layer.