1 Introduction

Additive manufacturing (AM) builds a three-dimensional object by depositing material layer upon layer under computer control [1]. AM is tightly governed by process controls that have minimised the dependency on skilled labour and human intervention, as envisioned in Industry 4.0. Owing to its low cost, high speed, and process simplicity, fused deposition modelling (FDM) is one of the most widely demanded AM techniques [2]. However, its use for building functional parts is restricted by irregular surfaces, weak mechanical properties, a visibly layered appearance, and inadequate accuracy [3]. Experimental techniques have therefore been employed to investigate the effects of FDM process variables on the geometrical properties of the manufactured parts [4]. A benchmark geometry establishes a common baseline for comparing and fine-tuning various procedures. It has specific characteristics and dimensions that ensure the operating capabilities are thoroughly assessed [5, 6]. A benchmark part has been developed to study the in-plane consistency of FDM machines and to showcase the impact of shrinkage, nozzle temperature, and build speed on the accuracy of prototyped components [7]. Identifying the important factors and determining the optimum process parameters improves the quality of the fabricated components in terms of dimensional deviation.

Machine learning (ML) involves building and studying systems that can automatically learn patterns from data. According to research, ML-aided models are capable computational technologies that allow AM processes to attain high-quality standards, prediction, performance optimisation, product consistency, optimum process response, classification, regression, or forecasting [8]. Building an ML model leads to a statistical regression equation that predicts the output from the input values. Hence, the data used to train the ML model are the most important aspect in determining its effectiveness [9].

This paper presents a framework for using machine learning to forecast the geometrical dimensions and dimensional variation of parts fabricated by FDM. The benchmark part was fabricated by varying the FDM process parameters according to an L9 orthogonal array, and the dimensional deviation of individual features was evaluated with a coordinate measuring machine (CMM). ML algorithms were then employed to predict the dimensional deviation of the individual features.

2 Experiment

2.1 Design

To generate the input dataset for the proposed ML model, parts with multiple geometric features of different dimensions must be fabricated. For this purpose, the NIST benchmark [10] was taken as the starting point, and an altered version was designed in SOLIDWORKS, as shown in Fig. 1. The features incorporated in the benchmark component are described in Table 1.

Fig. 1 Benchmark components designed in CAD software

Table 1 Benchmark model features description

2.2 Fabrication

Using acrylonitrile butadiene styrene (ABS) material, all the benchmark components were fabricated on an FDM machine (Accucraft i250+). The fixed parameters, namely infill, extrusion width, bed temperature, filament diameter, and infill style, were set to 100%, 0.5 mm, 80 °C, 1.75 mm, and straight, respectively. Nozzle temperature, layer thickness, and nozzle speed were taken as the variable factors, and the Taguchi L9 orthogonal array (Table 2) was chosen to provide a non-redundant combination of factor levels.

Table 2 L9 Orthogonal array, 3^3
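The run layout of an L9 design for three three-level factors can be sketched as follows. This is a minimal illustration that uses the first three columns of the standard L9 array; the level indices 1 to 3 are placeholders, since the actual nozzle temperature, layer thickness, and nozzle speed settings are those given in Table 2, and the helper names `l9_array` and `is_orthogonal` are hypothetical.

```python
from itertools import combinations

# Factor names come from the text; the level indices 1-3 are placeholders,
# not the actual settings used in the study.
FACTORS = ("nozzle_temperature", "layer_thickness", "nozzle_speed")


def l9_array():
    """Return the first three columns of the standard L9 array as level indices 1-3."""
    return [(i + 1, j + 1, (i + j) % 3 + 1) for i in range(3) for j in range(3)]


def is_orthogonal(rows):
    """Check that every pair of columns contains all 9 level pairs (each exactly once)."""
    n_cols = len(rows[0])
    for a, b in combinations(range(n_cols), 2):
        pairs = {(row[a], row[b]) for row in rows}
        if len(pairs) != 9:
            return False
    return True


runs = l9_array()
for run_no, levels in enumerate(runs, start=1):
    print(run_no, dict(zip(FACTORS, levels)))
```

Nine runs cover every pairwise combination of levels once, whereas a full factorial design of the same three factors would need 27 runs.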

A contact-type coordinate measuring machine (CMM) was used to determine the dimensions of each geometrical feature of the fabricated benchmark components (Fig. 2). Each experimental run yielded 79 measurements, each taken as three repeated readings, and the averaged dimensions served as an input dataset of size 79 × 9 (in .csv format) for ML.
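As an illustration of how such a dataset might be assembled, the sketch below averages three repeated readings into a 79 × 9 matrix and then unrolls it into one row per feature and run. All numbers are randomly generated placeholders rather than the study's measurements, the coded levels 1 to 3 stand in for the real process settings, and the long-format layout with columns X, T, L, S, Y is an assumption about how the regression input could be organised.

```python
import numpy as np

N_FEATURES, N_RUNS, N_REPEATS = 79, 9, 3

rng = np.random.default_rng(0)

# Placeholder data: nominal sizes and noisy repeated readings (NOT the study's measurements).
nominal = rng.uniform(2.0, 100.0, size=N_FEATURES)
noise = rng.normal(0.0, 0.1, size=(N_FEATURES, N_RUNS, N_REPEATS))
readings = nominal[:, None, None] + noise

# Average the three repeated readings: one value per feature and run (79 x 9).
averaged = readings.mean(axis=2)

# Hypothetical coded levels (1-3) for T, L, S in each of the nine runs.
run_levels = np.array([(i + 1, j + 1, (i + j) % 3 + 1) for i in range(3) for j in range(3)])

# Long format: one row per (feature, run) with columns X, T, L, S, Y.
x_col = np.repeat(nominal, N_RUNS)            # nominal dimension of each feature
tls = np.tile(run_levels, (N_FEATURES, 1))    # process settings of each run
y_col = averaged.reshape(-1)                  # averaged measured dimension
table = np.column_stack([x_col, tls, y_col])

print(averaged.shape, table.shape)
```

The averaged matrix has shape (79, 9), and the long-format table has 711 rows and 5 columns, which is the form a regression on design dimension and process settings would consume.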

Fig. 2 Fabricated benchmark components

2.3 Machine Learning

Python 3.8 was chosen as the programming language because of its wide range of open-source libraries such as NumPy, SciPy, scikit-learn, and Matplotlib. Jupyter Notebook was used as the development environment to build the regression algorithm. Building the model involves selecting the independent variables (X's) from the input files to arrive at a transformation function (the regression equation) that estimates the dependent variable Y (the predicted value) [11].

The steps carried out to build the ML model are as follows:

  • Data Exploration: The measured CMM data were examined to understand their structure. Based on the type of dimensional accuracy to be measured, the geometrical feature dimensions were divided into six categories: length, width, height, diameter, angle, and thin slots. Understanding the data fields (features) and their interactions is essential for managing the data that go into the model, as they form the input key performance indicators (KPIs) representing the data. Analysing and plotting the various fields helps to determine whether the data are homogeneous or correlated, and whether some fields should be dropped to make the dataset better suited to the analysis. Primary details such as data type and shape, and basic statistics such as percentiles, mean, and standard deviation, were studied. A boxplot of the error percentage was used to find outliers for the corresponding features in the components, using the length of the boxplot whiskers.

  • Data Processing: The data were prepared after analysis, and some of the measured values were found to be out of range. These values were treated as outliers and dropped. Feature scaling was applied to prevent the machine learning algorithm from implicitly assigning larger weights to variables with larger values. Of the two predominant techniques, min–max normalisation and standardisation, min–max normalisation was used, rescaling the independent variables to a preset range of 0 to 1. The complete dataset was then split into training and test datasets in a ratio of 7:3.

  • Baseline Modelling, Solution Modelling, and Refinement: A baseline linear regression model set the standard against which the other machine learning models were tested, and further regression models, namely Lasso, Ridge, random forest, and Extreme Gradient Boosting (XGBoost), were built. Then, using the voting regression model, an ensemble meta-estimator, several base regressors were fitted and their individual predictions averaged to form a final prediction. The optimised solution model was finally built by tuning hyperparameters to improve performance.

  • Model evaluation and validation: Ready-made libraries make it easy to use complex model components with only minor modifications to suit the requirement. However, preparing the datasets, running the model, interpreting the results, and iterating on or refining the models depend on the choice of algorithm, the computational power of the machine used, and the user's expertise. The outcomes were interpreted and evaluated by determining the accuracy, root-mean-square error, Pearson's coefficients, and intercept.
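The data processing step in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the data are synthetic placeholders, the 1.5 × IQR whisker rule is the conventional boxplot default rather than a value given in the text, the helper names are hypothetical, and the scaler is fitted on the training rows only, which is a design choice made here to keep test data out of the scaling parameters.

```python
import numpy as np


def iqr_outlier_mask(values, k=1.5):
    """True where a value lies outside the boxplot whiskers (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


def min_max_scale(train, test):
    """Rescale columns to [0, 1] using the minimum and range of the training rows only."""
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return (train - lo) / span, (test - lo) / span


rng = np.random.default_rng(42)

# Placeholder rows [X, T, L, S] and measured Y; not the study's data.
n = 200
features = np.column_stack([
    rng.uniform(2, 100, n),      # X: nominal dimension in mm
    rng.choice([1, 2, 3], n),    # T: coded nozzle temperature level
    rng.choice([1, 2, 3], n),    # L: coded layer thickness level
    rng.choice([1, 2, 3], n),    # S: coded nozzle speed level
])
measured = features[:, 0] + rng.normal(0, 0.1, n)
measured[:3] += 5.0  # inject three gross errors to be caught as outliers

# 1. Drop rows whose error percentage falls outside the whiskers.
error_pct = 100 * (measured - features[:, 0]) / features[:, 0]
keep = ~iqr_outlier_mask(error_pct)
features, measured = features[keep], measured[keep]

# 2. Shuffle and split the remaining rows 70:30.
order = rng.permutation(len(measured))
cut = int(0.7 * len(measured))
train_idx, test_idx = order[:cut], order[cut:]

# 3. Min-max scale the independent variables to the range 0 to 1.
X_train, X_test = min_max_scale(features[train_idx], features[test_idx])
y_train, y_test = measured[train_idx], measured[test_idx]

print(len(y_train), len(y_test))
```

Because the scaling parameters come from the training rows, a test value can fall slightly outside the 0 to 1 range; scaling the full dataset before splitting avoids that at the cost of leaking test information into the model inputs.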

Assuming the data to be linear, the first preference was to apply linear regression to the training dataset, owing to its comprehensibility and simple application. The model was then run on the test dataset to compare the predictions with the actual values. Regression Eq. (1) relates the input parameters to the output parameter:

$$Y = a_{1} X + a_{2} T + a_{3} L + a_{4} S + b$$
(1)

where a1, a2, a3, and a4 are the regression coefficients, and b is the intercept.
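A minimal sketch of fitting Eq. (1) by ordinary least squares is given below. The data are synthetic and generated from made-up coefficients (`true_coef`, `true_b`), so the fitted numbers bear no relation to the equations in Table 4; NumPy's `lstsq` is used here for transparency and is not necessarily the solver used in the study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data generated from known coefficients (placeholders, not Table 4 values).
n = 120
X = rng.uniform(2, 100, n)               # design dimension
T = rng.choice([1.0, 2.0, 3.0], n)       # coded nozzle temperature level
L = rng.choice([1.0, 2.0, 3.0], n)       # coded layer thickness level
S = rng.choice([1.0, 2.0, 3.0], n)       # coded nozzle speed level
true_coef = np.array([1.002, 0.01, -0.02, 0.005])  # hypothetical a1..a4
true_b = 0.05                                      # hypothetical intercept
Y = true_coef @ np.vstack([X, T, L, S]) + true_b + rng.normal(0, 0.01, n)

# Design matrix with a column of ones so the last fitted weight is the intercept b.
A = np.column_stack([X, T, L, S, np.ones(n)])
weights, *_ = np.linalg.lstsq(A, Y, rcond=None)
a1, a2, a3, a4, b = weights


def predict(x, t, l, s):
    """Evaluate Eq. (1) with the fitted coefficients."""
    return a1 * x + a2 * t + a3 * l + a4 * s + b


print(f"Y = {a1:.4f} X + {a2:.4f} T + {a3:.4f} L + {a4:.4f} S + {b:.4f}")
```

Since the measured dimension is expected to track the design dimension closely, a1 should be near 1, and the remaining terms capture the small systematic shifts caused by the process settings.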

The algorithm fits the dataset using four input parameters: the design dimension of the feature (X), and the nozzle temperature (T), layer thickness (L), and nozzle speed (S) used while printing the benchmark component. These input parameters are related to the output parameter, i.e. the measured dimension of the feature (Y). Linear regression, Lasso, Ridge, random forest, XGBoost, and voting regression were used to create models, and their performance metrics (accuracy, Pearson coefficient, intercept) were compared to find the best prediction of the dependent variable.
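The model comparison can be sketched with scikit-learn as shown below. This is an illustrative setup on synthetic placeholder data, with hyperparameters (`alpha`, `n_estimators`) chosen arbitrarily rather than taken from the study; XGBoost is left out because it lives in a separate package, so the voting ensemble here averages only four base regressors.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)

# Placeholder inputs [X, T, L, S] and output Y; not the study's measurements.
n = 300
inputs = np.column_stack([rng.uniform(2, 100, n), rng.integers(1, 4, (n, 3))])
target = (1.001 * inputs[:, 0] + 0.02 * inputs[:, 1]
          - 0.01 * inputs[:, 2] + rng.normal(0, 0.05, n))

# 70:30 split, matching the ratio stated in the text.
X_train, X_test, y_train, y_test = train_test_split(
    inputs, target, test_size=0.3, random_state=0
)

base_models = [
    ("linear", LinearRegression()),
    ("ridge", Ridge(alpha=1.0)),
    ("lasso", Lasso(alpha=0.01)),
    ("forest", RandomForestRegressor(n_estimators=100, random_state=0)),
]
# The voting regressor refits clones of the base models and averages their predictions.
voter = VotingRegressor(estimators=base_models)

scores = {}
for name, model in base_models + [("voting", voter)]:
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # R-squared on the held-out 30%

for name, r2 in scores.items():
    print(f"{name:>7}: R2 = {r2:.5f}")
```

On data that are close to linear, as assumed in the text, the tree-based and ensemble models have little room to improve on the linear baseline, which is one reason a baseline model is worth fitting first.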

3 Results and Discussion

The measured values were segregated and grouped by feature according to Table 1, and the summarised CMM measurements are tabulated in Table 3.

Table 3 Summarized CMM measurement of features listed in Table 1

The deviation ranges from 0.05 to 0.19 mm along the X-axis (length), from 0.005 to 0.23 mm along the Y-axis (width), and from 0.007 to 0.43 mm along the Z-axis (height). The irregularity in the dimensions along the three axes arises from the incremental movement of the stepper motor or the shrinkage of the ABS material [12]. In the X–Y plane, the maximum deviation occurs for the largest designed dimension of the benchmark, which could be attributed to the warpage and shrinkage of the ABS. The lateral features contribute the maximum dimensional difference along the Z-axis owing to sagging in unsupported areas. Thin walls of 3 mm or less came out larger than designed.

The ML code was executed individually for each of the predicted parameters, and the resulting regression equations are listed in Table 4.

Table 4 Regression equation tabulation

The predictive model's performance was evaluated using the root-mean-square error (RMSE) and R-squared (R²). The prediction models were trained on 70% of the data and evaluated on the remaining 30%. From the results (Table 4), the accuracy of the predicted parameters is above 97%, and the RMSE of the predicted parameters is shown in Fig. 3, which indicates that the predicted values are close to the actual values (Fig. 4).
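For reference, the two metrics can be computed as in the short sketch below. The four actual and predicted values are illustrative numbers only, not results from the study, and the function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error: square root of the mean squared residual."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Illustrative numbers only (mm); not measurements from the study.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [10.1, 19.9, 30.1, 39.9]

print(f"RMSE = {rmse(actual, predicted):.3f} mm")
print(f"R2   = {r_squared(actual, predicted):.5f}")
```

With every prediction off by 0.1 mm, the RMSE is 0.1 mm and R² is about 0.99992. RMSE is expressed in the units of the measurement, while R² is dimensionless, so the two are best read together.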

Fig. 3 R-squared and RMSE value of the predicted parameters

Fig. 4 Actual data vs predicted data

With the above equations (Table 4), output dimensions can be predicted from previously collected data without printing the components. From Fig. 3, we can conclude that the linear regression model gives the better predicted values.

4 Conclusion

A benchmark component consisting of various geometric shapes such as bosses (square, rectangular, cylindrical, concentric cylindrical), holes, inclines, and staircases (positive, negative) was designed. A set of benchmark components was fabricated with varied process settings on the FDM machine. ML techniques were utilised to estimate the dimensional deviation and tolerance of the geometrical features based on the measured dimensions of these fabricated features.

  • Most of the measured dimensions were within 0.3 mm of the nominal dimensions.

  • The deviation along the X-axis and Y-axis ranged from about 0.05 to 0.19 mm and from 0.005 to 0.23 mm, respectively.

  • Unsupported geometric features on the lateral faces contributed to deviation along the Z-axis.

  • A trained ML model could predict the dimensions and deviations of a given geometric feature with an accuracy above 97% without printing the components.

Future work will extend the framework to accommodate more process parameters and to assess surface conformity through computer vision.