Introduction and Purpose

In the manufacturing industry, machine learning is revolutionizing numerous facets of production and product development. One essential area of application is quality control (QC). Over the last century, the ability to detect process shifts has relied heavily on control charts. These charts have provided valuable insights into deviations from standard processes, enabling timely interventions to maintain product quality. With the advent of machine learning algorithms, modern sensor data can now be analyzed in real time to identify process shifts swiftly and accurately. Monitoring this data can detect even subtle deviations from expected norms. This not only enhances product quality but also reduces waste and the need for rework, ultimately contributing to improved efficiency and cost-effectiveness in manufacturing operations.

Refractory coatings are utilized in the foundry industry to form a barrier between the molten metal and the core or mold surface. These coatings can also improve casting surface quality by producing smoother casting surfaces and reducing thermal expansion defects. Monitoring coating thickness is a key aspect of QC, as it helps maintain process consistency and ensures that every casting produced meets the same quality standard. This information can also be used to optimize coating application techniques for better coverage, more uniform thickness, and improved efficiency.1 Coating thickness can be monitored with the help of machine learning. By utilizing vectorized principal component analysis (VPCA), the most relevant features are extracted from the multi-way data. These features are then used in a machine learning model to classify chemically bonded sand specimens based on their coating thickness.

Thermal distortion is the expansion, contraction, and degradation experienced by a mold or core under the extreme heat and liquid pressure of the molten metal.2 This behavior of the chemically bonded sand system is replicated in the laboratory with the help of a thermal distortion tester (TDT). Apart from monitoring the thermo-mechanical properties of green sand for QC, the TDT can also be used to monitor coating thickness and demonstrate the importance of going beyond traditional features.

Overall, integrating machine learning with data generated from existing QC test equipment in the manufacturing environment empowers manufacturers to make data-driven decisions, helping to optimize processes in ways that were not previously possible in the industry.

Literature Review

Chemically bonded sand is widely used in the metalcasting industry, for example in manufacturing powertrains for the U.S. automobile sector and certain aerospace components. Although sand casting is the most often used method, accounting for more than 70% of applications, chemically bonded sand molds are becoming increasingly popular due to their suitability for certain requirements.3 This sand system utilizes chemical binders to strengthen molds and cores. The casting industry faces significant obstacles associated with imperfections in castings manufactured from chemically bonded sand cores and molds. Moreover, there are variations in casting process parameters such as worktime, strip time, pouring temperature, and metallostatic pressure.

Given the industry's focus on near-net-shaped castings, it is crucial to employ advanced QC methods to effectively manage the many sources of variation.4 This section offers a thorough examination of the progressions related to chemically bonded sand in metalcasting, focusing on important factors such as refractory coatings, impact of its thickness on QC, and the use of machine learning for monitoring it.

Chemically Bonded Sand Systems

In the domain of metalcasting technology, cores and molds formed from chemically bonded sand represent a pivotal component, prompting extensive focus on comprehending their interaction with metal, particularly at the interface between the mold and metal. Currently, the metalcasting industry places paramount importance on manufacturing near-net-shape and thin-wall castings, while simultaneously striving to meet increasingly rigorous requirements for dimensional reproducibility.

Chemical binders play a crucial role in the creation of precision sand molds and cores, serving as the primary technology for manufacturing powertrains in the U.S. automotive sector and specific aerospace components. Despite its prominence, the casting industry grapples with significant challenges related to defects in castings originating from chemically bonded sand cores and molds. The decline in quality can be attributed to various factors, including fluctuations in materials like grain size, grain shape, chemical composition, binder concentration, and additives. To address diverse casting requirements, the foundry industry employs a range of mold-making techniques tailored to specific needs while utilizing chemically bonded sand.5

Introduced in the early 1960s, the cold box core process revolutionized manufacturing with its ability to achieve superior compaction, facilitate intricate core designs, and maintain precise dimensional accuracy. Cold box molding is a process where a blend of sand and a curing chemical is injected into a core box at ambient temperature. No-bake molding involves the combination of sand and a liquid resin binder to form a mold or core. This mixture is then allowed to solidify without the need for additional heat. Despite its many advantages, porosity remains a significant drawback in castings produced using this method.6

Hot box molding utilizes a thermosetting resin and curing agent combined with heated core boxes to accelerate the curing process.7 The advent of additive manufacturing has introduced the innovative technique of 3D-printed molding, which allows for the intricate shaping of detailed forms by layering sand-like material. Injection molding is a process that involves injecting molten material into a mold cavity, where it then solidifies. The selection of each method is determined by considerations such as the complexity of the part, the level of precision needed, and the material requirements. This demonstrates the wide variety of mold-making techniques available in the industry. Furthermore, variations manifest in casting process parameters like worktime, strip time, pouring temperature, and metallostatic pressure. As the industry increasingly emphasizes the production of near-net-shaped castings, the development of advanced QC approaches becomes imperative to address the diverse range of variation sources.5

In terms of mold materials, three main types exist: metal dies, sand molds, and ceramic molds. Sand casting, utilizing sand molds, dominates the landscape, accounting for over 70% of all metal castings. Most sand molds and cores are crafted from silica sand, chosen for its widespread availability as a molding component. Silica sand presents advantages such as lower tooling costs, versatility with various metals, and fewer restrictions in part geometry, particularly when contrasted with permanent mold processes.4,8

Manual Ramming

Manual ramming entails the manual packing of chemically bonded sand around a pattern to shape the mold. This technique commonly employs a two-part or three-part sand system, where sand is blended with a chemical binder before being manually compacted around the pattern. Among the binders utilized, resin types like phenolic urethane are prevalent, imparting the sand mixture with essential strength and rigidity upon curing. Despite its efficacy, the manual ramming process demands skilled labor for achieving the desired density and uniformity in the mold.

Core Blowing

Core blowing is a more automated method used to create complex internal shapes within molds. In this process, chemically bonded sand is blown into a core box (the mold used to create the mold's internal cavities) under high pressure. Once the sand fills the core box, a gas (such as carbon dioxide or sulfur dioxide) is passed through the sand, catalyzing the chemical binder and causing it to harden. This method allows for the production of intricate cores with high precision and repeatability. In the core blowing process used for no-bake cores, it is common practice not to recycle the sand in the discharge magazine for repeated blowing, unlike in processes such as cold box or others. Once the blowing is completed, the remaining sand is typically discharged and not reused in the process.9

Gas-Cured Systems

Gas-cured systems represent a specialized category within chemically bonded sand molding, wherein a gas is employed to catalyze the hardening or curing of the binder in the sand mixture. Typically utilized in conjunction with core blowing techniques, gas curing can also be adapted for mold-making processes. Following the placement of sand in the mold or core box, a gaseous catalyst is introduced, swiftly initiating the curing reaction and solidifying the sand. This approach boasts notable efficiency and enables rapid production cycles, as the curing process is notably expedited compared to conventional air drying or baking methods.9

Advantages of Chemically Bonded Molds

Chemically bonded sand molds present distinct benefits compared to conventional green sand molds. These advantages include enhanced precision, accelerated production times, and increased versatility. The inherent rigidity and stability of chemically bonded molds enable the attainment of tighter tolerances and finer details in the resulting castings, thereby ensuring superior dimensional accuracy. Additionally, the adaptability of chemically bonded molds across a diverse array of metals and casting designs underscores their suitability for producing complex and large-scale components, addressing varied industrial requirements. These collective advantages underscore the significance of chemically bonded sand molds as a preferred choice for achieving precision, efficiency, and versatility in modern foundry operations.10

Refractory Coating

The significance of foundry coating in enhancing casting surface quality remains paramount in foundry operations. Applying mold and core washes establishes a robust thermal barrier between the metal and the mold, mitigating thermal shock experienced by the sand system. Such shock often results in surface defects like veining/finning, metal penetration, burn-on/in, scab, rat tail, and erosion. The utilization of coatings effectively minimizes the likelihood of these defects. Foundry coatings are indispensable for achieving high-quality surface finishes in castings, particularly in intricate internal channels created by cores, despite notable advancements in binder and sand technology. While sand particle grading plays a crucial role in casting surface finish, other factors such as gas venting capability, binder economy, and sand availability with the requisite grading necessitate the practical use of coatings in foundry processes.11 Filling a mold with liquid metal subjects its surface to thermal, mechanical, and physicochemical forces. Metal oxides react with mold materials, forming low-melting substances such as silicates that improve lubrication of the quartz sand grains. This facilitates metal penetration into intergranular spaces, causing stubborn mechanical pick-up on casting surfaces.

Due to high mold and core porosity, defect-free castings require protecting surfaces with refractory coatings. Essential coating qualities include minimal porosity, high refractoriness, and mitigation of physicochemical reactions at the metal-coating interface, including lubrication, solution, and penetration. Refractory coatings serve dual purposes: enhancing casting quality and reducing costs. They improve surface quality by creating smoother metal surfaces, achieved by filling the spaces between sand grains or providing a smoother surface to the metal than the mold itself. Additionally, coatings facilitate cleaner sand peeling at shakeout, leading to improved surface finish and elimination of defects like metal penetration, veining, erosion, and sand burn-in. These benefits contribute to overall quality enhancement and cost reduction in casting production processes.12

According to Nwaogu and Tiedje, a refractory coating applied to molds or cores should possess specific attributes for optimal performance which include sufficient refractoriness to endure the poured metal, strong adhesion to prevent spalling, permeability to minimize air entrapment, rapid drying capacity, resistance to blistering, cracking, or scaling during drying, effective suspension and remixing properties, limited degradation of core strength, adequate safeguard against metal penetration, stability during storage, high coverage capability, suitable application properties for the chosen method, and even leveling to reduce runs and tear drops. To achieve these characteristics, the coating typically comprises refractory filler, liquid carrier, rheology-controlling suspension agents, binder agents, and additives.11

Thermal Distortion Tester

A variety of tests characterize the mechanical behavior of the sand system, such as the hot friability test, modified cone jolt test, thermal erosion test, and thermal distortion test (TDT). The TDT helps analyze the thermo-mechanical behavior of chemically bonded sand systems through the multivariate time series data it generates. Apart from thermo-mechanical properties, the time series data can be used to monitor refractory coating thickness without having to perform the traditional destructive test of fracturing the specimen and measuring the coating thickness with optical equipment. The application of directional heating to sand composites, encompassing both mold and core media, induces anisotropic thermal gradients within the materials. Upon contact with molten metal, the transfer of heat from the metal to the sand initiates thermo-chemical reactions, culminating in distortion. To be precise, the binder undergoes thermally induced reactions concurrently with sand expansion and/or plastic deformation, resulting in substantial distortions within the sand core or mold.13,14

In specific chemically bonded systems like organics, reactions typically involve the release of volatile materials. These reactions may encompass potential enhancements in core strength through secondary curing, but they can also lead to core weakening via pyrolysis. It is crucial to acknowledge that when analyzing thermal distortion data, distortions may arise from both the binder and the aggregate.14

The thermal distortion test (TDT), employing a disk-shaped specimen, proves to be an effective method for assessing the thermo-mechanical characteristics, particularly distortions, within chemically bonded sand systems. The TDT apparatus offers versatility with adjustable temperatures, allowing for the replication of mold–metal interfacial temperatures tailored to specific alloys, such as 700°C (1292°F) for aluminum or 1200°C (2192°F) for cast iron. Furthermore, varying loading pressures can be exerted on the specimen to emulate distinct metallostatic pressures acting on the core/mold material.5

The data acquisition system of the TDT captures multivariate time series data consisting of axial displacement, radial displacement, heating element temperature, head pressure, and backside temperature of the specimen. The data is captured at a frequency of 10 observations per second, and the test runs for a total of 90 seconds, giving 900 observations per feature. A cross section of the TDT is shown in Figure 1.
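As a rough orientation (the array and channel names below are illustrative assumptions, not the TDT's actual data format), one run can be thought of as a 900 × 5 array:

```python
import numpy as np

# Illustrative channel names for the five signals recorded by the TDT.
CHANNELS = ["axial_disp", "radial_disp", "element_temp",
            "head_pressure", "backside_temp"]
SAMPLE_RATE_HZ = 10                     # 10 observations per second
DURATION_S = 90                         # 90-second test
N_OBS = SAMPLE_RATE_HZ * DURATION_S     # 900 observations per feature

# Stand-in for one specimen's run: a (900, 5) array of sensor readings.
rng = np.random.default_rng(0)
run = rng.normal(size=(N_OBS, len(CHANNELS)))
print(run.shape)                        # (900, 5)
```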

Figure 1. TDT cross-section.

Vectorized Principal Component Analysis

Recent literature has shown a growing emphasis on optimizing principal component analysis (PCA) algorithms by adopting vectorized implementations, capitalizing on the capabilities of numerical computing libraries. VPCA relies on array-based operations, where mathematical operations are applied to entire matrices or arrays, rather than individual elements. This approach effectively utilizes parallel processing in modern computing architectures, leading to significantly faster computations. The reduction in explicit loops not only simplifies the code but also improves the scalability of PCA algorithms.

Different multi-linear extensions of PCA have been proposed in the literature: some of them are limited to the case of 2D data (and are especially used in image analysis), like 2D-PCA or the generalized PCA, whereas other extensions may be applied to tensors of any order.15,16,17

Multi-way data analysis is the extension of two-way methods to higher-order datasets.18 A two-way dataset may be represented as an N×P matrix, where N is the number of samples and P is the number of variables. In this framework, PCA is a well-understood and widely used multivariate technique to explain the variance–covariance structure through a few linear combinations of the original variables.17 One possible approach to dealing with multi-way arrays involves the ‘matricization’ operation,19 which consists of unfolding the multi-dimensional dataset into a bi-dimensional one. Vectorized principal component analysis (VPCA) was first introduced by Nomikos and MacGregor in 1995 for monitoring batch processes. This technique involves unfolding a three-way array into a two-dimensional matrix where each row represents the vector for an observation.20

Regular principal component analysis (PCA) is then employed on this vector data for feature extraction and analysis of one-dimensional data arrays through a low-rank decomposition strategy. Essentially, PCA seeks to reduce the dimensionality of a complex set of interconnected variables by transforming them linearly into a new set called principal components (PCs). These components are uncorrelated and strategically arranged to preserve most of the original data variation in the initial components.17 VPCA is explained in further detail in the methodology section.
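As a minimal sketch of the unfolding (‘matricization’) step followed by ordinary PCA, assuming a hypothetical three-way array of shape specimens × variables × time points (the shapes and the scikit-learn call below are illustrative, not the authors' implementation):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical three-way array: 38 specimens x 3 variables x 900 time points.
rng = np.random.default_rng(42)
X3 = rng.normal(size=(38, 3, 900))

# Batch-wise unfolding ("matricization"): each specimen's 3 x 900 slice is
# flattened into a single 2700-element row vector.
X2 = X3.reshape(X3.shape[0], -1)        # shape (38, 2700)

# Ordinary PCA on the unfolded two-way matrix.
pca = PCA(n_components=5)
scores = pca.fit_transform(X2)          # shape (38, 5)
print(scores.shape, pca.explained_variance_ratio_.sum())
```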

Machine Learning Model

Supervised classification stands out as a widely performed task within Intelligent Systems, leading to the development of numerous techniques rooted in both Artificial Intelligence (such as Logic-based and Perceptron-based techniques) and Statistics (including Bayesian Networks and Instance-based techniques). The primary objective of supervised learning is to construct a succinct model that characterizes the distribution of class labels based on predictor features. The generated classifier is then employed to assign class labels to testing instances, where the predictor feature values are known but the class label remains unknown.21

There are numerous applications for machine learning (ML), with predictive data mining being one of the most significant. In ML, each instance within a dataset is uniformly represented using a set of features, which can be continuous, categorical, or binary. The learning process is categorized as supervised when instances are provided with known labels (corresponding correct outputs), in contrast to unsupervised learning, where instances lack labels.22

Machine learning models have become indispensable tools across diverse domains, driving advancements in data-driven decision-making. In this context, an overview of some of the most common machine learning models is provided before delving into our model of choice, which is Support Vector Machines (SVMs).

  1. Random Forest

    Random Forest, a popular ensemble learning technique introduced by Breiman, has demonstrated robustness and versatility in various applications. By constructing multiple decision trees and aggregating their outputs, Random Forest excels in tasks such as classification and regression. Its ability to handle large datasets and mitigate overfitting makes it widely adopted in predictive modeling.23

  2. Decision Trees

    Decision Trees, a foundational concept in machine learning, remain relevant due to their interpretability and simplicity. Quinlan's work on the C4.5 algorithm significantly contributed to decision tree modeling. Decision Trees are widely employed in classification tasks, data exploration, and feature selection due to their intuitive representation of decision-making processes.24

  3. K-Nearest Neighbors (KNN)

    KNN, a simple and intuitive algorithm, has proven effective in both classification and regression tasks. Cover and Hart's seminal work introduced the KNN algorithm, which relies on proximity-based decision-making. KNN is valuable for its simplicity, ease of implementation, and adaptability to various types of data.25

  4. Logistic Regression

    Logistic Regression, despite its name, is a widely used model for binary and multiclass classification. Applied in fields such as medical research and social sciences, logistic regression provides a probabilistic framework for decision-making. Hosmer and Lemeshow's work offers an in-depth exploration of logistic regression's applications and statistical underpinnings.26

  5. Support Vector Machines (SVM)

    SVMs are classified as a supervised machine learning technique, focusing on the concept of a "margin"—the region on either side of a hyperplane that separates two data classes. The key principle behind SVMs involves maximizing this margin, aiming to create the widest possible gap between the separating hyperplane and the instances on both sides. This maximization has been demonstrated to reduce an upper bound on the expected generalization error.

In the context of linearly separable data, once the optimal separating hyperplane is identified, data points lying on its margin are termed support vector points. The solution is then represented as a linear combination of only these support vector points, disregarding other data points. Consequently, the model complexity of an SVM remains unaffected by the number of features encountered in the training data. Typically, the SVM learning algorithm selects a small number of support vectors. This characteristic makes SVMs well-suited for learning tasks where the number of features is large in comparison with the number of training instances.

While the maximum margin concept allows SVMs to choose from various candidate hyperplanes, certain datasets may pose challenges, leaving the SVM unable to identify any separating hyperplane. This situation often arises when the data includes misclassified instances. To address this issue, a soft margin approach is employed, permitting some degree of misclassification among the training instances.27 In the context of this paper, an exhaustive grid search was conducted to fine-tune the SVM model's hyperparameters. To identify the most suitable SVM model for the classification task, three distinct strategies were employed: one-vs-one, one-vs-rest, and all-vs-all, of which one-vs-rest provided the highest accuracy. These approaches aimed to leverage the SVM's ability to handle multi-class classification scenarios effectively.
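A brief sketch of a soft-margin SVM trained with a one-vs-rest strategy, using scikit-learn and synthetic stand-in data; the feature matrix, kernel, and hyperparameter values are assumptions for illustration, not the tuned model from this study:

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Synthetic stand-in features (e.g., VPCA scores) and four coating classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(38, 5))
y = rng.integers(0, 4, size=38)

# Soft margin: C trades off margin width against misclassified training
# instances; one-vs-rest fits one binary SVM per coating class.
clf = OneVsRestClassifier(SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X, y)
print(clf.predict(X[:5]))
```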

Methodology

This section of the paper showcases different techniques for classifying refractory coatings based on their thickness. The classification is based on features extracted using VPCA, as opposed to using scalar properties of the disk-shaped specimens. The ability to classify specimens based on their coating thickness enables monitoring the coating of the mold and core. Because the traditional measurement is destructive, most foundries do little coating thickness testing on a routine basis, which is why this approach to determining coating thickness is important.

Experimental Setup

Data Preprocessing

Data was collected for four different zircon-coated specimen types, namely \(1\times 100, 2\times 200, 3\times 300\), and a specimen with no coating. The specimens were dipped for a specific amount of time to achieve the desired thicknesses. The dipping time was determined by dipping test specimens made of round grain silica with furan binder; these cookies were developed using 3D printing. The nomenclature of the coated specimens is such that the first number indicates the coating thickness in mm below the surface of the specimen and the second number indicates the coating thickness in microns above the surface of the specimen. The time series data for radial and axial displacement was plotted, and a sudden drop in radial or axial displacement indicated fracturing of the specimen during the TDT. These specimens were excluded from further analysis to ensure homogeneity of the dataset. Plotting the average axial, radial, and backside temperature curves for all four coating thicknesses together, as shown in Figure 2, served as a crucial step in discerning potential patterns or orderings within the dataset. This graphical analysis aims to reveal any sequential arrangement of the curves associated with each coating type. If an order becomes apparent in at least one of the plots, further in-depth analysis may be deemed unnecessary, as the initial exploration already provides a classification for each coating type.

Figure 2. Combined plots for (a) Axial, (b) Radial, and (c) Backside temperature.

The next step in data preprocessing involves normalizing the data to ensure that the axial and radial displacement and backside temperature features are on the same scale.

Feature Extraction Using VPCA

The normalized data is then used as input for feature extraction using VPCA. The VPCA technique described by Nomikos and MacGregor (1995) is used here. Instead of a three-way array, our data is already in an unfolded state and can be represented as a two-way array. In this array, the 900 observations each of axial displacement, radial displacement, and backside temperature for a specimen are arranged as a 1-D array, i.e., a vector. This results in a \(1\times 2700\) vector for each specimen. After excluding the failed specimens during data preprocessing, a total of 38 specimens across all 4 coating thicknesses were obtained, resulting in an input matrix M of dimension \(38\times 2700\) expressed below:

$$ {\text{M}} = \left[ {\begin{array}{*{20}c} {x_{11} } & {x_{12} } & {...} & {x_{1,2700} } \\ {x_{21} } & {x_{22} } & {...} & {x_{2,2700} } \\ . & . & {...} & . \\ . & . & {...} & . \\ {x_{38,1} } & {x_{38,2} } & {...} & {x_{38,2700} } \\ \end{array} } \right] $$
(1)
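A minimal sketch of assembling the matrix M of Eqn. 1, assuming each specimen's normalized axial, radial, and backside temperature series are available as 900-point arrays (synthetic data stands in for the real measurements):

```python
import numpy as np

# Stand-in normalized series: 38 specimens x 900 observations per signal.
rng = np.random.default_rng(0)
axial = rng.normal(size=(38, 900))
radial = rng.normal(size=(38, 900))
backside = rng.normal(size=(38, 900))

# Each specimen's three series are concatenated into one 1 x 2700 vector;
# stacking the 38 vectors row-wise gives the input matrix M.
M = np.hstack([axial, radial, backside])
print(M.shape)                          # (38, 2700)
```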

The next steps involved in VPCA are similar to that of a traditional PCA and are explained below:

Step 1 Centering the data

The data is centered by subtracting the mean of each feature \(\overline{X}_{i}\) from its observations \(X_{i}\) and is expressed as

$$ D_{i} = X_{i} - \overline{X}_{i} \quad {\text{where }}i{\text{ is the feature index}}. $$
(2)

Step 2 Calculating the covariance matrix of the centered data (C)

The covariance between two features i and j in a dataset can be calculated using the following formula:

$$ \begin{array}{*{20}c} {{\text{cov}}\left( {X_{i} ,X_{j} } \right) = \frac{1}{m}\mathop \sum \limits_{k = 1}^{m} \left( {X_{ki} - \overline{{X_{i} }} } \right) \cdot \left( {X_{kj} - \overline{{X_{j} }} } \right)} \\ \end{array} $$
(3)

Here:

\(X_{i}\) and \(X_{j}\) are the \(i^{\text{th}}\) and \(j^{\text{th}}\) features, respectively.

m is the number of samples in the dataset.

\(X_{ki}\) and \(X_{kj}\) are the values of the \(i^{\text{th}}\) and \(j^{\text{th}}\) features for the \(k^{\text{th}}\) sample.

\(\overline{X}_{i}\) and \(\overline{X}_{j}\) are the means of the \(i^{\text{th}}\) and \(j^{\text{th}}\) features, respectively.

The resultant covariance matrix (C) will be –

$$ \begin{array}{*{20}c} {\left[ {\begin{array}{*{20}c} {{\text{Var}}\left( {X_{1} } \right)} & {{\text{Cov}}\left( {X_{1} ,X_{2} } \right)} & \ldots & {{\text{Cov}}\left( {X_{1} ,X_{2700} } \right)} \\ {{\text{Cov}}\left( {X_{2} ,X_{1} } \right)} & {{\text{Var}}\left( {X_{2} } \right)} & \ldots & {{\text{Cov}}\left( {X_{2} ,X_{2700} } \right)} \\ . & . & {...} & . \\ . & . & {...} & . \\ {{\text{Cov}}\left( {X_{2700} ,X_{1} } \right)} & {{\text{Cov}}\left( {X_{2700} ,X_{2} } \right)} & \ldots & {{\text{Var}}\left( {X_{2700} } \right)} \\ \end{array} } \right]} \\ \end{array} $$
(4)

The covariance calculation of a feature with itself gives its variance causing the diagonal values of the covariance matrix (C) to be variance values.
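Steps 1 and 2 can be sketched as follows with synthetic data standing in for M; note that, following Eqn. 3, the covariance is normalized by m rather than m - 1:

```python
import numpy as np

# Synthetic stand-in for the input matrix M of Eqn. 1.
rng = np.random.default_rng(0)
M = rng.normal(size=(38, 2700))

# Step 1: center each feature (column) by subtracting its mean.
D = M - M.mean(axis=0)

# Step 2: covariance matrix of the centered data, C = (1/m) * D^T D,
# matching Eqn. 3 with m = number of specimens. Diagonal entries are variances.
m = D.shape[0]
C = (D.T @ D) / m
print(C.shape)                          # (2700, 2700)
```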

Step 3 Computing the Eigenvalues and Eigenvectors

Let A be a square matrix, \(V\) a vector, and λ a scalar satisfying A \(V\) = λ \(V\); then λ is called an eigenvalue associated with the eigenvector \(V\) of A.

The eigenvalues \({\lambda }_{1}, {\lambda }_{2}, \ldots ,{\lambda }_{2700}\) are the roots of the characteristic equation

$$ \begin{array}{*{20}c} {{\text{Det}}\left( {A - \lambda I} \right) = 0} \\ \end{array} $$
(5)

where I is an identity matrix.

The eigenvectors are calculated by substituting the \(i^{{{\text{th}}}}\) eigenvalue \(\lambda_{i}\) into the equation below to find the corresponding \(i^{{{\text{th}}}}\) eigenvector \(V_{i}\)

$$ \begin{array}{*{20}c} {\left( {A - \lambda_{i} I} \right) \cdot V_{i} = 0} \\ \end{array} $$
(6)

Step 4 Sorting the Eigenvalues and Eigenvectors

The eigenvalues λ and their corresponding eigenvectors \(V\) are arranged in descending order. Suppose the top 5 eigenvectors are chosen because they contribute 97% of the total variance; the resulting V matrix then has shape \(2700\times 5\). This matrix is denoted \({V}_{\text{top}}\), and the corresponding λ matrix has shape \(5\times 5\), with the eigenvalues as its diagonal elements.

Step 5 Projection of centered data to reduced dimension

The centered data D is then projected into the reduced dimension by multiplying it with \({V}_{\text{top}}\). The resultant matrix is a \(38\times 5\) matrix. This matrix now has the features for each specimen in the reduced dimension and can be used for further analysis.
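Steps 3 through 5 might be sketched as follows, continuing from the centered data D and its covariance matrix C; the 97% variance threshold mirrors the example in the text, and the synthetic data is only a stand-in:

```python
import numpy as np

# Synthetic centered data and its covariance matrix (as in Steps 1-2).
rng = np.random.default_rng(0)
D = rng.normal(size=(38, 2700))
D -= D.mean(axis=0)
C = (D.T @ D) / D.shape[0]

# Step 3: eigenvalues and eigenvectors of the symmetric covariance matrix.
eigvals, eigvecs = np.linalg.eigh(C)    # returned in ascending order

# Step 4: sort in descending order and keep enough eigenvectors to reach
# ~97% of the total variance (five components in the example of the text).
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
cum_var = np.cumsum(eigvals) / np.sum(eigvals)
k = int(np.argmax(cum_var >= 0.97)) + 1
V_top = eigvecs[:, :k]                  # shape (2700, k)

# Step 5: project the centered data onto the reduced subspace.
scores = D @ V_top                      # shape (38, k)
print(k, scores.shape)
```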

Classification Modeling

The classification of the coatings based on their thickness can be implemented using an array of algorithms, including decision trees, neural networks, and KNN. When employing standard supervised learning techniques, it is essential to select appropriate performance metrics, such as accuracy.

It is important to note that our training set incorporates different types of features explained in the forthcoming scenarios. To gauge the effectiveness of the proposed method against other benchmark models, a multi-class confusion matrix is utilized. Table 1 illustrates the multi-class classification metrics and formulas.

Table 1 Multi-Class Classification Metrics and Formulas

This matrix represents the actual vs. predicted classifications for every class. Key metrics that are calculated for each class include True positive (TP), True negative (TN), False positive (FP), and False negative (FN). As an illustration, TP represents the count of specimens from a specific class that were accurately classified. The model's performance is assessed using prominent multi-class metrics, namely Accuracy and F-score. These metric formulas are derived from the work of Sokolova and Lapalme.28
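As an illustration with made-up labels, the per-class TP, TN, FP, and FN counts, and the accuracy and F-score derived from them, can be computed from the multi-class confusion matrix as follows:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted coating classes for a small test set.
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 1, 1, 1, 2, 3, 3, 3])

cm = confusion_matrix(y_true, y_pred)   # rows = actual, columns = predicted
for c in range(cm.shape[0]):
    TP = cm[c, c]
    FP = cm[:, c].sum() - TP
    FN = cm[c, :].sum() - TP
    TN = cm.sum() - TP - FP - FN
    accuracy = (TP + TN) / cm.sum()
    precision = TP / (TP + FP) if TP + FP else 0.0
    recall = TP / (TP + FN) if TP + FN else 0.0
    f_score = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    print(f"class {c}: accuracy={accuracy:.2f}, F-score={f_score:.2f}")
```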

The next section will discuss the different classification scenarios in detail.

Scenario A Coating classification using axial, radial, and backside temperature properties:

In this classification approach, the standardized data is used directly as input to a machine learning algorithm. Standardization is performed by subtracting the mean of each feature from its values, as explained in the first step of VPCA. This data is then used in the machine learning model to classify the specimens by coating thickness.

Scenario B Coating classification using features extracted from axial, radial, and backside temperature properties:

In this classification modeling, VPCA is applied on the standardized data from Scenario A to extract the principal components (features) which contribute to the maximum variance. It is important to note that the principal components which contributed to 95% of the total variance were used. These principal components were then used as an input to a machine learning algorithm.

Scenario C Coating classification using scalar properties:

In this approach, we relied exclusively on scalar features of the thermal distortion test data: the area under the curve, the Y-axis displacement, and the slope of the curve. The curve here refers to the time series plots of axial and radial displacement and backside temperature. These attributes directly reflect the inherent characteristics of the TDT, which ideally should provide a more interpretable basis for classification.
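A minimal sketch of how such scalar features might be computed from one TDT curve; the exact definitions used in the study (for example, of the Y-axis displacement) are assumptions here, and the sine trace is only a stand-in:

```python
import numpy as np

# Stand-in 900-point curve sampled at 10 Hz (e.g., axial displacement vs. time).
t = np.arange(900) / 10.0               # seconds
curve = np.sin(t / 30.0)

# Assumed scalar features: area under the curve, net Y-axis displacement,
# and overall slope from a least-squares line fit over the whole test.
area = np.trapz(curve, t)
displacement = curve[-1] - curve[0]
slope = np.polyfit(t, curve, 1)[0]
print(area, displacement, slope)
```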

Scenario D Coating classification using feature extracted from scalar properties:

In this approach, we focused solely on the features extracted using VPCA from the scalar properties mentioned in Scenario C. The features were extracted in the same way as in Scenario B, and the principal components retained here contributed 97% of the total variance. This process enhances the interpretability and efficiency of the subsequent machine learning algorithm.

Scenario E Coating classification using features extracted by training on entire dataset:

This classification was performed by extracting features from axial, radial, and backside temperature data as mentioned in scenario B employing VPCA. The distinctive aspect of scenario E is the decision to forgo the traditional practice of splitting the dataset into training and testing sets. Instead, the machine learning model is trained on the entirety of the standardized dataset. This departure from the conventional approach serves a specific purpose: to establish an upper threshold for the maximum achievable accuracy. By training on the complete dataset, the model is expected to achieve its highest possible accuracy, providing a reference point against which the performance of other scenarios can be compared.

The data generated from all the above scenarios is used as input to a machine learning algorithm. Except in Scenario E, this data is first split into training and testing datasets; a support vector machine (SVM) classifier is then built using the training data and applied to the testing data. The choice of this algorithm was made by building and evaluating a variety of machine learning models, including Decision Tree, Random Forest, KNN, and Logistic Regression. SVM's one-vs-rest strategy provided the highest accuracy.

An exhaustive grid search was performed to fine-tune the hyperparameters of the SVM. A k-fold cross-validation technique is applied to reduce the risk of overfitting and provide a more reliable assessment of how well the model performs on unseen data. Consistency is maintained by employing the same machine learning algorithm with the same hyperparameters for all the scenarios.
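A sketch of this tuning workflow with scikit-learn; the hyperparameter grid, the number of folds, and the stand-in data are assumptions, since the text does not specify them:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.svm import SVC

# Stand-in features (e.g., VPCA scores) and roughly balanced coating labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(38, 5))
y = np.repeat(np.arange(4), [10, 10, 9, 9])

# Hold out a test set (this split is skipped for Scenario E in the text).
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.15, random_state=0, stratify=y)

# Exhaustive grid search over assumed hyperparameter values with k-fold CV.
param_grid = {"C": [0.1, 1, 10, 100],
              "gamma": ["scale", 0.01, 0.1],
              "kernel": ["rbf", "linear"]}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(SVC(decision_function_shape="ovr"),
                      param_grid, cv=cv, scoring="accuracy")
search.fit(X_tr, y_tr)
print(search.best_params_, search.score(X_te, y_te))
```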

Results and Discussion

This section will present the performance evaluation of all the different scenarios discussed earlier along with a Hotelling’s T-squared chart which detects a process shift when the coating thickness changes. The classification model is developed for each of the previously described scenarios. We aim to compare the effectiveness of these models in the context of coating classification based on their thickness. For each of the scenarios, the model is built and applied using the same datatype. The comparison will focus on the models' ability to accurately classify coating thickness without a priori preference for any methodology, offering insights into the strengths of feature extraction versus directly utilizing axial, radial, backside temperature or scalar properties.

Our process for developing and validating all the models was bifurcated into two stages, utilizing a dataset that was divided into a training set (comprising 85% of the data) and a testing set (the remaining 15%). During the initial stage, we engaged in model development by employing VPCA to perform dimensionality reduction on our dataset in Scenarios B, D, and E. This technique was pivotal in extracting and preserving the features responsible for 95% of the total data variance in Scenarios B and E and 97% of the total data variance in Scenario D. By doing so, we ensured that the model encapsulated the critical attributes indicative of coating thickness, providing a robust foundation for accurate classification.

Subsequently, the second stage entailed a thorough evaluation of the developed model using the designated testing set. This step was instrumental in gauging the model's performance on data it had not been exposed to during the training phase, simulating how the model would operate when confronted with new samples in a practical setting. The comparative analysis of all five models was conducted under uniform conditions to maintain the integrity of the performance assessment.

The evaluation metrics including the accuracy (%) per label and \({F}_{\text{score}}\) per label results are shown in Tables 2 and 3, respectively. These metrics provide an in-depth look at the model's performance across different labels, offering insights into the precision and reliability of the classification results. This structured approach underscores the comprehensiveness and practicality of our proposed methodology for coating thickness classification.

Table 2 Accuracy (%) Per Label
Table 3 Fscore Per Label

Detection of Process Shift Using Hotelling’s T-Squared Statistics

This analysis delves into distinctions within a dataset comprising axial, radial, and backside temperature data. The PCA model is initially built using data from a portion of class 1. Subsequently, this trained model is applied to the remainder of that class and its neighboring classes, projecting them onto a reduced subspace.

Within this reduced space, Mahalanobis distance, quantified as Hotelling's T-squared statistic (T2), is calculated for each observation. To assess the distribution of T2 values for each class, a scatter plot is utilized for visualization. This visual representation aids in understanding the variability within and between classes, offering insights into potential shifts in data patterns (Figure 3).
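A brief sketch of this computation, with synthetic stand-in data for the reference portion of one class and for the observations being monitored (the number of components and class sizes are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in data: a reference portion of one class and some later observations.
rng = np.random.default_rng(0)
reference = rng.normal(size=(20, 2700))
new_obs = rng.normal(loc=0.5, size=(10, 2700))

# Fit PCA on the reference data only, then project everything onto its subspace.
pca = PCA(n_components=5).fit(reference)
ref_scores = pca.transform(reference)
new_scores = pca.transform(new_obs)

# Hotelling's T^2: squared Mahalanobis distance of each score vector from the
# reference mean, using the covariance of the reference scores.
mean = ref_scores.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(ref_scores, rowvar=False))
diff = new_scores - mean
t2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
print(np.round(t2, 2))
```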

Figure 3. Combined plots for Hotelling's T-squared statistic (T2) values showing differences in (a) Classes 1 and 2 and (b) Classes 0 and 1.

Conclusions and Future Work

Overall, this study has showcased a methodical strategy for monitoring the thickness of refractory coatings using classification techniques, offering useful information for quality control in the foundry without physically measuring the coating. We have developed a robust framework that reliably categorizes coating thicknesses and detects process shifts by utilizing features extracted with VPCA from the axial, radial, and temperature information generated by the TDT. By utilizing Hotelling's T-squared statistics, we were able to detect the process shift when the coating thickness changed. This paves the way for better process monitoring and quality control.

The transition from classification to prediction in monitoring refractory coating thicknesses signifies a significant step toward aligning with the principles of Industry 4.0. It emphasizes the integration of digital technologies and data-driven processes to enhance manufacturing efficiency and flexibility. By leveraging advanced machine learning algorithms and real-time monitoring technologies, the proposed predictive models can transform foundry operations into highly responsive and adaptive systems.

Incorporating sensor data and continuous monitoring systems further enhances the predictive models, facilitating real-time insights into coating dynamics. This integration of data streams mirrors the interconnectedness and data-driven nature promoted by Industry 4.0, where the Industrial Internet of Things (IIoT) plays a central role in capturing and analyzing vast amounts of operational data.