
1 Introduction

Predictive maintenance (PdM) has recently gained prominence across multiple sectors, enabling maintenance policies based on novel Machine Learning (ML) algorithms. In essence, PdM estimates and foresees failures in deteriorating systems in manufacturing environments, in order to optimize maintenance efforts [1].

The printing industry is one of the largest manufacturing industries in the world, with high production volumes, where continuous maintenance of machine performance is key. Possible breakdown events automatically result in production stops, not only disturbing the production process but also burdening manufacturers financially. Offset Printing enables the production of large quantities, as the variable production costs are small compared to the setup costs, which makes machine breakdowns all the more costly. Possible failures in Offset Printing include, but are not limited to: (i) defective offset rubbers, (ii) wear of ink rollers, (iii) incorrect bending or damaged printing plates, (iv) insufficient pressures on the printing machines, (v) non-conformity issues in the sheet delivery unit, and (vi) random failures, which are found in every manufacturing environment [2].

According to Haarman et al. [3], maintenance procedures represent 15–60% of the total operating costs across manufacturing, showcasing the importance of a PdM solution. In detail, a PdM solution aims not only to prevent possible failures but also to optimize operations, thereby affecting different aspects of manufacturing, including safety, product quality, reliability, and the minimization of operational costs.

PdM data provide valuable insights for both diagnostics and prognostics, enabling maintenance work to become proactive. ML assumes that the data used for training and testing share the same feature space, have a similar distribution, and contain a comparable proportion of training instances per class. However, this is not always the case in real-world applications, where ML has to face complex challenges in which these assumptions are not satisfied [4].

Furthermore, ML models trained on imbalanced datasets are prone to high bias while reporting misleading accuracy scores. This phenomenon can be attributed to the lack of information coming from the minority class of a given dataset, and to the general tendency of ML models to classify every test sample into the majority class in order to improve the accuracy metric [5, 6].

This phenomenon is predominant in cases where anomaly detection is of prime importance, such as in PdM, preventing systems from accurately predicting machine failures and leading not only to excessive costs for manufacturers, but also possibly affecting the safety of workers.

To mitigate this issue, sampling techniques such as under-sampling and over-sampling are used, either to create more instances of the minority class to increase its population, or to reduce the number of instances in the majority class.

In this paper, the occurrence of machine failure is determined on a predictive maintenance dataset, implementing the SMOTE, ADASYN, and RUS methods to generate balanced datasets of machine failure instances found in Offset Printing. The efficiency of the proposed oversampling and undersampling methodologies is analyzed with the help of various machine learning classifiers, with the aim of improving predictive maintenance accuracy scores.

This paper is structured as follows. In Sect. 2, we present the details of the utilized dataset and of the proposed methodology for handling imbalanced datasets, alongside the classification algorithms. In Sect. 3, the experimental results used to assess the performance of the different sampling methods and classification models are presented. Finally, in Sect. 4, the results are summarized and discussed.

2 Materials and Methods

2.1 Dataset Description

The original dataset consists of features and labels based on historical measurements collected during a 4-month trial period (03/07/2022–31/10/2022) from Pressious Arvanitidis, an Offset Printing manufacturer based in Greece. Each of the collected parameters and features follows the process of a particular printing order (i.e., from the sales department to the quality assessment department). The order- and factory-related characteristics used in this paper are presented in Table 1.

Table 1 Parameters used for the training and testing procedures for the ML models

Table 2 summarizes the descriptive statistics of the independent and dependent variables of the complete dataset.

Table 2 Parameters and attributes of the input and target variables

2.2 Data Processing Methodology

Due to the high class imbalance in the initial raw dataset regarding failure events (only 145 events of some type of machine failure out of 4205 total printing runs), data preprocessing was performed to provide the training and testing processes of the ML models with high-quality data.

In particular, to avoid a scenario where a feature with particularly high variance dominates the objective function of the learning algorithms (making them unable to learn from the other features as expected), data scaling was performed first, using the Log Transformation methodology. This method was chosen because it makes the data measurements more symmetric and closer to a normal distribution. After scaling, the dataset was divided into a training set (80%) and a test set (20%) (Step 1).
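Step 1 can be sketched as follows; the feature matrix and label vector below are hypothetical stand-ins (the real dataset holds 4205 runs with 145 failures), and `np.log1p` is one common variant of the log transformation that also handles zero-valued measurements:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Toy stand-ins for the order/factory features and the binary failure label.
X = rng.gamma(shape=2.0, scale=3.0, size=(200, 5))  # skewed, positive-valued features
y = np.zeros(200, dtype=int)
y[:20] = 1                                          # rare "failure" class (10%)

# Step 1a: log transformation to make the skewed measurements more symmetric.
X_log = np.log1p(X)

# Step 1b: 80/20 split, stratified so the rare class appears in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X_log, y, test_size=0.2, stratify=y, random_state=0
)
print(X_train.shape, X_test.shape)  # (160, 5) (40, 5)
```

Stratifying the split matters here: with only ~3% failures in the real data, an unstratified 20% test set could easily contain almost no failure events.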

Furthermore, high data dimensionality has been shown to have a direct effect on classification accuracy, increasing the rate of misclassification and thus reducing the overall accuracy of a classification algorithm. Therefore, dimensionality reduction was also performed using Principal Component Analysis (PCA). Specifically, PCA converts correlated features in the high-dimensional space into a series of uncorrelated features in the low-dimensional space that are linear combinations of the existing variables, and for that reason it has become a necessary step before applying any data sampling approach [7] (Step 2).
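A minimal sketch of Step 2, using synthetic correlated data (two hidden factors driving five observed columns) rather than the paper's dataset:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical correlated features: two latent factors drive five observed columns.
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 5))  # nearly rank-2 data

# Project onto uncorrelated principal components before any sampling step.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)  # (300, 2)
# For nearly rank-2 data the two components capture almost all the variance.
print(pca.explained_variance_ratio_.sum())
```

The resulting component scores are uncorrelated by construction, which is exactly the property the sampling methods in Step 3 benefit from.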

To effectively deal with the class imbalance, three different sampling techniques were employed, namely, Random UnderSampling (RUS) [8], Synthetic Minority Oversampling Technique (SMOTE) [9] and the Adaptive Synthetic sampling approach (ADASYN) [10] (Step 3).

These techniques operate in the feature space, aiming either to under-sample the majority class or to oversample the minority one. On the one hand, under-sampling techniques such as RUS improve the imbalance level of the classes toward the desired target by reducing the number of majority instances. However, the removal of instances from the majority class is performed without replacement, meaning that useful information may be permanently lost. In addition, due to the randomized nature of RUS, an unclear decision boundary may result, affecting classifier performance [11].
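The RUS idea is simple enough to sketch directly in numpy (the paper uses the implementation from [8]; this toy version assumes binary labels with 1 as the minority class):

```python
import numpy as np

def random_undersample(X, y, seed=0):
    """Randomly drop majority-class rows (without replacement) until the
    two classes are balanced -- the RUS idea described above."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    keep = rng.choice(majority, size=minority.size, replace=False)
    idx = np.concatenate([minority, keep])
    return X[idx], y[idx]

# Hypothetical imbalanced toy data (ratio similar in spirit to 145/4205).
X = np.arange(100, dtype=float).reshape(50, 2)
y = np.array([1] * 5 + [0] * 45)
X_bal, y_bal = random_undersample(X, y)
print(len(y_bal), int(y_bal.sum()))  # 10 5
```

The `replace=False` argument is the "without replacement" caveat from the text: the 40 discarded majority rows, and whatever information they carried, are gone for good.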

On the other hand, over-sampling approaches aim to improve the imbalance level of the classes toward the desired target by generating synthetic instances and adding them to the minority class. Unlike approaches such as random oversampling, SMOTE generates artificial instances of the minority class in the feature space rather than the data space, considering linear combinations between existing minority samples. Moreover, ADASYN, derived from SMOTE, gives different weights to different minority samples of a given dataset, and automatically determines the number of samples that must be produced in order to achieve data balance.
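The core interpolation idea behind SMOTE can be illustrated with a simplified sketch (the paper uses the full algorithm of [9]; this toy version omits SMOTE's per-sample neighbor bookkeeping and ADASYN's adaptive weighting):

```python
import numpy as np

def smote_sketch(X_min, n_new, k=3, seed=0):
    """Toy SMOTE-style oversampling: each synthetic point is a random
    interpolation between a minority sample and one of its k nearest
    minority neighbours (a simplification of the full SMOTE algorithm)."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Distances from sample i to all minority samples (itself included).
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]  # skip the point itself
        j = rng.choice(neighbours)
        lam = rng.random()                   # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.asarray(synthetic)

# Hypothetical minority-class samples in a 2-D feature space.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = smote_sketch(X_min, n_new=6)
print(X_new.shape)  # (6, 2)
```

Because every synthetic point lies on a segment between two existing minority samples, the new instances stay inside the minority region of the feature space, which is what distinguishes SMOTE from simply duplicating rows.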

The aforementioned methodology is depicted in Fig. 1.

Fig. 1 Proposed methodology for imbalanced data handling in predictive maintenance: raw data from sensors and printing machines, together with historical shop-floor data, are scaled and split; dimensionality reduction via PCA and data sampling then turn the unbalanced dataset into a balanced one, which is fed to the classification algorithms.

2.3 Machine Learning Models

To create the proposed framework, stratified 5-fold cross-validation was used for all experiments in this study. The base ML models were trained using the scikit-learn package [12], and include:

  • Logistic Regression (LR) is a standard probabilistic statistical classification model that has been extensively used for classification problems across disciplines. Unlike linear regression, logistic regression analyzes the relationship between multiple independent features and estimates the probability of occurrence of an event by fitting the data onto a logistic curve. LR is affected by outliers, which can greatly skew parameter estimation and reduce classification performance [13].

  • k-Nearest Neighbors (kNN) [14] enables computationally lightweight classification by identifying the nearest neighbors of a query example and using those neighbors to determine the class of the query [15].

  • Decision Tree (DT) is a learner that repeatedly splits the dataset according to a cost criterion that maximizes the separation of the data, resulting in tree-like branches. In detail, the algorithm attempts to select the most important features to split on as it iterates through a given feature space. Compared with other machine learning methods, decision trees have the key advantage that they are not black-box models and can easily be expressed as rules [16].

  • Random Forest (RF) algorithms fall under the broad umbrella of ensemble learning methods. The key building block of the algorithm is the decision tree: every data instance is first classified by each individual DT, and is then assigned a class by consensus among the individual DTs. Diversity among these individual DTs can further improve the overall classification performance, so bagging is introduced to promote diversity. The advantages of RF include its robustness to overfitting and its stability in the presence of outliers [17].
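The four base models and the stratified 5-fold protocol can be wired together in scikit-learn roughly as follows; the dataset is a synthetic imbalanced stand-in (`make_classification`), and the hyperparameters shown are library defaults, not the paper's tuned settings:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Hypothetical imbalanced toy data standing in for the printing dataset.
X, y = make_classification(n_samples=400, n_features=5,
                           weights=[0.9, 0.1], random_state=0)

models = {
    "LR":  LogisticRegression(max_iter=1000),
    "kNN": KNeighborsClassifier(),
    "DT":  DecisionTreeClassifier(random_state=0),
    "RF":  RandomForestClassifier(random_state=0),
}

# Stratified 5-fold CV keeps the minority-class ratio constant in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {name: cross_val_score(m, X, y, cv=cv, scoring="f1").mean()
          for name, m in models.items()}
for name, f1 in scores.items():
    print(f"{name}: mean F1 = {f1:.3f}")
```

In the full pipeline, a sampling step (RUS, SMOTE, or ADASYN) would be applied to the training folds only, so that the test fold retains the original class distribution.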

2.4 Evaluation

To compare the performance of the candidate models, the most frequently used classification metrics are utilized, including accuracy (ACC), precision (P), recall (R), and the F1-score, calculated as [18]:

$$ \mathrm{ACC} = (\mathrm{TP} + \mathrm{TN})/(\mathrm{TP} + \mathrm{FN} + \mathrm{TN} + \mathrm{FP}) $$
(1)
$$ P = \mathrm{TP}/(\mathrm{TP} + \mathrm{FP}) $$
(2)
$$ R = \mathrm{TP}/(\mathrm{TP} + \mathrm{FN}) $$
(3)
$$ F1 = (2 \cdot P \cdot R)/(P + R) $$
(4)
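Translated directly into code, with hypothetical confusion-matrix counts for illustration:

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute Eqs. (1)-(4) from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fn + tn + fp)   # Eq. (1)
    precision = tp / (tp + fp)              # Eq. (2)
    recall = tp / (tp + fn)                 # Eq. (3)
    f1 = 2 * precision * recall / (precision + recall)  # Eq. (4)
    return acc, precision, recall, f1

# Hypothetical counts for a failure classifier.
acc, p, r, f1 = classification_metrics(tp=30, tn=50, fp=10, fn=10)
print(acc, p, r, f1)  # 0.8 0.75 0.75 0.75
```

Note that with 145 failures in 4205 runs, a classifier that always predicts "no failure" already scores ACC ≈ 0.97 while P, R, and F1 for the failure class are all zero, which is precisely why accuracy alone is misleading here.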

3 Results

Table 3 showcases the overall performance comparison between the different classification algorithms, each utilizing the different sampling methodologies, with the aim of achieving higher accuracy and F1-scores for more accurate classification of machine failures in the field of Offset Printing.

Table 3 Overall performance evaluation of classification algorithms under different sampling methodologies

In detail, the experimental results demonstrate that both the Random Forest and Decision Tree algorithms performed significantly better than the rest of the base models, while Logistic Regression produced the least accurate scores. Moreover, both the SMOTE and ADASYN sampling methods improved classification accuracy across the models, while under-sampling had the least effect on improving classification accuracy.

Furthermore, as showcased in Fig. 2, the implementations of SMOTE and ADASYN produced similar results, with models trained under SMOTE slightly outperforming those using ADASYN.

Fig. 2 Performance comparison of the proposed algorithms: the SMOTE-sampled Random Forest achieves the highest values (accuracy 0.8827, precision 0.9302, recall 0.8827, F1-score 0.9055), while under-sampling yields the lowest values for each model.

4 Conclusions

Predictive Maintenance systems utilize ML models to predict trends, behavior patterns, and correlations in order to anticipate pending machine failures proactively, thus avoiding downtime and production stops. Machine maintenance has therefore attained critical importance for manufacturing industries such as Offset Printing, due to the current growth in complexity of manufacturing ecosystems.

In this study, we proposed a data sampling methodology for predictive maintenance algorithms in Offset Printing environments, which aims to effectively balance the data classes and improve the ability of PdM models to accurately identify the minority class in binary classification. The methodology, consisting of the SMOTE, ADASYN, and RUS techniques together with the classification algorithms (DT, LR, kNN, RF), was evaluated on a dataset from an Offset Printing company.

Overall, the results of this study indicate that the proposed methodology effectively handles data imbalance while enhancing classification accuracy, outperforming other state-of-the-art techniques. Moreover, to the best of our knowledge, this is the first study to explore PdM systems and data handling approaches for the Offset Printing domain.

Finally, in future work, the proposed methodology can be extended to multi-class classification, as well as to the evaluation of further ML and Deep Learning (DL) techniques.