1 Introduction

The energy sector has been changing in several ways, notably with the implementation of the smart grid concept [1] and the more active participation of electricity consumers in demand response programs [2, 3]. Booming innovation in big data, data analytics, and the Internet of Things (IoT) has shifted traditional industrial maintenance strategies toward systems capable of forecasting machine lifespan [4, 5]. Furthermore, accounting for energy usage is also critical for optimizing production lines in industrial environments, because machine health can have a significant impact on a machine’s energy efficiency [6, 7]. Accordingly, it is in these industries’ best interests to implement such systems to minimize energy consumption, which both reduces costs and contributes to a sustainable future through energy savings.

Two new maintenance concepts have been developed for detecting abnormalities in the production environment: prognostics and health management, and condition-based maintenance [8]. Predictive Maintenance (PdM), which analyzes past data to forecast behavior patterns, is frequently used with one of these two concepts in mind and, in some circumstances, with both [9]. The use of predictive systems to determine when maintenance activities are required is critical in a manufacturing environment, not only to avoid wasteful expenses and cut potential greenhouse gas emissions but also to enhance product quality. According to [10], maintenance expenses can represent 15% to 70% of the cost of manufactured products. Predictive maintenance enables continuous monitoring of the machine’s integrity, allowing maintenance to be performed only when absolutely needed and thereby reducing unnecessary maintenance costs. Moreover, PdM prevents, to some extent, machine breakdowns, which are responsible for greenhouse gas emissions in some industrial sectors [11].

Prediction systems that use statistical inference, historical data, engineering methods, and integrity factors allow for the early detection of abnormalities [12]. A machine’s condition and/or lifespan can be forecast through a variety of techniques, such as Artificial Neural Networks (ANNs) [13,14,15], Random Forests (RFs) [16,17,18], deep learning [19], digital twins [20], support vector machines [21], k-means [22], gradient boosting [23], naive Bayes [24], and decision trees [25]. Other noteworthy techniques are presented in [12] as well as in [26].

This paper focuses on the implementation, as well as the exploration of the advantages and disadvantages, of the two most popular machine learning approaches for PdM according to [12]: ANNs (27% model employment) and RFs (33% model employment). The prominent use of RF in PdM systems, due to its performance and easy implementation, is the main reason for exploring this model in the present paper. Nevertheless, an ANN model has the potential to outperform an RF, both in recall and in context adaptation, when its hyperparameters are adequately optimized. Furthermore, unlike the RF model, ANNs are trained through backpropagation (i.e., fine-tuning of the network’s weights based on the error rate), which allows an existing model to be continuously fed with data and improve over time without having to be recreated whenever new training data arrives; this is ideal for manufacturing environments.

The work in [13] proposes an ANN for PdM using the mean time to failure values and backpropagation for adjusting the neuron’s weights. A PdM system for air booster compressor motors is proposed in [14] that employs an ANN with optimized weights and bias by using a particle swarm optimization algorithm. Also using an ANN, the proposed work in [15] focuses on a PdM system for induction motors that optimizes hyperparameters (e.g., number of hidden layers and neurons) to improve performance in the model. For RF, the work in [16] proposes a real-time PdM system for production lines using IoT data. A new PdM methodology, using RF, is proposed in [17] to allow dynamic decision rules to be imposed for maintenance management. A data-driven PdM system applied to woodworking is proposed in [18] using an RF that takes advantage of event-based triggers.

None of the works cited above tackles, to the extent of the present paper, the main problem plaguing PdM: imbalanced data. Furthermore, with the exception of the works in [14] and [15], there is little to no hyperparameter optimization, which can improve model performance significantly, particularly on imbalanced datasets. Finally, only the work in [16] considers real-time deployment, and only the work in [13] takes advantage of backpropagation for retraining. As such, this paper aims to advance the current state of the art by proposing:

  • An innovative machine learning training approach for PdM that aims to improve model performance while also taking into account imbalanced and irrelevant/erroneous data.

  • An automatic hyperparameter optimization strategy, used to determine the optimal hyperparameters for the ANN and RF, hence enhancing the models’ performance even further.

  • The application in real-time of both implemented models, by taking into account model retraining and user application.

This paper is divided into five sections. After this introduction and state-of-the-art review, Sect. 2 describes the training and testing dataset used to validate the proposed methodology. Section 3 describes the proposed methodology for PdM with an ANN and an RF, while Sect. 4 presents and discusses the results obtained with the implemented models. The conclusions are presented in Sect. 5.

2 Training/Testing Dataset

The PdM dataset used for training and testing of the proposed methodology was made available from the University of California in Irvine, Machine Learning Repository [27].

The PdM dataset from 2020, labeled “AI4I 2020 Predictive Maintenance Dataset Data Set,” is freely accessible in [28]. The synthetic dataset has 10,000 data points, of which 339 represent failures and 9661 represent non-failures (i.e., a ratio of roughly 1:28), as presented in Fig. 1. The machine data comprises the following features:

  • Air temperature–defines the exterior temperature of the machine, in Kelvin (K);

  • Process temperature–defines the temperature produced within the machine, in Kelvin (K);

  • Rotational speed–defines the rotational speed of the tools inside the machine, in Revolutions per minute (rpm);

  • Torque–defines the force required to rotate the machine’s tools, in Newton-meters (Nm);

  • Tool wear–defines the amount of deterioration of the tools inside the machine, in minutes until breakdown (min);

  • Machine failure–defines a machine failure status by assuming the value 0 for non-failure and 1 for failure.

The correlation heatmap between the used dataset features is shown in Fig. 2.

Fig. 1. Machine failure status bar chart of the used dataset.

Fig. 2. Correlation heatmap between the used dataset features.

The heatmap shows that machine failure correlates most strongly with torque (0.190) and tool wear (0.110), both positively. The weakest correlations with machine failure are those of rotational speed (0.044, negative) and process temperature (0.036, positive).
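As a minimal sketch of how each cell of such a heatmap can be obtained, the Pearson correlation coefficient between one feature and the failure label can be computed as follows. The readings below are hypothetical toy values and are not taken from the AI4I dataset; the function name is also an illustrative choice.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy readings (assumption): torque in Nm and the failure label.
torque = [35.0, 42.0, 48.0, 55.0, 61.0, 68.0]
failure = [0, 0, 0, 0, 1, 1]

r = pearson(torque, failure)
print(f"correlation between torque and failure: {r:.3f}")
```

A value close to 0 indicates almost no linear relationship, while values approaching 1 or -1 indicate a strong positive or negative linear relationship.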

3 Proposed Methodology

Two machine learning models, an ANN and an RF model, are implemented and explored for PdM. In the proposed methodology, the implemented models can be trained in batches, in mini-batches, or through continuous data streaming. Before real-time training, an initial model is built from a dataset; only then is training carried out in real time via data streaming or mini-batches.

The initial model for the ANN or RF is constructed using:

  • The dataset described in Sect. 2;

  • The Holdout method, 80% for training and 20% for testing;

  • A Min-Max approach for data normalization;

  • A newly added dataset feature, machine temperature difference (i.e., process temperature − air temperature), replaces the process and air temperature features. It focuses on improving model performance, by reducing the number of inputs for less complexity and better correlation between temperature and machine failure;

  • A data balancing method, 5% oversampling on failure data and a majority undersampling strategy on non-failure data. To achieve this, the imbalanced-learn [29] library was used;

  • A 5-fold cross-validation splitting strategy to search for the best hyperparameters.

It is worth mentioning that the Holdout method and the 5-fold cross-validation splitting strategy described above were employed as safeguards against overfitting of the models.
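The feature engineering, Holdout split, and Min-Max steps listed above can be sketched without any library as follows. This is an illustrative sketch only: the rows are hypothetical, the function names are invented for the example, and fitting the Min-Max bounds on the training portion alone is an assumption rather than something the paper states.

```python
import random

# Hypothetical raw rows (assumption), not taken from the AI4I dataset:
# (air_temp_K, process_temp_K, rpm, torque_Nm, tool_wear_min, failure)
raw_rows = [
    (298.0, 308.5, 1500, 40.0, 10, 0),
    (298.4, 308.9, 1420, 45.5, 35, 0),
    (299.1, 309.4, 1610, 33.2, 60, 0),
    (300.2, 310.0, 1380, 52.1, 95, 0),
    (301.0, 310.6, 1290, 61.7, 180, 1),
    (297.6, 308.1, 1750, 28.4, 120, 0),
    (302.3, 311.2, 1340, 58.9, 210, 1),
    (299.8, 309.9, 1560, 38.6, 150, 0),
    (298.9, 309.2, 1480, 41.3, 75, 0),
    (300.7, 310.3, 1450, 47.0, 20, 0),
]

def add_temperature_difference(row):
    """Replace the two temperature features with their difference."""
    air, process, rpm, torque, wear, label = row
    return (process - air, rpm, torque, wear), label

def holdout_split(samples, test_fraction=0.2, seed=0):
    """Shuffle and split into (train, test) according to test_fraction."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_min_max(feature_rows):
    """Return one (min, max) pair per feature column."""
    return [(min(col), max(col)) for col in zip(*feature_rows)]

def apply_min_max(features, bounds):
    """Scale each feature with (v - min) / (max - min)."""
    return tuple(
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, (lo, hi) in zip(features, bounds)
    )

samples = [add_temperature_difference(row) for row in raw_rows]
train, test = holdout_split(samples)
bounds = fit_min_max([features for features, _ in train])
train_scaled = [(apply_min_max(f, bounds), y) for f, y in train]
test_scaled = [(apply_min_max(f, bounds), y) for f, y in test]
```

Reusing the training bounds for the test portion keeps information about the test data out of the scaling step; test values may then fall slightly outside the [0, 1] range.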

The proposed methodology for real-time training begins by obtaining the most recent machine data, described in Sect. 2, from machine databases in the facility. Afterward, prior to training, a data preprocessing phase is employed, which can be divided into six sequential subphases:

  1. Data aggregator–combines all acquired data into a single data file;

  2. Data normalization–standardizes data units and types between machines, using a Min-Max technique with the MinMaxScaler method [30] from the Scikit-Learn library [31];

  3. Data imputation–fills missing values on the obtained data, through a k-Nearest Neighbors imputation approach with the KNNImputer method [32] from the Scikit-Learn library;

  4. Data filtering–removes any potentially incorrect or irrelevant data, by detecting outliers using the Z-score with the SciPy stats Z-score method [33] from the SciPy library [34];

  5. Data engineering–creates or removes features to better depict the underlying problem;

  6. Data balancing–balances machine data failure and non-failure points, with the imbalanced-learn library [29].
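The filtering and balancing subphases can be illustrated with a library-free sketch. Several points here are assumptions: the Z-score threshold of 3, the reading of "5% oversampling" as raising the failure-to-non-failure ratio to 0.05 before undersampling the majority class down to the same size, and all function names. The paper itself relies on SciPy and imbalanced-learn for these steps.

```python
import math
import random
import statistics

def z_score_filter(samples, threshold=3.0):
    """Drop samples with any feature more than `threshold` standard
    deviations away from that feature's mean (threshold is an assumption)."""
    columns = list(zip(*(features for features, _ in samples)))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]

    def is_inlier(features):
        return all(
            s == 0 or abs(v - m) / s <= threshold
            for v, m, s in zip(features, means, stdevs)
        )

    return [(f, y) for f, y in samples if is_inlier(f)]

def balance(samples, minority_ratio=0.05, seed=0):
    """Randomly oversample failures up to minority_ratio of the non-failures,
    then randomly undersample non-failures to the same count."""
    rng = random.Random(seed)
    failures = [s for s in samples if s[1] == 1]
    normals = [s for s in samples if s[1] == 0]
    target = max(len(failures), math.ceil(minority_ratio * len(normals)))
    failures = failures + rng.choices(failures, k=target - len(failures))
    normals = rng.sample(normals, k=len(failures))
    balanced = failures + normals
    rng.shuffle(balanced)
    return balanced

# Synthetic single-feature data (assumption): 1000 non-failures, 20 failures,
# plus one obviously erroneous reading.
rng = random.Random(42)
samples = [((rng.random(),), 0) for _ in range(1000)]
samples += [((rng.random(),), 1) for _ in range(20)]
samples.append(((50.0,), 0))

filtered = z_score_filter(samples)
balanced = balance(filtered)
n_fail = sum(1 for _, y in balanced if y == 1)
n_ok = sum(1 for _, y in balanced if y == 0)
```

In this sketch the erroneous reading is removed by the filter, and the balanced set ends up with equal numbers of failure and non-failure points.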

Then, the preprocessed data is used to train the machine learning models (i.e., ANN or RF). For the ANN, the neuron weights are adjusted through backpropagation, whereas the RF model has to be rebuilt from scratch using both the new and the past data.
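To illustrate the difference between adjusting an existing model and rebuilding it, the sketch below uses a single sigmoid neuron as a deliberately simplified stand-in for the ANN. The class, its `partial_fit` method, the learning rate, and the toy batches are all assumptions made for illustration; they are not the paper's Keras model.

```python
import math

class SigmoidNeuron:
    """One sigmoid unit trained by gradient descent on binary cross-entropy."""

    def __init__(self, n_inputs, learning_rate=0.5):
        self.weights = [0.0] * n_inputs
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * v for w, v in zip(self.weights, x))
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, batch, epochs=100):
        """Continue training from the current weights; nothing is reset."""
        for _ in range(epochs):
            for x, y in batch:
                # Gradient of the cross-entropy loss w.r.t. the pre-activation.
                error = self.predict_proba(x) - y
                self.weights = [
                    w - self.learning_rate * error * v
                    for w, v in zip(self.weights, x)
                ]
                self.bias -= self.learning_rate * error

def log_loss(model, data):
    """Mean binary cross-entropy of the model on (features, label) pairs."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = min(max(model.predict_proba(x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

# Hypothetical normalized readings (assumption): one feature, failure label.
initial_batch = [((0.1,), 0), ((0.2,), 0), ((0.8,), 1), ((0.9,), 1)]
new_batch = [((0.3,), 0), ((0.7,), 1)]

model = SigmoidNeuron(n_inputs=1)
model.partial_fit(initial_batch)          # initial model
loss_before = log_loss(model, new_batch)
model.partial_fit(new_batch)              # update with newly streamed data only
loss_after = log_loss(model, new_batch)
```

An RF, by contrast, would be refit on the combined old and new data, as described in the text, since its trees are not adjusted by gradient updates.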

The methodology for real-time application of the implemented machine learning models in a machine can be divided into three phases:

  1. Data acquisition–obtains the necessary machine data from the machine to be inspected;

  2. Data preprocessing–applies data normalization, imputation, filtering, and engineering on the obtained data;

  3. Machine failure status prediction–uses one of the models, designated by the user, to predict the machine failure status (0 for non-failure and 1 for failure).
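The three phases can be sketched as a small function chain. Everything named here is hypothetical: the field names of the reading, the Min-Max bounds, the 0.5 decision threshold, and the stand-in scoring function, which only takes the place of a trained ANN or RF.

```python
def preprocess(reading, bounds):
    """Engineer the temperature difference and apply Min-Max scaling."""
    features = (
        reading["process_temp_k"] - reading["air_temp_k"],
        reading["rpm"],
        reading["torque_nm"],
        reading["tool_wear_min"],
    )
    return tuple((v - lo) / (hi - lo) for v, (lo, hi) in zip(features, bounds))

def predict_failure_status(reading, bounds, model, threshold=0.5):
    """Return 1 (failure) or 0 (non-failure) for one acquired reading."""
    probability = model(preprocess(reading, bounds))
    return 1 if probability >= threshold else 0

# Hypothetical Min-Max bounds per feature (assumption).
bounds = [(5.0, 15.0), (1000.0, 3000.0), (0.0, 80.0), (0.0, 250.0)]

def stand_in_model(features):
    """Toy score built from scaled torque and tool wear; not a trained model."""
    return 0.6 * features[2] + 0.4 * features[3]

worn_machine = {"air_temp_k": 300.0, "process_temp_k": 310.0,
                "rpm": 1500.0, "torque_nm": 72.0, "tool_wear_min": 225.0}
fresh_machine = {"air_temp_k": 300.0, "process_temp_k": 310.0,
                 "rpm": 1500.0, "torque_nm": 16.0, "tool_wear_min": 25.0}

status_worn = predict_failure_status(worn_machine, bounds, stand_in_model)
status_fresh = predict_failure_status(fresh_machine, bounds, stand_in_model)
```

Passing the model as a callable mirrors the idea that the user designates which of the two trained models performs the prediction.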

3.1 Artificial Neural Network Training

The ANN was trained using an automatic hyperparameter optimizer, which focuses on finding the optimal hyperparameter values to obtain a high-performing model. This is achieved by using the GridSearchCV [35] method available from the Scikit-Learn library. The automatic hyperparameter optimizer works by evaluating every combination of the candidate hyperparameter values in order to find a high-performing ANN model, which contains the optimal value for each hyperparameter. Table 1 presents the possible and found optimal hyperparameter values for the ANN model. Some hyperparameters, however, were predefined, as there was no need to search for their optimal values: the loss function (binary cross-entropy), the metric (binary accuracy), the number of input neurons (4: temperature difference, rotational speed, torque, and tool wear), the number of output neurons (1: machine failure), the output layer activation function (sigmoid), and the weight initialization in the hidden layers (normal).
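Scikit-Learn's GridSearchCV evaluates every combination of the candidate values with cross-validation. A library-free sketch of that idea, combined with a 5-fold split, is given below. The hyperparameter names and values, the function names, and the toy scoring function are assumptions for illustration; in practice the scoring step would train and validate an ANN on each fold.

```python
import math
from itertools import product

def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop

def grid_search(param_grid, score_fn, n_samples, k=5):
    """Score every combination in param_grid by its mean k-fold score."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = [score_fn(params, train, val)
                       for train, val in k_fold_indices(n_samples, k)]
        mean_score = sum(fold_scores) / len(fold_scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

# Hypothetical candidate values (assumption), not those of Table 1.
param_grid = {"hidden_neurons": [4, 8, 16],
              "learning_rate": [0.001, 0.01, 0.1]}

def toy_score(params, train_indices, validation_indices):
    """Stand-in for 'train on the fold, return the validation score'."""
    return -(abs(params["hidden_neurons"] - 8) / 8
             + abs(math.log10(params["learning_rate"]) + 2))

best_params, best_score = grid_search(param_grid, toy_score, n_samples=100)
```

With 3 candidate values for each of the 2 hyperparameters and 5 folds, this sketch performs 9 x 5 = 45 scoring calls, which shows how quickly an exhaustive search grows with the size of the grid.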

For the ANN classifier, the KerasClassifier [36] from the Keras [37] library was used. During the training phase, it learns the rules that minimize the classification error with respect to the training classes. The model is ready to generate predictions once it has been properly fitted using training data.

Table 1. Artificial neural network hyperparameters possible and optimal values.

3.2 Random Forest Training

The RF was also trained using an automatic hyperparameter optimizer, aiming to find a robust RF model. This is accomplished through the RandomizedSearchCV [38] method. This method determines the optimal estimator to employ in the model by selecting one of the possible values for each hyperparameter at random and then assessing each resulting estimator based on its accuracy score. Each hyperparameter’s possible values and optimal value for the RF model are shown in Table 2. The RandomForestClassifier [39] was used as the RF classifier.
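The random selection described above can be sketched in a few lines. The candidate values, the number of iterations, the function names, and the toy scoring function are assumptions for illustration. Note also that this sketch may draw the same combination twice, which is a simplification compared with a full implementation.

```python
import random

def randomized_search(param_distributions, score_fn, n_iter=8, seed=0):
    """Score n_iter randomly drawn hyperparameter combinations."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Pick one candidate value per hyperparameter at random.
        params = {name: rng.choice(values)
                  for name, values in param_distributions.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical candidate values (assumption), not those of Table 2.
param_distributions = {"n_estimators": [50, 100, 200, 400],
                       "max_depth": [4, 8, 16, None]}

def toy_score(params):
    """Stand-in for 'cross-validate an RF and return its accuracy'."""
    depth_penalty = (0.5 if params["max_depth"] is None
                     else abs(params["max_depth"] - 8) / 8)
    return -abs(params["n_estimators"] - 200) / 200 - depth_penalty

best_params, best_score = randomized_search(param_distributions, toy_score)
```

Only 8 of the 16 possible combinations are scored here, which trades a guaranteed optimum over the grid for a lower search cost.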

Table 2. Random forest hyperparameters possible and optimal values.

4 Results and Discussion

Four metrics were used to validate the performance of the proposed machine learning models: recall, precision, f1-score, and accuracy. It is worth noting that, since PdM problems commonly have very imbalanced datasets that have a low number of failure data points, the recall metric was considered to be the most relevant performance metric to validate the proposed methodology. The ANN performance metrics using the optimal hyperparameters found in Table 1 and the performance of the RF model using the optimal hyperparameters in Table 2 are shown in Table 3.
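The four metrics follow directly from the confusion-matrix counts, and a short sketch shows why accuracy alone is misleading on this dataset. The function name is an illustrative choice; the counts 339 and 9661 are the failure and non-failure totals given in Sect. 2, applied here to a trivial baseline that always predicts non-failure.

```python
def classification_metrics(tp, fp, fn, tn):
    """Recall, precision, f1-score, and accuracy from confusion-matrix counts."""
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"recall": recall, "precision": precision,
            "f1": f1, "accuracy": accuracy}

# Baseline that labels every one of the 10,000 data points as non-failure.
baseline = classification_metrics(tp=0, fp=0, fn=339, tn=9661)
print(baseline)
```

The baseline reaches an accuracy of 96.61% while detecting no failure at all (recall of 0), which is why recall is the more informative metric for this kind of problem.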

Table 3. Performance metrics of the proposed machine learning models using their respective optimal hyperparameters.

According to the results presented in Table 3, each model has its own benefits and drawbacks: the ANN is slightly better at predicting an imminent machine breakdown, since it has the highest recall, whereas the RF excels at lowering the number of false alarms (i.e., false positives), since it has the highest precision and accuracy scores. As a result, if maintenance is inexpensive and undetected machine breakdowns can lead to dire consequences, the ANN is the preferred model. Conversely, the RF model is better at reducing the number of false alarms, which reduces unnecessary maintenance activities when compared to the ANN. Nevertheless, both models have good accuracy scores, particularly the RF model with 93.15%, for this type of problem, where imbalanced predictive maintenance datasets are common and negatively affect accuracy scores. Table 4 presents the ANN and RF confusion matrices. It is noteworthy that there is a large trade-off between true positives and false positives between the two models: the RF has only 1 more missed machine failure but 191 fewer false alarms than the ANN. Therefore, even though recall was considered the most relevant metric, the RF model has the best overall performance, since it falls only slightly behind on recall while all its other metrics are much better than those of the ANN model. It is worth mentioning that another work [40] used the same dataset as the current paper to justify the usage of a bagged trees ensemble classifier. However, the cited work did not split the dataset for training and testing, resulting in an overfitted model and inflated results; because of this, no direct comparison was made with that work. Despite its inflated results, the cited work achieved a recall score of only 0.71, lower than the present paper’s ANN model with a recall of 0.97 and RF with 0.95.
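The reported trade-off can be cross-checked against the accuracy scores with a short calculation. The test-set size of 2000 is an assumption: it is 20% of the 10,000 data points under the Holdout method, assuming the balancing step does not change the test portion.

```python
# Assumption: 20% Holdout of 10,000 points, unaffected by data balancing.
TEST_SET_SIZE = 2000

fewer_false_alarms = 191      # RF versus ANN, as reported in the text
extra_missed_failures = 1     # RF versus ANN, as reported in the text

net_fewer_errors = fewer_false_alarms - extra_missed_failures
accuracy_gain_points = 100 * net_fewer_errors / TEST_SET_SIZE

reported_gap_points = 93.15 - 83.65   # RF accuracy minus ANN accuracy

print(f"implied accuracy gain: {accuracy_gain_points:.2f} points")
print(f"reported accuracy gap: {reported_gap_points:.2f} points")
```

Under that assumption, 190 fewer misclassifications out of 2000 correspond to 9.5 percentage points, which matches the gap between the two reported accuracy scores.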

Table 4. Artificial neural network and random forest confusion matrix.

5 Conclusion

To further reduce costs and improve product quality, the manufacturing industry has been investing in PdM strategies to cut down on unnecessary maintenance costs, as PdM systems are capable of predicting machine condition/lifespan allowing for maintenance-effective manufacturing.

The proposed methodology aims to improve the performance of machine learning models for PdM problems through a novel training methodology, an automatic hyperparameter optimization strategy, and a new retraining method. To achieve this, an ANN model and an RF model are implemented and explored. A synthetic dataset for PdM, containing imbalanced data, is used to validate the proposed methodology.

The obtained results show the robustness of the proposed methodology, with the ANN model accomplishing a recall of 0.97, a precision of 0.15, an f1-score of 0.27, and an accuracy of 83.65%. The RF model performed better still overall: although its recall of 0.95 is slightly lower, it achieved a much better precision of 0.30, an f1-score of 0.46, and an accuracy of 93.15%. In general, the RF model has the better overall performance; nevertheless, the ANN is slightly better at reducing false negatives (i.e., missed failures), while the RF is better at reducing false positives.

Future work will address the use of real-world data instead of a synthetic dataset, allowing a better evaluation of the effectiveness of the proposed methodology in practical manufacturing environments. In addition, model interpretability, through eXplainable Artificial Intelligence (XAI), will also be explored for the proposed ANN and RF models, in order to improve confidence in PdM systems.