
1 Introduction

Failures that occur unexpectedly are challenging for companies in many respects, so an effective maintenance policy for machines is necessary. The Digital Twin (DT) concept, which models the behavior of physical products under real conditions and transfers it to a virtual environment, has given momentum to predictive maintenance studies, and much current DT research focuses on predictive maintenance activities. DT applications can have enough processing power to detect errors, gain new insights, and even determine how to improve each part of the entire system by predicting its behavior under stressful conditions [1]. Companies must predict and prevent machine malfunctions before they happen in order to minimize disruptions in production and reduce the losses that malfunctions cause.

In maintenance planning, failures can be predicted so that precautions can be taken in advance. Machine Learning (ML) is one of the most common approaches used in industry for predicting failure. ML is a subset of artificial intelligence (AI) that focuses on algorithms that learn from data and make predictions from it. Examples of such algorithms are Support Vector Machine, Random Forest, Naive Bayes, k-Nearest Neighbors (KNN), and Logistic Regression.

The choice of algorithm depends on the characteristics of the data in a study, and finding the algorithm and parameters that give the best results takes a large number of trials. Automated Machine Learning (AutoML) has been developed as a solution for this process. It automates the ML workflow: it tries ML algorithms with various parameters in the background and returns the best model and its parameters as output. Open-source AutoML libraries include AutoKeras, TPOT, Auto-sklearn, and H2O. AutoML performs automated feature selection, model (algorithm) selection, and hyperparameter optimization. In this way, it becomes possible to approach the best result while shortening the time needed to obtain it. This study aims to contribute to the literature by using AutoML to find the algorithm that gives the best results in predictive maintenance.

The general structure of the article is as follows: Sect. 2 reviews the available literature. Section 3 describes the AutoML method used in the study. Section 4 predicts machine failures on a dataset with 5 input factors. The conclusions are presented in Sect. 5.

2 Literature Review

Predictive maintenance methods have been investigated in the literature, mostly as fault detection with machine learning techniques. The machine learning techniques used in predictive maintenance studies are Random Forest (RF) in 33% of cases, artificial neural networks (ANN) in 27%, support vector machines (SVM) in 25%, and k-means in 13% [2]. Panda et al. tackled the problem of reducing long downtimes of a large automotive system using three ML algorithms: C5.0, C5.0 with boosting, and classification and regression tree (CART) [3]. Santos et al. detected short-circuit faults in induction motors using RF. The two-class results indicate whether or not there is a short circuit [4]. Uhlmann et al. used the k-means algorithm to diagnose laser melting machines based on 3 factors (temperature, oxygen percentage, and pressure) [5]. Guo et al. propose a performance benchmarking method for residential systems using only smart thermostats as data sources. The proposed method analyzes the so-called steady-state behavior of the system and compares a critical property called the weighted average difference of cooling effort both between systems and within each system [6]. Dangut et al. present a new deep learning technique based on an autoencoder and bidirectional gated recurrent unit networks to handle extremely rare failure predictions in aircraft predictive maintenance modeling [7]. Shaheen et al. used combined neural network architectures and cumulative neural network designs to predict the failure state of a mechanical component and estimate its remaining useful life [8]. Einabadi et al. used artificial neural networks to reduce machine downtime with predictive maintenance [9]. Dangut et al. analyzed the log records of aircraft error messages using pattern mining methods [10]. Calabrese et al. applied a data-driven machine learning approach to the industrial machinery of a large Italian woodworking company. The predicted failure probabilities were calculated using tree-based classification models (Gradient Boosting, Random Forest, and Extreme Gradient Boosting) [11]. Predictive maintenance has thus been widely addressed with machine learning techniques, and AutoML applications have started to be used to obtain algorithms and parameters that achieve better results.

Vincent et al. examine Bayesian optimization in depth and suggest using a genetic algorithm, differential evolution, and the covariance matrix adaptation evolution strategy (CMA-ES) for acquisition function optimization. They compared the performance of hyperparameter optimization techniques with the help of AutoML models [12]. Škrlj et al. aimed to automate simultaneous learning from both images and text, offering an AutoML approach to defining machine learning model configurations for data consisting of two modalities [13]. Sahin et al. describe a new AutoML framework that predicts soil liquefaction potential based on stacking ensemble learning (SEL) combined with a greedy search algorithm. The framework consists of three main steps: data preparation, greedy feature selection, and greedy stacking aggregation [14]. Raj et al. propose the integration of Convolutional Neural Networks (CNN), Vision Transformers (ViT), and AutoML to obtain slice-level as well as patient-level prediction results [15]. The range of AutoML applications is growing, but studies that apply AutoML to predictive maintenance remain limited in the literature.

Using recently collected data from a client of a Portuguese software company, Ferreira et al. conducted a benchmark comparison of supervised AutoML tools and their proposed AutoOneClass method, both to estimate the number of days until the next failure of a piece of equipment and to determine whether the equipment will fail after a certain period [16]. Cinar et al. monitored the status of an autonomous transfer vehicle and an electric motor to detect equipment failures and operational anomalies, automating the process with AutoML and workflow automation technologies [17]. Rivas et al. developed several models with sampling techniques on a partial discharge measurement dataset to evaluate the health of insulation on covered conductors of power lines in a general power system. An AutoML model without a multi-algorithm resampling technique achieved the best results [18]. Kocbek et al. showed that AutoML approaches perform better than competition-winning solutions and have excellent potential for building robust predictive models in the rail industry [19].

3 Theoretical Background

3.1 Maintenance Policies

Maintenance policies are divided into two groups: planned and unplanned maintenance. Unplanned maintenance is performed only in the event of a breakdown, by replacing either the part that caused the malfunction or the whole unit. It is used for equipment that is not a bottleneck, can be easily repaired, or has spare parts available. It is a reactive approach. Planned maintenance is divided into three types: periodic, preventive, and predictive maintenance. The types of maintenance policies are given in Fig. 1.

Fig. 1. Types of maintenance policies.

Periodic Maintenance.

Periodic maintenance is a maintenance policy carried out at regular intervals. Equipment and parts are reviewed regularly, without waiting for a malfunction to occur. In all cases, maintenance is carried out at specified time intervals, which are determined and planned by the responsible staff of the company.

Preventive Maintenance.

Preventive maintenance consists of taking the necessary measures to prevent a malfunction from occurring, rather than predicting or detecting it. The goal is to stop malfunctions from arising in the first place by eliminating the fault together with its causes.

Predictive Maintenance.

Predictive maintenance, on the other hand, is a policy in which the decision whether to perform maintenance is based on various measurements that reveal the risk of failure. There are 3 basic approaches: fault and anomaly detection, estimation of remaining useful life, and fault detection by classification.

The aim is to foresee the time when normal working conditions will be exceeded. The remaining life approach is divided into three: risk-based, fault analysis, and threshold value usage. The risk-based approach plans maintenance activities using probability distributions obtained from historical data. The probability distribution may depend on the failure rate as a function of the amount of work, or it can be determined by considering various environmental factors (temperature, pressure, etc.) that are correlated with the failure rate. Fault analysis compares the behavior of a component with failure data from the same or equivalent components to check whether a situation resembling previously seen failures is developing. The threshold value approach checks whether certain characteristics of a system (vibration, current, etc.) are within normal operating limits. Normal operating limits can be determined based on risk, or they can be obtained from various analyses. By measuring the time-dependent change in the characteristics of the system, it is possible to calculate how long it will take for the system to leave its normal operating limits.
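The threshold value approach can be illustrated with a minimal sketch. It assumes that the monitored characteristic drifts roughly linearly over time, which is a simplification introduced here and not a claim about the data of this study. The function names, the vibration readings, and the limit of 6.0 are all hypothetical.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


def time_until_limit(times, values, upper_limit):
    """Estimate the time from the last measurement until the trend hits upper_limit.

    Returns None when the fitted trend is flat or decreasing, in which case
    no crossing of the upper limit is predicted.
    """
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None
    crossing_time = (upper_limit - intercept) / slope
    # A limit that is already exceeded yields zero remaining time.
    return max(0.0, crossing_time - times[-1])


# Hypothetical vibration readings (mm/s) taken every 10 operating hours.
hours = [0, 10, 20, 30, 40]
vibration = [2.0, 2.5, 3.0, 3.5, 4.0]
remaining = time_until_limit(hours, vibration, upper_limit=6.0)
print(f"Estimated operating hours until the limit is reached: {remaining:.1f}")
```

In practice the drift is rarely linear, so a real implementation would likely need a degradation model suited to the component and some estimate of the uncertainty around the crossing time.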

Estimation tools such as machine learning algorithms and statistical techniques are used for fault detection. Predictive maintenance provides a longer operating life than other maintenance strategies and allows for preventive maintenance activities. Predictive maintenance is an approach that emerged from the digital twin philosophy. A digital twin models a system in the physical environment and transfers it to the virtual environment.

3.2 Automated Machine Learning (AutoML)

When working with data sets, it is not always possible to know in advance which algorithm will give the best result. There are many ML algorithms, such as Random Forest, Support Vector Machines, and Artificial Neural Networks. AutoML automates the use of these ML algorithms. It is an emerging field in which the process of creating machine learning models is automated. Available libraries include Auto-sklearn, AutoKeras, H2O, and TPOT. A comparison of the AutoML libraries is given in Fig. 2. The TPOT and LazyPredict libraries were used in this study.

Fig. 2. Comparison of AutoML libraries.
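The core idea of AutoML, trying several algorithms and parameter settings and keeping the best one, can be sketched in a few lines. The example below is only a toy illustration written with the Python standard library. It does not reproduce how TPOT or LazyPredict work internally, and the two-class synthetic data, the candidate list, and all function names are assumptions made for this sketch.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, point, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], point)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def centroid_predict(train_x, train_y, point):
    """Assign the label whose class mean lies closest to the point."""
    best_label, best_dist = None, float("inf")
    for label in set(train_y):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(centroid, point))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def validation_accuracy(predict, train_x, train_y, val_x, val_y):
    """Share of validation points that the candidate labels correctly."""
    hits = sum(predict(train_x, train_y, x) == y for x, y in zip(val_x, val_y))
    return hits / len(val_y)


def search(candidates, train_x, train_y, val_x, val_y):
    """Score every (name, params, predict) candidate and return the best one."""
    results = [
        (validation_accuracy(predict, train_x, train_y, val_x, val_y), name, params)
        for name, params, predict in candidates
    ]
    return max(results, key=lambda r: r[0]), results


def make_data(rng, n):
    """Synthetic two-class data: two Gaussian clouds centred at (0, 0) and (2, 2)."""
    xs, ys = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 * label
        xs.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        ys.append(label)
    return xs, ys


rng = random.Random(0)
train_x, train_y = make_data(rng, 200)
val_x, val_y = make_data(rng, 100)

# Candidate "algorithms with various parameters": kNN for several k, plus nearest centroid.
candidates = [
    ("kNN", {"k": k}, lambda tx, ty, p, k=k: knn_predict(tx, ty, p, k))
    for k in (1, 3, 5, 7)
]
candidates.append(("NearestCentroid", {}, centroid_predict))

best, all_results = search(candidates, train_x, train_y, val_x, val_y)
for score, name, params in sorted(all_results, key=lambda r: r[0], reverse=True):
    print(f"{name:16s} {params!s:10s} accuracy={score:.2f}")
print("Selected:", best[1], best[2])
```

AutoML libraries replace this exhaustive loop with smarter search strategies and much larger candidate spaces, but the output has the same shape: a ranked list of models and the parameters of the best one.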

TPOT.

The Tree-Based Pipeline Optimization Tool (TPOT) is one of the first AutoML methods and open-source software packages developed for the data science community. It is a Python-based tool. TPOT was developed by Dr. Randal Olson together with Jason H. Moore at the University of Pennsylvania's Computational Genetics Laboratory. TPOT automates the modeling pipeline with a greater emphasis on data preparation alongside modeling algorithms and model hyperparameters. The goal of TPOT is to automate the creation of machine learning pipelines by combining a flexible expression tree representation of pipelines with stochastic search algorithms such as genetic programming. It is not suitable for natural language processing studies.

LazyPredict.

LazyPredict is a Python library used to identify the best-performing machine learning algorithms for a data set. It builds many basic models with very little code and helps to understand which models work better without any parameter tuning.

4 Case Study

4.1 Data Description

A data set of 10000 rows containing machine malfunctions was used. The data set has 5 factors: air temperature, process temperature, rotation speed, torque, and tool wear. The distributions of the factors are given in Fig. 3.

Fig. 3. Distribution of the 5 factors.

The relationships between the factors were examined. According to Fig. 4, the strongest relationship is between air temperature and process temperature.

Fig. 4. Relationships between factors.
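One common way to examine relationships between numeric factors is the Pearson correlation coefficient. The sketch below assumes this measure; the text does not state which one underlies Fig. 4. The five rows of values are invented for illustration and are not taken from the data set of this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def strongest_pair(columns):
    """Return the pair of column names with the largest absolute correlation."""
    names = list(columns)
    pairs = [
        (abs(pearson(columns[a], columns[b])), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    _, a, b = max(pairs)
    return a, b


# Illustrative values only (not from the study's data set).
factors = {
    "air_temperature": [298.1, 298.9, 299.5, 300.2, 301.0],
    "process_temperature": [308.6, 309.1, 309.9, 310.4, 311.2],
    "torque": [42.8, 39.5, 46.3, 40.1, 44.0],
}

print("Strongest relationship:", strongest_pair(factors))
```

Strongly correlated inputs such as two temperature readings carry overlapping information, which is one reason AutoML tools include automated feature selection.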

Based on these factors, 6 different fault conditions were determined: 1, Heat Dissipation Failure; 2, Overstrain Failure; 3, Power Failure; 4, Tool Wear Failure; 5, Random Failure; and 6, the state of no fault. The distribution of fault types is given in Fig. 5.

Fig. 5. Distribution of faults by type.

4.2 Method Comparison

Two AutoML libraries, TPOT and LazyPredict, were used in this study to predict machine failures. The findings are shown below, respectively (see Fig. 6).

Fig. 6. Result of TPOT AutoML.

Figure 6 above displays the TPOT output. TPOT returns the XGBoost classifier as the algorithm with the best accuracy and reports the parameters of that algorithm. For instance, the selected XGBoost model uses a learning rate of 0.5 and a maximum depth of 4 (Table 1).

Table 1. LazyPredict results of the best ML Algorithms.

The algorithms with the best outcomes from the LazyPredict classifier library are listed in Table 1 in order of their balanced accuracy. Although the NearestCentroid algorithm has low accuracy, it has the best balanced accuracy. When F1 and accuracy values are taken into account, BaggingClassifier, DecisionTreeClassifier, and RandomForestClassifier all produce better results.
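Accuracy and balanced accuracy can rank models differently when classes are imbalanced, as they are when most rows contain no fault. The sketch below shows this with two hypothetical classifiers on 20 invented labels. Balanced accuracy is computed as the mean of the per-class recall values, which is the usual definition and is assumed here to match the balanced score in Table 1.

```python
def accuracy(y_true, y_pred):
    """Share of all samples that are labelled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally regardless of size."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Invented, imbalanced ground truth: 18 rows without failure, 2 with failure.
y_true = ["no_failure"] * 18 + ["failure"] * 2

# Hypothetical model A never predicts a failure.
always_ok = ["no_failure"] * 20
# Hypothetical model B finds both failures but raises 6 false alarms.
cautious = ["failure"] * 6 + ["no_failure"] * 12 + ["failure"] * 2

for name, y_pred in (("always_ok", always_ok), ("cautious", cautious)):
    print(
        f"{name:10s} accuracy={accuracy(y_true, y_pred):.2f} "
        f"balanced={balanced_accuracy(y_true, y_pred):.2f}"
    )
# always_ok  accuracy=0.90 balanced=0.50
# cautious   accuracy=0.70 balanced=0.83
```

Model A looks better on accuracy even though it misses every failure, while model B wins on balanced accuracy. Which metric should drive model selection depends on how costly a missed failure is compared with a false alarm.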

5 Conclusion

Machine malfunctions have serious consequences for companies, which need effective maintenance policies to deal with failures. This study adopted the predictive maintenance policy, which emerged from the digital twin philosophy, to predict failures in machines. Five factors affecting the failure status (air temperature, process temperature, rotation speed, torque, and tool wear) were examined. The data set of 10000 rows contains 6 fault conditions: 1, Heat Dissipation Failure; 2, Overstrain Failure; 3, Power Failure; 4, Tool Wear Failure; 5, Random Failure; and 6, the state of no fault. In the literature, classical machine learning algorithms are generally used in predictive maintenance and fault prediction studies. In this study, AutoML libraries were used to predict failure, which made it possible to obtain the algorithm and parameters that give the best results. TPOT was chosen as one of the frequently used AutoML libraries. In addition, LazyPredict, an open-source Python library that identifies the best-performing models by trying many algorithms, was used.

The contribution of this study to the literature is to extend the limited number of AutoML studies on predictive maintenance and fault detection by using LazyPredict alongside other libraries. AutoML is a rapidly growing application area, and this study diversifies its areas of use. In future work, more complex failure situations affected by more factors will be investigated with AutoML. There are more than 10 popular AutoML libraries, and the results will be enriched by using other libraries in the future.