1 Introduction

Normal shaft rotation has a great effect on a machine’s condition. It is essential to monitor the state of the shaft axis, its rotation, and any defects in order to prevent catastrophic failure, loss of human life, and loss of production [1, 2]. Therefore, failure prediction and diagnosis are required to maintain the normal state of a mechanical system. Studies on failure prediction have used data-driven diagnosis models based on traditional statistical classification and mathematical modeling [3, 4]. However, these techniques are not accurate enough because mechanical systems are complex and operating environments are diverse. In addition, because they depend on statistical measurement methods, these techniques are difficult to apply to systems whose time-domain behavior changes over time [5, 6].

To overcome the limitations of these prediction methods, studies have been conducted using artificial intelligence. As the cost of collecting large amounts of data decreases and the computational power of computers increases, artificial intelligence is beginning to be used in the field of fault diagnosis. Most of these studies have addressed bearing defects in rotating machines: after collecting data with sensors, defects were detected using artificial intelligence [7,8,9,10,11,12,13]. However, when a rotating machine operates with a misaligned shaft axis, a load is placed on the motor coupling that connects the shaft to the motor, in addition to the bearing wear that misalignment causes. If a machine is operated continuously with misalignment, various defects may result, such as motor overload and rapid bearing wear.

The goal of this study is to detect defects using a support vector machine (SVM), a representative machine learning algorithm, with vibration data from a normally aligned shaft and a misaligned shaft of a rotating machine. SVM was developed by Vapnik in the 1960s based on statistical learning theory. It has been applied mainly to pattern recognition since the mid-1990s and to fault diagnosis since the early 2000s, where it has shown good classification performance.

In previous studies, fault detection methods for rotating machines were mainly based on the measured electric current of the motor. Statistical approaches and machine learning methods were also used to classify defects in rotating machines by direct comparison of vibration data and dynamic characteristics. Pandarakone et al. [13] analyzed bearing faults with an identification method that considers the physical location of the fault and the number of holes. The frequency-domain features were obtained from the Fast Fourier Transform (FFT) of the load current, and the SVM algorithm was used to classify and diagnose the different types of bearing faults, but no feature extraction step such as PCA was used. Jack and Nandi [14] examined the performance of ANN and SVM as classifiers in two-class fault/no-fault recognition examples and attempted to improve the overall generalization performance of both techniques with a feature selection process based on a genetic algorithm. Nembhard et al. [15] developed a method to detect faults on different machines by direct comparison of vibration data from similarly configured machines with different dynamic characteristics operating at different steady-state speeds, but they did not use machine learning methods. Samanta [16] investigated gear defects in rotating machines using ANN and SVM, with features extracted from the time-domain vibration signals of a rotating machine with normal and defective gears. Yang et al. [17] studied the defects of a small reciprocating compressor and detected the faults by comparing standard deviations in the feature extraction. Recently, Hoang and Kang [18] adopted a deep learning method based on the measured motor current to detect bearing defects, using a gray image conversion method.

In this study, we introduce a pre-processing method, instead of using the time-domain vibration data directly, to increase the accuracy of machine learning prediction. The PCA technique is applied to power spectrum data in the frequency domain to reduce the dimensions and extract features. SVM is then used as the machine learning technique with the pre-processed vibration features to find defects quickly. The next section describes the rotating machine, the data acquisition method, and the proposed signal processing method. Section 3 presents the experimental results and compares our method with other feature extraction methods, followed by the conclusions.

2 Methodology

2.1 Rotating Machinery

Figure 1 shows the manufactured rotating machine used in the experiments. Figure 2 shows the motor, coupler, shaft, vibration sensor, rotor discs, adjusting screw, and angle gauge. The shaft offset can be adjusted at the left and right ends of the shaft with the adjusting screw.

Fig. 1 Rotating machine

Fig. 2 Components of the rotating machine

Most studies using a rotating machine have examined the response of the motor load to cracks or wear of the bearings or to imbalance of the rotor disc [13, 18,19,20].

In this study, the core parts are in normal condition, and the aim is to detect defects that arise when the shaft axis is displaced during operation of the rotating machine. The experimental procedure is simple. After the shaft is set to the normal or abnormal state, as shown in Fig. 4, the vibration data from the acceleration sensor are recorded while the shaft and rotor disc are rotated by the motor. The collected data are processed by FFT to obtain the power spectrum, and the PCA algorithm is then used to reduce the dimensions. The pre-processed data are used to train an SVM for fault detection. Machine learning was conducted using the Python language on Windows OS. For the machine learning, we provided normal vibration data recorded while the rotor and shaft rotated normally.

The normal and abnormal conditions of the shaft alignment can be checked with the shaft alignment gauge shown in Fig. 3. The alignment of the shaft was adjusted by moving a screw to collect vibration data in the normal state (Fig. 4a) and in the abnormal state with 1 mm of misalignment (Fig. 4b).

Fig. 3 Shaft alignment gauge

Fig. 4 Normal and abnormal states

2.2 Data Acquisition

To collect data, a vibration sensor was installed at the end of the shaft between the rotor and the shaft axis, as shown in Fig. 5. The vibration signals for both normal and abnormal states were acquired as time series. The rotational speed of the motor was 8 revolutions per second (RPS), and the vibration data were collected from the sensor (MPU6050 gyro sensor) for 100 s with a sampling rate of 1 kHz.

Fig. 5 Vibration sensor installed on the shaft end

Figure 6 shows the vibration data of normal and abnormal states collected from the sensor. As shown in Fig. 6, it is difficult to distinguish the states with the naked eye.

Fig. 6 Normal and abnormal raw data signals

2.3 Signal Processing

The raw time-domain data obtained from the experiments could not be applied directly to the machine learning algorithm because the classification results showed no difference between the normal and abnormal conditions, even though negative acceleration peaks appeared more frequently in the abnormal condition, as shown in Fig. 6. However, machine learning based on pre-processed data gives a clear distinction between the normal and abnormal states.

This can be explained by the fact that time-domain vibration waveforms are composed of components with different amplitudes, phases, and frequencies. The time-domain data contain only the vibration amplitude over time, so PCA cannot extract distinguishing features from them. However, when the time series is converted to the frequency domain by a Fast Fourier Transform, the vibration power differs between the normal and abnormal conditions, so the power spectrum can be used as a pre-processing step from which PCA extracts information.

Pre-processing for signal analysis was performed on the raw data received from the vibration sensor before the data were passed to the machine learning algorithm. A flowchart of the process is shown in Fig. 7. First, vibration data are collected from the rotating machine. The raw time-domain data received from the sensor are converted to power spectrum values in the frequency domain (Fig. 8). Then, features are extracted through principal component analysis (PCA), and the machine learning algorithm is trained and tested using these values.

Fig. 7 Flowchart of fault diagnosis for a rotating machine

Fig. 8 Eight sets of data after classification of input data

In this study, the 100,000 time-domain data points (Fig. 6) were divided into 8 sets of 12,500 data points each. Each set was converted to power spectrum values in the frequency domain. Fourier analysis was used for the conversion from the time domain to the frequency domain [9, 10].
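The segmentation step can be sketched as follows. This is a minimal illustration, assuming the recording is held in a one-dimensional NumPy array and split into consecutive, non-overlapping blocks; the variable names and the random placeholder signal are hypothetical and stand in for the measured data.

```python
import numpy as np

fs = 1000          # sampling rate in Hz, as stated in the text
duration_s = 100   # recording length in seconds, as stated in the text
n_segments = 8     # number of sets, as stated in the text

# Placeholder signal (assumption): random values stand in for the sensor data.
rng = np.random.default_rng(0)
signal = rng.normal(size=fs * duration_s)

# Split the 100,000 samples into 8 consecutive blocks of 12,500 samples.
segments = signal.reshape(n_segments, -1)
print(segments.shape)  # (8, 12500)
```

Whether the original sets were consecutive blocks or arranged differently is not stated, so the reshape above is only one plausible reading.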

The collected data are a one-dimensional discrete vector represented by x(n), which contains N sampling points. The discrete Fourier transform X(k) of the discrete signal x(n) can be calculated using Eq. (1).

$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-i 2\pi k n / N}$$
(1)

The index k identifies a frequency component of the original signal x(n). X(k) is therefore the frequency-domain representation of the time series x(n); it is a complex number that can be decomposed into cosine and sine terms as in Eq. (2).

$$X(k) = \sum_{n=0}^{N-1} x(n) \left[ \cos\left( \frac{2\pi k n}{N} \right) - i \sin\left( \frac{2\pi k n}{N} \right) \right]$$
(2)

X(k) is a complex number, and Re(X(k)) and Im(X(k)) represent its real and imaginary parts, respectively. The magnitude at each frequency index k is given by Eq. (3).

$$|X(k)| = \sqrt{\mathrm{Re}\left( X(k) \right)^{2} + \mathrm{Im}\left( X(k) \right)^{2}}$$
(3)
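Equations (1) to (3) can be checked numerically with a short sketch. The code below evaluates the sum in Eq. (1) directly and the magnitude in Eq. (3) for a small test signal; the function names, the 16-point length, and the test sine wave are assumptions chosen for illustration, and a practical implementation would use an FFT routine rather than this O(N²) sum.

```python
import cmath
import math


def dft(x):
    """Direct evaluation of Eq. (1): X(k) = sum_n x(n) * exp(-i*2*pi*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


def magnitude(X):
    """Eq. (3): |X(k)| = sqrt(Re(X(k))^2 + Im(X(k))^2)."""
    return [math.sqrt(Xk.real ** 2 + Xk.imag ** 2) for Xk in X]


# Hypothetical test signal: a sine wave with exactly 3 cycles in N samples.
N = 16
x = [math.sin(2 * math.pi * 3 * n / N) for n in range(N)]

mags = magnitude(dft(x))
# Search only the first half of the spectrum (the second half mirrors it).
peak_bin = max(range(N // 2), key=lambda k: mags[k])
print(peak_bin)
```

For a unit-amplitude sine with an integer number of cycles, the magnitude at the matching bin is N/2, which gives a simple sanity check on the implementation.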

The spectral magnitude of each data set was calculated through a Fourier transform and expressed as power spectrum values as a function of frequency. Figure 9a and b show the power spectra of the data collected under normal and abnormal conditions.

Fig. 9 Power spectrum of data in Fig. 6

The vibration data were acquired in a laboratory where environmental noise was present. When the raw data are used to obtain power spectra, aliasing can occur. A low-pass filter was applied to remove the high-frequency noise and aliasing. Figure 10 shows the power spectra for the normal and abnormal states. The abnormal state shows higher power than the normal state. Since the sampling rate is 1 kHz, we conducted the FFT only with signals up to 500 Hz, in accordance with the Nyquist sampling criterion.
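The relation between the 1 kHz sampling rate and the 500 Hz upper limit can be illustrated with a short sketch. It assumes one 12,500-sample segment and a synthetic signal with a single 8 Hz component (matching the 8 RPS shaft speed) plus noise; the signal model and the power normalization are assumptions, and the low-pass filter mentioned above is not reproduced because its design is not specified.

```python
import numpy as np

fs = 1000        # sampling rate in Hz, as stated in the text
n = 12_500       # samples per segment, as stated in the text

# Synthetic stand-in for one segment (assumption): 8 Hz tone plus weak noise.
rng = np.random.default_rng(0)
t = np.arange(n) / fs
signal = np.sin(2 * np.pi * 8.0 * t) + 0.1 * rng.normal(size=n)

# One-sided FFT of a real signal: bins run from 0 Hz to the Nyquist frequency.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(n, d=1 / fs)

# Power per bin; dividing by n is one common normalization (assumption).
power = np.abs(spectrum) ** 2 / n

# Skip the 0 Hz bin when locating the dominant component.
peak_freq = freqs[np.argmax(power[1:]) + 1]
print(freqs[-1], freqs[1], peak_freq)
```

With these settings the frequency resolution is fs/n = 0.08 Hz and the highest bin sits at fs/2 = 500 Hz.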

Fig. 10 Power spectra using the low pass filtered signal with 500 Hz cutoff frequency

3 Results and Discussion

PCA was used to extract features from the power spectrum values. PCA is a dimensionality reduction method that projects correlated data onto the directions of greatest variance in a dataset. It is often used to summarize the characteristics of data.

As shown in Fig. 11, the power values, represented by 12,500 eight-dimensional vectors (a1, a2, a3,…, a8), are reduced to three-dimensional vectors by extracting features according to their correlations. The three-dimensional data obtained with PCA were then used as input to the SVM. SVM is a powerful supervised machine learning model that is widely used for automatic defect detection and pattern recognition. Most supervised learning algorithms use all of the training data to define the model. However, SVM defines its decision boundary using only the support vectors selected from the data points, so many of the remaining data points can be ignored.
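The reduction from eight to three dimensions can be sketched with a small PCA routine built on the singular value decomposition. This is an illustrative implementation under stated assumptions, not the code used in the study: the helper `pca_reduce` is hypothetical, and the 12,500 × 8 input is random placeholder data rather than measured power values.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    scores = X_centered @ Vt[:n_components].T
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return scores, explained[:n_components]


# Placeholder data (assumption): 12,500 eight-dimensional vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(12_500, 8))
X[:, 1] += 2.0 * X[:, 0]  # add correlation so one direction dominates

scores, explained = pca_reduce(X, n_components=3)
print(scores.shape)  # (12500, 3)
```

The `explained` array gives the fraction of total variance carried by each retained component, which is a common way to judge whether three components are enough.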

Fig. 11 Reduction of data dimension using PCA method

Figure 12 shows the result of reducing the raw data to two dimensions using PCA without pre-processing of the sensor data. The abnormal and normal states did not separate into two clusters. We then applied the SVM algorithm to the data and compared the accuracy. The data were divided into training and testing sets at a ratio of 7:3. Of the total of 12,500 data points, 8,750 were used for training and 3,750 for testing, as shown in Fig. 13.
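The 7:3 split and the SVM training step can be sketched with scikit-learn. This is a minimal example under several assumptions: the three-dimensional features are synthetic Gaussian clusters rather than PCA outputs from measured data, each state is assumed to contribute 12,500 points, and the RBF kernel with default parameters is a guess, since the kernel and hyperparameters are not stated in the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic 3-D features (assumption): two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
n_per_class = 12_500
normal = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 3))
abnormal = rng.normal(loc=4.0, scale=1.0, size=(n_per_class, 3))

X = np.vstack([normal, abnormal])
y = np.concatenate([
    np.zeros(n_per_class, dtype=int),   # label 0: normal state
    np.ones(n_per_class, dtype=int),    # label 1: abnormal state
])

# 7:3 split, stratified so each state keeps 8,750 training and 3,750 test points.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = SVC(kernel="rbf")  # kernel choice is an assumption
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(len(X_train), len(X_test))
```

The fitted classifier keeps only its support vectors (`clf.support_vectors_`), which for well-separated clusters is a small fraction of the training set.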

Fig. 12 PCA analysis using time domain

Fig. 13 Division of training data and testing data

Figure 14 shows the confusion matrix obtained when the SVM algorithm was applied without pre-processing. Of the 3,750 cases, 3,749 were predicted as normal. Only 2 abnormal cases were predicted as abnormal, and the accuracy of defect prediction was very low at 49.79%. The reason is that the pre-processing needed to find the distinguishing characteristics of the normal and abnormal data was not performed, as shown in Fig. 12. Therefore, the two states cannot be distinguished even if a machine learning algorithm is used.

Fig. 14 SVM result from PCA using raw data

For a fair comparison, pre-processing was applied to the same data as used in Fig. 14, and the SVM algorithm was then applied. With pre-processing, the features extracted by PCA separate into clusters, as shown in Fig. 15. Among the data used in the test, 3,722 cases were predicted as normal and 3,688 cases were predicted as abnormal. The average accuracy was 98.80%, which is higher than that obtained without pre-processing, as shown in Fig. 16.
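The reported accuracy can be reproduced from the two counts under one assumption: that both counts are correct predictions and that each state contributes 3,750 test cases, for 7,500 in total. The helper name and this reading of the confusion matrix are assumptions, not statements from the text.

```python
def accuracy(correct_normal, correct_abnormal, total_test_cases):
    """Fraction of test cases assigned to their true class."""
    return (correct_normal + correct_abnormal) / total_test_cases


# Assumption: 3,750 test cases per state, so 7,500 test cases overall.
test_cases_per_class = 3750
acc = accuracy(3722, 3688, 2 * test_cases_per_class)
print(f"{acc:.2%}")  # 98.80%
```

Under this reading, 7,410 of 7,500 test cases are classified correctly, which matches the stated 98.80%.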

Fig. 15 PCA using the pre-processing data

Fig. 16 SVM result from PCA using pre-processed data

Table 1 compares the classification accuracy of other methods with that of the method used in this study. It lists results of research on various defects in rotating machinery [11,12,13]. No study used the same method, so direct comparison is difficult, but our method showed high accuracy. It was also confirmed that the proposed method has higher accuracy than the method that applies PCA to the amplitude values in the time domain.

Table 1 Comparison of various defect diagnostic methods

4 Conclusions

Using a vibration sensor, the accuracy of defect determination for a shaft axis was evaluated in normal and abnormal states. Without pre-processing, it was difficult to classify the defect data even with an artificial intelligence algorithm, because the raw time-domain data did not provide features that distinguish the two states. When the PCA technique was applied to power spectrum data in the frequency domain to reduce the dimensions and extract features, the data could be separated to some extent with the naked eye. When the machine learning algorithm was used, more than 98% of defects were classified accurately.

In the future, research using a wider range of algorithms, including deep learning and other machine learning algorithms, will be needed. Further research will also be necessary on pre-processing and feature extraction techniques other than PCA. With these advances, the methods could be used in a fault diagnosis system in a real factory.