1 Introduction

The aim of condition monitoring is to detect failures in rotating machines before critical damage occurs. This kind of maintenance has many advantages, because it makes it unnecessary to dismount a machine to check the status of its elements. Moreover, the probability of detecting a failure before it becomes critical increases, which avoids losses and makes operations safer. For these reasons, the automation of fault diagnosis in industrial processes has been the aim of many researchers in recent decades.

Specifically, rolling bearings are among the most widely used elements in rotating machinery, and their failure is one of the foremost causes of breakdowns in this kind of machine. Bearings are fundamental elements of the support subsystem and carry a large part of the static and dynamic loads, which is why they have a high risk of failure. Most research on bearing fault diagnosis relies on vibration signals, because they contain valuable information about failures [2, 3]; however, Acoustic Emission (AE) has also been used to diagnose bearings accurately, as in [4].

Based on the use of this kind of signal, most authors classify bearing diagnosis techniques into three approaches: time domain analysis based on statistical parameters [5], frequency domain analysis [6], and time–frequency analysis such as the Wavelet Transform (WT) [1, 7] and the Hilbert-Huang Transform (HHT) [8].

Diagnosis based on time domain statistical parameters has shown low effectiveness when applied to incipient faults or when the system is subjected to low loads, as pointed out in [9]. For this reason, time domain statistical parameters are rarely used as the only means of feature extraction.

Frequency domain analysis is the most classical approach to detecting failures in rotating machinery, and envelope analysis in particular is the most popular fault diagnosis method for rolling bearings. Envelope analysis exploits the second-order cyclostationarity (CS2) that appears when bearing defects exist [10]. However, this classical tool is seriously affected by noise, especially at early fault stages. Some studies have been carried out to improve the results of this method, for example [11]. In other cases, envelope analysis has been combined with other techniques such as the Wavelet Transform (WT) to solve this problem, as in [12]. Another tool commonly applied to examine the frequency content of signals is Empirical Mode Decomposition (EMD), which is used to obtain Intrinsic Mode Functions (IMFs), as shown in [13].

The Hilbert-Huang Transform (HHT) is a time–frequency analysis technique based on EMD. The HHT has been reported to offer high reliability, as in [14].

Like the HHT, the Wavelet Transform (WT) provides information in both the time and frequency domains, which makes it suitable for both stationary and non-stationary signals. The WT also provides a multiresolution analysis, so it is especially useful for the diagnosis of defects [15]. For this purpose, the WT has been widely used not only for bearings, but also for general rotating machinery [16], gears [17], shafts [18], and structural elements [19].

However, using the WT is a complex task because of the wide range of critical parameters that must be chosen, such as the mother wavelet and the decomposition level. In addition, until recently the WT had another, more serious disadvantage: Multiresolution Analysis (MRA) cannot decompose the high frequency bands. The Wavelet Packets Transform (WPT) constitutes an improvement on MRA [20], because it can decompose all the frequency bands. Consequently, applications of the WPT are growing rapidly, and it is nowadays the most widely used signal processing technique in many fields, such as speech recognition [21], denoising [22], and the processing of electrocardiograms [23], among others.

WPT coefficients can be used directly as features, as they contain reliable information about failures [24]. However, other information derived from the WPT coefficients can also be used as features, as demonstrated in [25], where statistical parameters are calculated, and in [26], where the energy of the WPT is successfully used as a crack indicator.

In a diagnosis procedure, an intelligent classification system is also needed after feature extraction. Many intelligent classification systems have been developed and used for monitoring, such as fuzzy classifiers [27, 28], genetic algorithms [18], and, most commonly, Support Vector Machines [29, 30] and Neural Networks (NN) [31].

In [1], an algorithm was developed to diagnose four conditions of ball bearings: healthy, inner race fault, outer race fault, and ball fault. The data were obtained at three different rotation speeds: 10, 20 and 30 Hz. The algorithm was based on the MRA coefficients, after selecting the optimal frequency band (the one in which the coefficients showed the largest differences between the healthy bearing and the faulty conditions). These coefficients were used to train a Multilayer Perceptron (MLP) NN. With this methodology, high success rates were achieved, with no false alarms and a reasonable distinction between healthy and faulty bearings at 20 and 30 Hz. However, the resulting MLPs had difficulty distinguishing between different kinds of fault.

The aim of this work is to improve on the results obtained in [1], working with the same data. The energy of the WPT coefficients is used to feed, in this case, a Radial Basis Function (RBF) neural network.

2 Experimental Setup

The vibration signals were obtained from a rig developed by the UNED mechanical department. FAG 7206 B single ball bearings were tested at three different rotation speeds set to 10, 20 and 30 Hz, and controlled by a photo tachometer. The rig is shown in Fig. 1.

Fig. 1 Test bench used for the measurements. UNED lab

In Fig. 1, the first elements visible, starting from the right-hand side, are the axial and radial pneumatic cylinders, which apply loads of 2.5 and 3 bar, respectively. Next, the bearing assembly can be seen. A transmission pulley is connected to the motor by a V-belt.

The measurement chain is composed of a B&K 4383 accelerometer, a B&K NEXUS amplifier and a Keithley DAS-1200 acquisition card. The sampling rate was set to 5,000 Hz, and all the acquired signals had 5,120 points.

The tests were first carried out with healthy bearings. Several faults were then induced in the bearings, namely inner race, outer race, and ball faults. A 2 mm long pit was artificially produced in the inner or outer race with an electric pen. In the case of the rolling ball, multiple slots were made in the surface to simulate the flaking phenomenon.

In total, 588 signals were obtained: 196 for each of the three rotation speeds, with 49 for each of the four bearing conditions.

3 Wavelet Packets Transform

The Wavelet Transform (WT) is especially efficient for the local analysis of non-stationary signals. It computes correlation coefficients between a signal and a selected mother wavelet function. When the WT is applied in discrete form, called the Discrete Wavelet Transform (DWT), the signal is decomposed into approximation and detail information by recursive low-pass and high-pass filters. The WPT consists of applying the DWT recursively to both the approximation and the detail branches, down to a selected decomposition level, according to the scheme shown in Fig. 2.

Fig. 2 WPT analysis: decomposition into approximation and detail information through low-pass and high-pass filters, down to decomposition level 3

In Fig. 2, \( W(k,j) \) represents the coefficients of the signal in each packet, \( k \) is the decomposition level and \( j \) is the position of the packet within the decomposition level. Each coefficient vector \( W(k,j) \) has the structure given in Eq. (1):

$$ W(k,j) = \{ w_{1} (k,j), \ldots ,w_{N} (k,j)\} = \{ w_{i} (k,j)\} $$
(1)

where \( i \) is the position of the coefficient within its packet.
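
The recursive decomposition described above can be sketched in a few lines of code. This is only an illustrative sketch under stated assumptions: it uses the Haar filter pair rather than the Daubechies wavelet used in this work (Haar keeps the example short and dependency-free), the function names `haar_step` and `wavelet_packets` are hypothetical, the packet index \( j \) is taken to start at 0, and the input is a synthetic sine wave rather than a measured signal.

```python
import math

def haar_step(x):
    """One DWT step: split x into approximation (low-pass) and detail (high-pass)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[2 * n] + x[2 * n + 1]) * s for n in range(len(x) // 2)]
    detail = [(x[2 * n] - x[2 * n + 1]) * s for n in range(len(x) // 2)]
    return approx, detail

def wavelet_packets(signal, level):
    """Return the 2**level packets W(level, j), j = 0 .. 2**level - 1.

    Both branches are split again at every level (unlike MRA, which only
    splits the approximation). Packets come out in natural filter-bank
    order, which is not the same as increasing-frequency order.
    The signal length is assumed to be divisible by 2**level.
    """
    packets = [list(signal)]
    for _ in range(level):
        next_packets = []
        for p in packets:
            approx, detail = haar_step(p)
            next_packets.extend([approx, detail])
        packets = next_packets
    return packets

# Synthetic signal with the record length and sampling rate quoted in Sect. 2.
signal = [math.sin(2 * math.pi * 50 * n / 5000) for n in range(5120)]
packets = wavelet_packets(signal, level=3)
print(len(packets), len(packets[0]))  # 8 640
```

With a level-3 decomposition, each of the 8 packets holds \( N = 640 \) coefficients for a 5,120-point record. A wavelet library would normally be used in practice to obtain the DB6 packets.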

3.1 Energy of the WPT Coefficients for Feature Extraction

The concept of energy used in WPT analysis is closely related to that of Fourier theory [24]. The energy of each packet is obtained as the sum of the squares of its coefficients, according to Eq. (2):

$$ E_{k,j} = \sum\limits_{i} {\{ w_{i} (k,j)\}^{2} } $$
(2)

The relative energy, as a normalized parameter proposed in [26], is calculated as shown in Eq. (3):

$$ E_{k,j}^{rel} = \frac{E_{k,j}}{E_{t}} $$
(3)

where \( E_{t} \) is the total energy of the signal, calculated as the sum of all the energies of the packets.
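
Equations (2) and (3) translate directly into code. The following is a minimal sketch; the function names `packet_energies` and `relative_energies` are hypothetical, and the small `example_packets` list contains made-up coefficients chosen only so that the arithmetic is easy to check by hand.

```python
def packet_energies(packets):
    """Eq. (2): energy of each packet as the sum of its squared coefficients."""
    return [sum(w * w for w in packet) for packet in packets]

def relative_energies(packets):
    """Eq. (3): each packet energy divided by the total energy E_t."""
    energies = packet_energies(packets)
    total = sum(energies)  # E_t, the sum of all packet energies
    if total == 0.0:
        raise ValueError("signal has zero energy; relative energy is undefined")
    return [e / total for e in energies]

# Made-up coefficients, for illustration only (not measured data).
example_packets = [[1.0, -1.0], [2.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
rel = relative_energies(example_packets)
print(rel)  # [0.25, 0.5, 0.0, 0.25]
```

Because of the normalization by \( E_{t} \), the relative energies always sum to 1, which makes feature vectors from signals of different amplitude comparable.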

3.2 Feature Extraction

Using the definition of packet energy described above, the transformations are carried out. The mother wavelet used is Daubechies 6 (DB6), because its effectiveness in this area has already been proved in previous related works [1, 15, 16].

The decomposition level was set to 3, because this value gave the best classification results. The extracted patterns are therefore vectors of 8 elements, which seems to be a suitable number of inputs for the NN. The decomposition level determines the frequency resolution offered by each packet, which in this case is 312.5 Hz (the 2,500 Hz band below the Nyquist frequency divided among 8 packets).
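
The 312.5 Hz figure follows from the sampling rate and the decomposition level, and can be checked with a short calculation. The sketch below assumes ideal (non-overlapping) filters and lists the bands in increasing-frequency order; the function names are hypothetical.

```python
def packet_bandwidth(sampling_rate_hz, level):
    """Width of each packet band: the Nyquist band split into 2**level parts."""
    nyquist_hz = sampling_rate_hz / 2.0
    return nyquist_hz / (2 ** level)

def packet_bands(sampling_rate_hz, level):
    """Nominal (low, high) frequency limits of each packet, in Hz."""
    width = packet_bandwidth(sampling_rate_hz, level)
    return [(j * width, (j + 1) * width) for j in range(2 ** level)]

print(packet_bandwidth(5000, 3))  # 312.5
bands = packet_bands(5000, 3)
print(bands[0], bands[-1])        # (0.0, 312.5) (2187.5, 2500.0)
```

Raising the decomposition level by one would halve the bandwidth of each packet but double the number of NN inputs.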

At each rotation speed, the features of all the bearing conditions are extracted. An example of the results obtained is shown in Fig. 3.

Fig. 3 WPT relative energies (%) at decomposition level 3 with mother wavelet DB6. a Healthy bearing. b Inner race fault bearing. c Outer race fault bearing. d Ball fault bearing

4 Classification System

The NN architecture used as the intelligent classification system is the Radial Basis Function (RBF) network, because it has given better results than the MLP and the Probabilistic Neural Network (PNN) in previous related works [15]. RBF networks consist of three layers of neurons: an input layer, a hidden layer and an output layer.

The RBF architecture has many advantages, such as fast training and easy optimization. This is due to the small number of design parameters that must be chosen by the designer, the most critical being the number of neurons in the hidden layer and the activation function.

A critical parameter of the activation function is the spread. The spread is a constant that sets the critical distance between the input vector and the weight vector of a hidden neuron: once this distance is reached, the output of the neuron falls below a given threshold.
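
The role of the spread can be illustrated with a Gaussian activation function. The sketch below is an assumption rather than the exact function used in this work: it adopts a convention found in some neural network toolboxes, in which the Gaussian is scaled so that the output is exactly 0.5 when the distance between input and weight vector equals the spread. The function name `rbf_output` and the 0.5 threshold are therefore illustrative choices.

```python
import math

def rbf_output(x, weights, spread):
    """Gaussian radial basis activation of one hidden neuron.

    Assumed convention: the bias b = sqrt(ln 2) / spread makes the output
    equal to 0.5 when the Euclidean distance ||x - weights|| equals spread.
    """
    distance = math.dist(x, weights)
    bias = math.sqrt(math.log(2.0)) / spread
    return math.exp(-((bias * distance) ** 2))

center = [0.0, 0.0]
print(rbf_output([0.0, 0.0], center, spread=1.0))  # 1.0 (input equals the weight vector)
print(rbf_output([1.0, 0.0], center, spread=1.0))  # close to 0.5 (distance equals spread)
print(rbf_output([2.0, 0.0], center, spread=1.0))  # close to 0.0625 (beyond the spread)
```

A larger spread makes each neuron respond to a wider region of the input space, so fewer neurons may be needed, at the risk of a less selective response.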

The RBF parameters are optimized by examining the number of neurons in the hidden layer and the success rate as functions of the spread. The spread value chosen is the one that minimizes the number of neurons in the hidden layer (to reduce the computational cost) while maximizing the success rate.
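
One possible way to express this selection rule in code is sketched below. The text does not state how the two criteria are weighted, so the sketch assumes that the success rate has priority and that the number of hidden neurons is used to break ties. The function name `choose_spread` is hypothetical, and the candidate values are made-up placeholders, not results from this work.

```python
def choose_spread(candidates):
    """Pick the best (spread, hidden_neurons, success_rate) tuple.

    Assumed priority: highest success rate first, then fewest hidden neurons.
    """
    return max(candidates, key=lambda c: (c[2], -c[1]))

# Placeholder values for illustration only (not measured results).
candidates = [
    (0.5, 300, 0.80),
    (1.0, 120, 0.90),
    (2.0, 150, 0.90),
]
best = choose_spread(candidates)
print(best)  # (1.0, 120, 0.9)
```

Each candidate tuple would come from training one network with the given spread and recording the resulting hidden layer size and success rate.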

5 Results and Discussion

After training several NNs to optimize the design parameters, three RBF networks were chosen, one for each rotation speed. Each NN is fed with 49 feature vectors of length 8 for each condition at the corresponding speed. Each NN has 4 outputs, one for each condition. The characteristics of the training runs are presented in Table 1.

Table 1 Design parameters of the RBF at the three rotation speeds

During the training process, the algorithm updates the weight vectors between layers until the sum squared error (SSE) falls below an error goal (set to 0.2) or the maximum number of neurons in the hidden layer (700) is reached.

The success rates obtained at each speed are presented in Table 2, together with the best results reported in [1] for comparison.

Table 2 Comparison of the success rates at different speeds with the obtained results in previous related work with the same data

As can be observed, the success rates have increased in the present work. The previous work used the coefficients of a narrow frequency band as features, whereas this work uses the energies of the whole signal (specifically, of the 8 packets generated at decomposition level 3). The number of inputs has been reduced from 18 coefficients to 8 energy values; however, the information covers a wider frequency band. The improvement can be attributed to other effects that occur in the machine when a fault appears and that a narrower band cannot detect.

The improvement in the success rates can also be attributed to the use of the WPT in place of MRA, and to the use of the RBF architecture in place of the MLP.

From Table 2, it can be stated that the rotation speed of 30 Hz gives the best results both in terms of success rate (92.58 %) and of computational cost, since it requires fewer neurons in the hidden layer than the speeds of 10 and 20 Hz.

In Fig. 4 the partial results of the classification are shown.

Fig. 4 Performance (%) of the three different NNs. a Healthy bearing classification. b Inner race fault classification. c Outer race fault classification. d Ball fault classification

The increase in the success rates in this work allows considerably better discrimination between the kinds of fault than in the previous work [1], especially at 30 Hz. Regarding the discrimination between healthy and faulty bearings, the results are also better at 30 Hz, where there are no false alarms. The probability of failing to detect a faulty bearing has been significantly reduced at this speed.

6 Conclusions

Based on the data obtained in the previous related work [1], a new analysis has been carried out to diagnose early faults at three different locations in ball bearings. The methodology has been changed: instead of using the coefficients of a specific frequency band obtained with MRA, the WPT relative energies of the whole signal have been used. In addition, the NN architecture is the RBF instead of the MLP.

The results obtained in the previous work have been improved. The success rates in distinguishing healthy from faulty bearings have increased; however, the greatest improvements have been achieved in discriminating between different kinds of fault, where the previous methodology had a serious weakness. In the present work, the RBF networks can detect the location of the fault with high accuracy at the three speeds.

It can be stated that both the diagnosis and the computational cost of the NN improve as the rotation speed increases, so the best solution found in this work is the NN trained with the data obtained at 30 Hz.