Introduction

With the rapid development of modern industry, there is an increasing demand for higher safety and reliability of intelligent and integrated mechanical systems. Prognostics and health management (PHM) technology, as a promising approach to meet these demands, has received increasing research attention in recent years [1]. As a fundamental support component in rotating machines, the rolling bearing directly affects the reliability of equipment; its failure may result in severe damage, economic loss and threats to human safety. Therefore, reliable fault diagnosis and predictive maintenance of bearings are both meaningful and practical. Machine learning and deep learning [2, 3], as typical data-driven methods, have been established for effective intelligent fault diagnosis and are attracting more and more attention from both academia and industry. Among the various deep learning networks, the CNN has been the most widely used owing to its powerful ability in feature extraction and nonlinear mapping, and many satisfying results have been achieved [4, 5]. However, most published studies focus only on performance improvement [6, 7]; very few works study the impacts of hyperparameters, feature extraction or structure modification. Studying these impacts is the first task of this research.

To date, most published works on CNN-based fault diagnosis rest on the prerequisite that the training and test datasets come from the same distribution, for example, from the same bearing test bench under the same working condition [8, 9]. If the test data distribution does not appear in the training phase, the CNN's performance on the test dataset is usually sharply reduced [10]. However, in practical industrial applications, a CNN inevitably has to deal with measurement data that has never appeared in the training process [11]. For example, a bearing usually works under various conditions and possibly even under a transient cycle [12, 13], while only limited data under certain conditions are available for training. Therefore, how to guarantee the CNN's performance when it is trained with data from one condition but tested under different conditions has become a hurdle. This defines the second motivation for this research.

To address the two research gaps mentioned above, LeNet-5 is chosen as the CNN benchmark in this study. The influence of the hyperparameters, the input and the fully connected layer on the performance of the CNN is discussed, and the optimal modifications from these three aspects are identified. Firstly, particle swarm optimization (PSO) is applied to optimize the hyperparameters of the CNN. Secondly, different features from both the time and frequency domains are extracted and fed into the CNN instead of the commonly used original signal. Thirdly, the fully connected layer is replaced by machine learning methods to further improve the CNN's accuracy. An overall evaluation method is proposed to determine the best modification in terms of classification accuracy, stability, robustness to noise and computing efficiency, and the optimized CNN is compared with the traditional one. With respect to working condition adaptation, three different methods, namely multiple convolutional layers, data augmentation and signal concatenation, are proposed to enhance the CNN's performance. These three approaches are explored individually as well as collaboratively, and their performances are compared and discussed in detail. The experimental data from the Case Western Reserve University (CWRU) is used as the data source. Validation results confirm the effectiveness of the proposed methods.

The remainder of this paper is organized as follows. “Test bench” describes the test bench and the data processing. “Methodology of modification” presents the methodology of PSO, feature extraction and modification of the fully connected layer, together with four indicators for an overall evaluation. “Case study and results analysis” validates the modified CNN with two designed cases: in case 1, the training and test datasets are collected from the same working condition, while in case 2, the CNN is verified with test datasets that differ from the training one; this section also introduces the proposed methods for working condition adaptation. “Conclusion” concludes the whole paper.

Test Bench

Experimental Setup

The bearing experimental data used in this study is taken from the CWRU bearing data center [14]. The test bench shown in Fig. 1 comprises a motor, a torque transducer/encoder, a dynamometer, control electronics and the test bearings that support the motor shaft. Accelerometers are attached to the housing to collect the vibration data. Besides the normal condition, there are three failure types (ball fault, inner race fault and outer race fault). Each failure type has three different fault diameters (0.007 in., 0.014 in. and 0.021 in.) and four different load states [0 HP (horsepower), 1 HP, 2 HP and 3 HP]. In total, there are 10 different bearing conditions. The drive end fault data with a sampling frequency of 12 kHz is used to validate the proposed CNN described in the following text.

Fig. 1
figure 1

CWRU bearing test bench [14]

Data Processing

The whole dataset is first rescaled into the range of [− 1, 1] and then split into small frames, each taken as a sample. Due to the limited data provided, the dataset is divided with overlap [15]. Suppose the total length of the original signal is L, the length of each frame is l and the shift between two consecutive frames is \(\tau\); the number of data frames n can then be computed as follows:

$$\begin{aligned} n=\textrm{floor}\left( \frac{L-l}{\tau }\right) +1. \end{aligned}$$
(1)

In this work, l and \(\tau\) are set to 4096 and 500, respectively, while L differs from dataset to dataset. The approach is illustrated in Fig. 2. Subsequently, all the samples are split into training and test datasets with a ratio of 7:3. The size of the training and test datasets under each label is summarized in Table 1.
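As a minimal illustration of this segmentation, the following sketch (the function name and the 120,000-point record length are ours, not from the paper) frames a signal with l = 4096 and \(\tau\) = 500:

```python
import numpy as np

def segment_with_overlap(signal, frame_len=4096, shift=500):
    """Split a 1-D signal into overlapping frames; n = floor((L - l) / tau) + 1, Eq. (1)."""
    n = (len(signal) - frame_len) // shift + 1
    return np.stack([signal[i * shift : i * shift + frame_len] for i in range(n)])

# Hypothetical 120,000-point record -> (232, 4096) matrix of samples
frames = segment_with_overlap(np.random.randn(120_000))
print(frames.shape)   # (232, 4096)
```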

Fig. 2
figure 2

Data segmentation with overlap

Table 1 Bearing data description

Methodology of Modification

As outlined in Fig. 3, the CNN is modified from three directions. Firstly, eight hyperparameters are optimized by PSO within predefined discrete value sets. Then, three kinds of input (original signal, envelope spectrum and feature vector) are fed into the CNN individually, and their performances are compared. Finally, the default fully connected layer in the CNN is replaced by machine learning methods that have a stronger classification ability. The optimal modification in each step is identified, and the modifications are then combined to build an optimal CNN structure, which is further verified with two cases of bearing fault diagnosis. The details are described in the following subsections.

Fig. 3
figure 3

Improvements of original LeNet-5 from three approaches

Introduction of LeNet-5

LeNet-5 is one of the most frequently used CNN structures and has achieved great success in automatic handwritten digit classification [16]. It consists of two sets of convolutional and pooling layers, one flattening convolutional layer, two fully connected layers and one softmax classifier [17]. As shown in Fig. 4, the LeNet-5 structure is slightly modified and adopted as the representative CNN in this study.

Fig. 4
figure 4

LeNet-5 structure
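For reference, a 1-D variant of LeNet-5 of the kind described above can be sketched in Keras as follows. The filter counts and kernel sizes shown are the classical LeNet-5 values; the values actually used in this study are tuned by PSO (Table 7), so this sketch is illustrative only:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_lenet5_1d(input_len=4096, n_classes=10):
    """1-D LeNet-5 variant: two conv/pool pairs, flatten, two dense layers, softmax."""
    return keras.Sequential([
        layers.Input(shape=(input_len, 1)),
        layers.Conv1D(6, kernel_size=5, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(16, kernel_size=5, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Flatten(name="flatten"),
        layers.Dense(120, activation="relu"),
        layers.Dense(84, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_lenet5_1d()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```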

Particle Swarm Optimization

Besides the network structure, the hyperparameters have a strong influence on the CNN's performance. Therefore, PSO is used to find the best set of hyperparameters, such as the dropout rate, learning rate, kernel size, number of filters, batch size and size of the dense layer. In the PSO algorithm, N individual particles search within a D-dimensional space for the best solution to a fitness function. Each particle has its own velocity and position, which are updated after each iteration. The update of the velocity and position of each particle i at the (k + 1)th iteration is influenced by the global best position of all particles \(g^{\mathrm{best}}\) as well as the best position of each particle \(p^{\mathrm{best}}\), as described by the following equations [18]:

$$\begin{aligned} v_{ij}(k+1)=\omega v_{ij}(k) + c_1r_1\left[ p_{ij}^{\mathrm{best}}(k)-x_{ij}(k) \right] +c_2r_2\left[ g_j^{\mathrm{best}}(k)-x_{ij}(k)\right] , \end{aligned}$$
(2)
$$\begin{aligned} x_{ij}(k+1)=x_{ij}(k)+v_{ij}(k+1), \end{aligned}$$
(3)

where k denotes the iteration index, \(v_{ij}\) the velocity of the ith particle in the jth dimension, and \(x_{ij}\) the position of the ith particle in the jth dimension. \(p^{\mathrm{best}}\) is an \(N \times D\) matrix, whose entry \(p^{\mathrm{best}}_{ij}\) is the best position of the ith particle in the jth dimension. \(g^{\mathrm{best}}\) is a D-dimensional vector, whose entry \(g^{\mathrm{best}}_{j}\) is the global best position of all particles in the jth dimension. \(\omega\) is the inertia coefficient, \(c_1\) and \(c_2\) are the learning factors, and \(r_1\) and \(r_2\) are random numbers drawn from the uniform distribution on [0, 1] [19].

Eight hyperparameters of the CNN are optimized by the PSO algorithm. The fitness function is set as the accuracy on the test set, and the target is to maximize it. To achieve a trade-off between accuracy and optimization efficiency, the search space of each hyperparameter is restricted to a much smaller but reasonable range based on both theory and experience [20, 21], as shown in Table 2. The PSO parameters are summarized in Table 3.

Table 2 Hyperparameter range of CNN
Table 3 Parameters of PSO
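A compact sketch of such a discrete PSO loop, directly implementing Eqs. (2) and (3), is given below. The particle count, iteration budget and coefficient values are placeholders; the values used in this study are those of Table 3, and `fitness` would train the CNN on the decoded hyperparameters and return its test accuracy:

```python
import numpy as np

def pso(fitness, value_sets, n_particles=10, n_iter=30, w=0.8, c1=2.0, c2=2.0):
    """Minimal PSO over discrete hyperparameter value sets.
    Positions are continuous indices, snapped to the nearest allowed value
    when evaluating the fitness (the test accuracy, to be maximized)."""
    D = len(value_sets)
    rng = np.random.default_rng(0)
    x = rng.uniform(0, [len(s) for s in value_sets], size=(n_particles, D))
    v = np.zeros_like(x)

    def decode(pos):
        # Map a continuous position to one concrete value per hyperparameter
        return [s[int(np.clip(round(p), 0, len(s) - 1))] for p, s in zip(pos, value_sets)]

    p_best = x.copy()
    p_val = np.array([fitness(decode(xi)) for xi in x])
    g_best = p_best[p_val.argmax()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, D))
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)   # Eq. (2)
        x = x + v                                                     # Eq. (3)
        vals = np.array([fitness(decode(xi)) for xi in x])
        improved = vals > p_val
        p_best[improved], p_val[improved] = x[improved], vals[improved]
        g_best = p_best[p_val.argmax()].copy()
    return decode(g_best), p_val.max()
```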

Feature Extraction

Although the CNN has a strong ability to extract hidden features from data [22], many researchers still prefer to process the input before feeding it into the CNN, enabling the network to achieve higher performance [23,24,25]. For comparison with the original input, two kinds of extracted features are proposed: the first is the envelope spectrum, and the second is a feature vector consisting of 157 features extracted from the time and frequency domains. The detailed process is presented hereafter.

Envelope Spectrum

Concerning frequency-domain features for fault diagnosis, the envelope spectrum has been verified to be highly effective [26]. Two steps are necessary to obtain it from the input signal: first the signal envelope is obtained, and then the frequency transformation is conducted. The Hilbert transform is adopted to capture the signal envelope. To this end, the input is first decomposed by empirical mode decomposition (EMD) into a finite set of components, namely the intrinsic mode functions (IMFs), which form a complete and orthogonal basis of the initial signal [27, 28]. Once they are obtained, the analytic signal \(z_i(t)\) of the ith IMF component is acquired [27]:

$$\begin{aligned} z_i(t)=c_i(t)+jH \left[ c_i(t)\right] , \end{aligned}$$
(4)

where \(c_i(t)\) is the ith component of IMF and \(H\left[ c_i(t)\right]\) is the Hilbert transform:

$$\begin{aligned} H\left[ c_i(t)\right] =\frac{1}{\pi }\int _{-\infty }^{+\infty }\frac{c_i(\tau )}{t-\tau }\,\textrm{d}\tau . \end{aligned}$$
(5)

After that, the envelope \(a_i(t)\) of the signal can be obtained by computing the absolute value of \(z_i(t)\) [27]:

$$\begin{aligned} a_i(t)=\sqrt{c_i^2(t)+H^2(c_i(t))}. \end{aligned}$$
(6)

Besides the envelope calculation, the EMD is also used for denoising. Measurement noise mainly resides in the high-frequency band, while the fault characteristic spectrum in the low-frequency band is sufficient for diagnosis. Therefore, the first component IMF1 is discarded, and the remaining components are summed up to achieve noise reduction. Figure 5 shows the IMFs of the vibration signal measured from a bearing with an inner race fault.

Fig. 5
figure 5

IMFs of vibration signal of a bearing with inner race fault

Fig. 6
figure 6

a Frequency spectrum of original signal, b frequency spectrum of envelope signal

Once the high-frequency noise has been filtered out and the signal envelope has been captured, the Fourier transform is applied to calculate the spectrum. Figure 6 shows the Fourier transform results: (a) for the original vibration signal without any processing and (b) for the envelope signal obtained by the EMD and Hilbert transform. It can be seen that the signal after the EMD and Hilbert transform contains fewer high-frequency components. This indicates that the high-frequency components have been significantly reduced by the EMD, while the characteristic frequencies of faulty bearings are kept. The spectrum in Fig. 6b is the so-called envelope spectrum and will be used as one kind of input.
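The whole pipeline, EMD denoising, Hilbert envelope and Fourier transform, can be sketched as follows. Note one simplification: the envelope is computed here on the reconstructed (denoised) signal rather than per IMF as in Eqs. (4)–(6); the PyEMD package is one possible EMD implementation, assumed for illustration:

```python
import numpy as np
from scipy.signal import hilbert
from PyEMD import EMD   # the PyEMD package ("EMD-signal") is assumed for illustration

def envelope_spectrum(x, fs=12_000):
    """EMD denoising (drop IMF1) + Hilbert envelope + Fourier transform."""
    imfs = EMD().emd(x)
    denoised = imfs[1:].sum(axis=0)            # discard high-frequency IMF1, keep the rest
    envelope = np.abs(hilbert(denoised))       # |z(t)| as in Eq. (6)
    spectrum = np.abs(np.fft.rfft(envelope - envelope.mean()))
    freqs = np.fft.rfftfreq(len(envelope), d=1 / fs)
    return freqs, spectrum
```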

Feature Vector

Although the envelope spectrum contains much information for fault diagnosis, its input size has to be large enough to guarantee the frequency resolution, which inevitably leads to a high computing load. Therefore, a feature vector consisting of manually extracted features from the time and frequency domains is proposed.

Time-Domain Features

When a bearing starts to degrade, its time-domain acceleration response gradually exhibits non-stationary and non-Gaussian dynamics [29]. Therefore, 13 indexes in total are selected from the time domain. The kurtosis and skewness characterize the non-Gaussian dynamics, while the impulse factor, crest factor, standard deviation and shape factor identify the non-stationary characteristics. A summary of all these features and their corresponding formulas is given in Table 4 [30].

Table 4 Time-domain features
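As an illustration, a subset of these time-domain indexes can be computed as below. The definitions used here are the standard ones and are assumed to match the exact formulas of Table 4:

```python
import numpy as np
from scipy.stats import kurtosis, skew

def time_domain_features(x):
    """A subset of the 13 time-domain indexes (standard definitions assumed)."""
    abs_x = np.abs(x)
    rms = np.sqrt(np.mean(x ** 2))
    return {
        "std": x.std(ddof=1),                    # non-stationarity indicators
        "peak": abs_x.max(),
        "rms": rms,
        "crest_factor": abs_x.max() / rms,
        "impulse_factor": abs_x.max() / abs_x.mean(),
        "shape_factor": rms / abs_x.mean(),
        "kurtosis": kurtosis(x),                 # non-Gaussianity indicators
        "skewness": skew(x),
    }
```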

Frequency-Domain Features

In the frequency domain, when a defect occurs on a bearing, peaks appear at the fault characteristic frequencies (FCFs) of the corresponding component. For example, a bearing with an outer race fault produces peaks at the ball pass frequency of the outer race (BPFO) and its harmonics in the frequency spectrum. The same holds for the inner race and ball faults, corresponding to the ball pass frequency of the inner race (BPFI) and the ball spin frequency (BSF), respectively. These FCFs contain much information about the bearing condition and are therefore used to extract the frequency-domain features. The theoretical FCFs can be calculated as follows [31]:

$$\begin{aligned} \textrm{BSF}=\frac{Df}{2d}\left[ 1-\left( \frac{d}{D}\cos \theta \right) ^2\right] , \end{aligned}$$
(7)
$$\begin{aligned} \textrm{BPFI}=\frac{nf}{2}\left( 1+\frac{d}{D}\cos \theta \right) , \end{aligned}$$
(8)
$$\begin{aligned} \textrm{BPFO}=\frac{nf}{2}\left( 1-\frac{d}{D}\cos \theta \right) , \end{aligned}$$
(9)

where f is the shaft rotation frequency, and n, d, D and \(\theta\) are four geometric parameters of the bearing, whose specifications are detailed in Table 5.

Table 5 Parameter specifications for ball bearing

Figure 7 gives an example of BSF, BPFO and BPFI from the first to the sixth order. The characteristic frequencies can be affected by many factors, such as the shaft speed, external load, friction coefficient, raceway groove curvature and defect size [32,33,34]. Therefore, a bias often exists between the theoretical and actual FCFs (BPFO, BPFI, BSF). In addition, sidebands inevitably appear in the acceleration spectrum [35]. For example, for the ball fault, peaks occur at \(k\times f_{\mathrm{BSF}}\), \(k\times f_{\mathrm{FTF}}\) and \(k \times f_{\mathrm{BSF}}\pm j \times f_{\mathrm{FTF}}\), where \(j, k=1,2,\ldots, N\), \(f_{\mathrm{BSF}}\) stands for the FCF of the ball fault, and \(f_{\mathrm{FTF}}\) is the fundamental train frequency [36]. This also explains the peaks before the first order of BSF in Fig. 7a. Even worse, some harmonics of the FCFs, influenced by the modulation of other vibrations, may not be detectable on the test bench [33]. Thus, to ensure that the FCFs are included in the frequency-domain features, 8 amplitude values around each theoretical FCF order, as shown in Fig. 8, are selected, and the orders from the first to the sixth are considered. This yields 48 amplitude values per FCF (8 values for each of the 6 orders). Since 3 FCFs are considered (BPFI, BPFO, BSF), 144 frequency-domain features are obtained in total.
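The selection of the 8 amplitude values around each FCF order can be sketched as follows (the symmetric window around the nearest spectral bin is our assumption of the windowing detail shown in Fig. 8):

```python
import numpy as np

def fcf_features(freqs, spectrum, fcf, orders=6, n_bins=8):
    """Collect n_bins spectral amplitudes around each harmonic (order) of a
    theoretical FCF (BPFI, BPFO or BSF), so that the actual, slightly biased
    characteristic frequency still falls inside the selected window."""
    feats = []
    for k in range(1, orders + 1):
        center = int(np.argmin(np.abs(freqs - k * fcf)))   # bin nearest the k-th harmonic
        lo = max(0, min(center - n_bins // 2, len(spectrum) - n_bins))
        feats.extend(spectrum[lo:lo + n_bins])             # 8 amplitudes per order
    return np.array(feats)                                 # 48 values per FCF; 3 FCFs -> 144
```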

Fig. 7
figure 7

Illustration of a BSF, b BPFO and c BPFI

Fig. 8
figure 8

Process to obtain frequency-domain features in feature vector

Feature Vector Construction

After the manually extracted features from the time and frequency domains have been determined, the next step is to build the feature vector \(x=\left[ x_1, x_2, \dots , x_{157}\right]\) with 157 elements. The definition of each element is listed in Table 6.

Table 6 Feature vector description

Modification of Fully Connected Layer

The fully connected layer in a CNN is essentially a classifier that outputs a probability for each label. For classification, there are many other well-established machine learning algorithms, such as the SVM and the decision tree. Thus, it is reasonable to replace the fully connected layer with potentially stronger classifiers. In this work, the fully connected layer is replaced by the SVM, decision tree and random forest to investigate the impact of different classifiers on the accuracy of bearing fault classification.
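A typical way to realize this replacement is to use the trained convolutional part as a fixed feature extractor and fit a scikit-learn classifier on its output, as sketched below. It assumes the flatten layer is named "flatten" (as in the earlier LeNet-5 sketch) and that `x_train`, `y_train`, `x_test`, `y_test` are the prepared datasets; the RBF kernel is an illustrative choice, not specified by the paper:

```python
from sklearn.svm import SVC
from tensorflow import keras

# Assumed: `model` is the trained CNN; features are taken at the flatten layer.
feature_extractor = keras.Model(inputs=model.input,
                                outputs=model.get_layer("flatten").output)

z_train = feature_extractor.predict(x_train)   # deep features replace the FC input
z_test = feature_extractor.predict(x_test)

svm = SVC(kernel="rbf")                        # kernel choice is illustrative
svm.fit(z_train, y_train)
print("test accuracy:", svm.score(z_test, y_test))
```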

Definition of Four Indicators

In most published research, only the algorithm accuracy has been considered, while the performance in other aspects has been ignored [37,38,39]. Apart from accuracy, a good algorithm should also perform stably and robustly, that is, produce consistent results across repeated runs and remain accurate when noise is added. Computing efficiency should also be addressed. Therefore, four indicators, accuracy, stability, robustness and efficiency, are proposed here to evaluate the CNN performance. Their definitions are introduced as follows.

  • Accuracy The accuracy of a single run is defined as the percentage of samples that are correctly classified. The CNN is run N times, and the final accuracy is defined as the average over the N simulations:

    $$\begin{aligned} \textrm{Accuracy}=\frac{\sum _{i=1}^{N}Y_i}{N}, \end{aligned}$$
    (10)
    $$\begin{aligned} Y_i=\frac{y_i}{L}, \end{aligned}$$
    (11)

    where \(y_i\) is the number of correctly classified samples in the ith run, L is the total number of samples, and N is the number of simulations. The values of L under each label are summarized in Table 1. Without loss of generality, N is chosen as 20 in this study.

  • Stability The stability is defined as the standard deviation of the accuracy over the N simulations:

    $$\begin{aligned} \textrm{Stability} = \sqrt{\frac{\sum _{i=1}^{N}\left( Y_i-{\mathrm{Accuracy}}\right) ^2}{N-1}}. \end{aligned}$$
    (12)
  • Robustness The robustness evaluates the CNN’s performance under noise. It is characterized by the accuracy when noise at a 20 dB signal-to-noise ratio (SNR) is added to the input, and is calculated by the following equation:

    $$\begin{aligned} \textrm{Robustness} = \frac{\sum _{i=1}^{N}Y_i^{'}}{N}, \end{aligned}$$
    (13)

    where \(Y_i^{'}\) is the accuracy of the ith simulation under 20 dB SNR noise.

  • Efficiency The consumed time \(T_i\) of a simulation is defined as the interval from the beginning of training to the end of validation. The efficiency is defined as the average consumed time over the N simulations:

    $$\begin{aligned} \textrm{Efficiency} = \frac{\sum _{i=1}^{N}T_i}{N}. \end{aligned}$$
    (14)

    In this research, the algorithm is implemented on the Amazon Web Services platform with a p2.xlarge instance.
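Given the per-run accuracies, noisy-run accuracies and timings of the N simulations, the four indicators reduce to the following straightforward computation (a sketch of Eqs. (10)–(14)):

```python
import numpy as np

def evaluate(acc_runs, noisy_acc_runs, times):
    """Four indicators over N simulations, following Eqs. (10)-(14)."""
    accuracy = np.mean(acc_runs)                # Eq. (10), with Y_i from Eq. (11)
    stability = np.std(acc_runs, ddof=1)        # Eq. (12): sample standard deviation
    robustness = np.mean(noisy_acc_runs)        # Eq. (13): accuracy under 20 dB SNR
    efficiency = np.mean(times)                 # Eq. (14): mean runtime in seconds
    return accuracy, stability, robustness, efficiency
```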

Case Study and Results Analysis

In this section, two cases are designed to validate the modified CNN. In case 1, the training and test datasets come from the same distribution for each of the four working loads, and the optimal modifications are determined in terms of the hyperparameters, the input and the fully connected layer. In case 2, the optimized CNN is further verified under the working condition adaptation scenario, where the CNN is trained with the dataset from one working load and then tested with the datasets from the other three working loads. Afterwards, three methods are proposed to enhance the CNN’s adaptation performance across different working conditions.

Case 1: Training and Test Datasets from the Same Working Load

Results of Three Modifications for CNN

Hyperparameters Optimization

As discussed above, the hyperparameters of the CNN are determined by PSO. Note that the optimization in this study is carried out in the order of input, hyperparameters and fully connected layer; when optimizing the hyperparameters, the input type is fixed as the envelope spectrum. Figure 9 shows the optimization process of PSO over 30 iterations. It can be found that the PSO almost converges to the maximum accuracy after only five iterations, and the CNN’s accuracy on fault classification increases from 98.9 to 99.81%. Table 7 shows the optimized hyperparameters, which are used in the CNN for the further modification exploration.

Fig. 9
figure 9

Fitness value of each iteration

Table 7 Hyperparameters used in CNN

Comparison of the Three Kinds of Input Data

To study the impact of manual feature extraction on bearing fault classification, the CNN is trained with three kinds of input data: (1) the original signal, (2) the envelope spectrum and (3) the feature vector extracted from the time and frequency domains. The CNN with each kind of input is trained 20 times. Figure 10 shows the training accuracy and loss. As seen, the training accuracy with both the envelope spectrum and the feature vector reaches 100% for the first time after about 90 steps and then stays consistently around 100% after 100 steps. By contrast, the CNN with the original signal as input needs 250 and 360 steps, respectively, to reach these two milestones. This confirms that the CNN with the original signal as input trains much more slowly and needs many more steps to achieve the same results. When the CNN is fed with the envelope spectrum or the feature vector, despite the difference in training speed, the final training accuracy and loss are almost the same.

Fig. 10
figure 10

Training accuracy and loss comparison of three different input data types

Figure 11 shows the accuracy boxplots of the three input data types for a complete comparison. The envelope spectrum clearly yields a more concentrated 25–75% range and a higher mean accuracy. The CNN with the feature vector as input also achieves quite good results, though slightly lower than with the envelope spectrum. Moreover, both proposed input types outperform the original signal on all four evaluation metrics.

Fig. 11
figure 11

Boxplot of three different input data types

Table 8 Result summary of three input data types

An overall comparison of the three input types over the four evaluation metrics is summarized in Table 8. The CNN with the envelope spectrum as input performs best in accuracy (99.94%) and robustness (99.69%), while the CNN with the feature vector as input achieves the best stability (0.04) and training time (20 s). By contrast, the CNN using the original signal as input shows the worst results in all four indicators. In brief, the CNN with the envelope spectrum achieves the best test accuracy and robustness, the most concentrated 25–75% range and the highest median line. Therefore, the envelope spectrum is identified as the best input type and is applied in the further experiments.

Modification for the Fully Connected Layer

To find the best replacement for the fully connected layer among the SVM, decision tree and random forest, the performance of the CNN with each classifier is compared in this subsection. Figure 12 indicates that replacing the fully connected layer with the SVM or the random forest improves the performance of the CNN, reaching 100% test accuracy in 18 and 10 of the 20 tests, respectively. The lowest accuracy of CNN + SVM and CNN + Random forest is 99.96%, which is still higher than the average accuracy of the general CNN. Conversely, the decision tree has no positive effect on the accuracy of the CNN.

Fig. 12
figure 12

Accuracy of modification for fully connected layer with envelope spectrum signal

Fig. 13
figure 13

Boxplot of four different classifiers

Figure 13 indicates the outstanding performance of CNN + SVM, as reflected by its small 25–75% range and high median line. A summary of the comparison results for the four indicators is presented in Table 9, which also verifies the superiority of CNN + SVM over the other algorithms in three aspects (accuracy: 99.9969%, robustness: 99.02%, stability: 0.0097), training speed being the exception. It can therefore be concluded that replacing the fully connected layer with the SVM or the random forest improves the CNN’s classification performance. The SVM is slightly better than the random forest in all aspects except training speed, which makes the SVM the best replacement for the softmax classifier.

Table 9 Result summary of fully connected layer modification

Overall Validation of the Proposed Model

Based on the above results, it can be concluded that the performance of the CNN on bearing fault classification is enhanced by the three modifications proposed in this work. Therefore, PSO, the envelope spectrum input and the SVM are integrated into the general CNN. The optimized network is denoted CNN + PSO + Spectrum + SVM, with the terms behind CNN standing for the combined improvements. To further verify its performance, Gaussian noise is added to the signal manually, yielding new data with SNRs of 5 dB, 10 dB, 15 dB and 20 dB. As shown in Fig. 14, the average accuracy over 20 tests remains nearly the same. The distribution stays concentrated when the SNR changes from 20 to 10 dB, and the accuracy is still higher than 99% even at an SNR of 5 dB. This demonstrates the strong robustness, stability and accuracy of the proposed method.

Fig. 14
figure 14

Results of robustness validation test

Analysis of the Three Modifications for CNN

To further investigate the reasons behind these results, quantitative comparisons and a causal analysis are conducted. Figure 15 shows the average classification accuracy of the CNN over 20 tests. Over the whole data under the four working conditions (0 HP, 1 HP, 2 HP, 3 HP), the general CNN achieves a high fault classification accuracy of 98.8027%. The accuracy increases from 98.8027 to 99.2463% after the hyperparameter optimization, and further from 99.2463 to 99.9436% after the envelope spectrum is used as input. The additional use of the SVM brings only a 0.0533% improvement, indicating that the most effective way to optimize the CNN is first to use appropriate manually extracted input features and then to apply hyperparameter optimization.

Fig. 15
figure 15

Contributions to CNN’s performance improvement from different modifications

These results can be well explained by the visualization of the fully connected layer output shown in Fig. 16, in which each colour represents one bearing fault label. Between Fig. 16a and b, there is no significant improvement in the boundaries between labels. However, the boundaries in Fig. 16c appear much clearer, with better compactness within each label and separateness between labels.

Fig. 16
figure 16

Visualization of fully connected layer: a general CNN, b CNN optimized by PSO (CNN + PSO) and c CNN + PSO with envelope spectrum input (CNN + PSO + spectrum)

For simplicity, the optimized CNN denoted CNN + PSO + Spectrum + SVM above is hereafter referred to as CNN*.

Case 2: Training Dataset from One Working Load and Test Dataset from Other Working Loads

Problem Formulation

According to the above results, the CNN* performs quite well in bearing fault diagnosis. However, the training and test datasets were each collected from a single load condition, meaning that they share the same data distribution. In practice, nevertheless, the test data often come from a working condition different from that of the training data. This poses a crucial problem, since the training and test datasets then present different distributions. To simulate this practical scenario, the CNN* optimized in case 1 is trained with the dataset from one working load and then tested with the datasets from the other three working loads, which have never been seen in the training phase. As shown in Table 10, there are four groups in total; each group takes the data from one working load as the training set and the data from the other three working loads as the test sets.

Table 10 Data description for working load adaption

As shown in Table 11, when the CNN* is trained with data from one working condition and then transferred directly to the other working conditions, its performance drops sharply. Taking group 1 as an example, the CNN* achieves an accuracy of 100% under load 0, but when applied to load 1, load 2 and load 3, the accuracy falls to 80.13%, 86.67% and 82.41%, respectively. In group 2, when the well-trained CNN* is transferred directly from load 1 to load 0, the accuracy even decreases to 58.63%. These results indicate that the previously optimized CNN* retains the highest accuracy in the training phase but fails to detect the faults correctly in the test phase. This performance reduction can be explained by the fact that the features extracted and learned during training tend to overfit the data of one working condition, so the trained CNN* fails to identify the more general features shared by the data from other working conditions.

Table 11 Results of CNN* directly transferred between different working conditions

The above results naturally raise a question: how can the performance of the CNN* be maintained when transferring it from one working condition to others? In this paper, this problem is termed working condition adaptation, which is regarded as an urgent issue to be solved before deploying CNNs in practical industrial applications. In the following, three effective solutions are introduced to deal with this problem.

Methodology for Condition Adaptation

Multiple Convolutional Layers (MCL)

In general, the CNN’s ability to extract complex features improves as the number of convolutional layers increases. Therefore, the CNN optimized in case 1 is initially modified by increasing the number of convolutional layers from 2 to 6. The improved CNN’s structure is presented in Table 12. The optimized CNN* is applied in present case 2 study, the same PSO, input data type and softmax clasifier as in case 1 are used for consistent performance comparison.

Table 12 CNN modified with MCL

Data Augmentation (DA)

Data augmentation consists of generating more data from the limited data available and has often been used in image classification to overcome overfitting [40]. The most common operations are shifting, rotation, rescaling and noise addition applied to the existing data. In this way, not only the amount but also the variety of the data is increased. The resulting diversity of the training set enables the CNN to extract more important and general features from the provided data. Consequently, even though the training and test datasets are collected from different working conditions, the CNN is still able to find the common features shared by both and make correct predictions. The implementation of data augmentation is explained in detail below; four augmentation operators are defined. After data augmentation, the number of samples in the new training set is 20 times larger than before.

  • Shifting The measurement signals are shifted upward or downward, to the left or to the right by some distance. Suppose \(p\left( x,y\right)\) is a point in the two-dimensional space, and m and n are the steps by which p is shifted along the x and y axes, respectively; then \(p'\left( x',y'\right)\) is the new point after shifting:

    $$\begin{aligned} p'\left( x',y'\right) =p\left( x+m,y+n\right) , \end{aligned}$$
    (15)

    with \(m \in \left\{ -250,~0,~250 \right\}\) and \(n \in \left\{ -0.2,~0,~0.2 \right\}\) in this paper. Figure 17 demonstrates the results of shifting.

  • Rotation The measurement signals are rotated clockwise or counterclockwise around the origin by a certain angle. Given a point \(p\left( x,y\right)\) in the two-dimensional space and \(\theta\) as the rotation angle, the coordinates of the new point \(p'\left( x',y'\right)\) after rotation are:

    $$\begin{aligned} \left( \begin{array}{c} x'\\ y'\\ \end{array} \right) = \left( \begin{array}{cc} \cos \theta &{} -\sin \theta \\ \sin \theta &{} \cos \theta \\ \end{array} \right) \left( \begin{array}{c} x \\ y \\ \end{array} \right) , \end{aligned}$$
    (16)

    with \(\theta \in \left\{ -0.015,~-0.01,~-0.005,~0.005,~0.01,~0.015 \right\}\). Figure 18 demonstrates the results of rotation.

  • Noise addition The measurement signals from different working conditions contain different noises. Therefore, adding noise to the training data obtained from one working condition is a reasonable way to increase its information content and generality. In this study, the noise level is characterized by the signal-to-noise ratio (SNR), with the value set \(\mathrm{SNR} \in \left\{ 10~\mathrm{dB},~15~\mathrm{dB},~20~\mathrm{dB} \right\}\). Figure 19 shows the new signals with different SNRs (left) and the corresponding noise (right) added to the original signal.

  • Rescaling Rescaling is employed to enrich the measurement data from the perspective of scale. The original signal is rescaled by a factor k: it is zoomed out when \(0< k < 1\), zoomed in when \(k>1\), and unchanged when \(k = 1\). Taking \(p\left( x,y\right)\) as a sequence in the two-dimensional space, \(p'\left( x',y'\right)\) denotes the rescaled sequence. If \(0<k<1\), then

    $$\begin{aligned} x'_i=x_{i/k}, \end{aligned}$$
    (17)
    $$\begin{aligned} y'_i=ky_{i/k}. \end{aligned}$$
    (18)

    If \(k>1\), then

    $$\begin{aligned} x'_{ki}=x'_{ki+1} = \cdots = x'_{ki+k-1} =x_i, \end{aligned}$$
    (19)
    $$\begin{aligned} y'_{ki}=y'_{ki+1} = \cdots = y'_{ki+k-1} =ky_i. \end{aligned}$$
    (20)

    In this research, \(k \in \left\{ \frac{1}{3}, \frac{1}{2},1,2,3 \right\}\). Figure 20 shows the rescaled signals.

Fig. 17
figure 17

Signal after shifting for inner race fault sample

Fig. 18
figure 18

Signal after rotation for inner race fault sample

Fig. 19
figure 19

Noise signal of different SNR for inner race fault sample

Fig. 20
figure 20

Signals after rescaling for inner race fault sample

To sum up, shifting, rotation, noise addition and rescaling are the four basic operators defined in this study to enrich the signals from different aspects: shifting changes the signal position and amplitude, rotation and noise addition modify the signal shape, and rescaling regenerates signals at different scales. Figure 21 presents the process of sample generation with data augmentation, which includes three main steps: selecting an operator \(Q_i\), defining the parameters of \(Q_i\), and generating new samples with \(Q_i\) (a minimal code sketch of these operators follows Fig. 21).

Fig. 21
figure 21

The flowchart of data augmentation
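A minimal sketch of the four operators is given below. Two points are our implementation choices for illustration: the rotation keeps the time grid fixed and only updates the amplitudes (a restriction of Eq. (16)), and the rescaling wraps indices around to preserve the signal length (padding would be an equally valid choice):

```python
import numpy as np

rng = np.random.default_rng(0)

def shift(y, m=250, n=0.2):
    """Eq. (15): shift by m samples along the time axis and n in amplitude."""
    return np.roll(y, m) + n

def rotate(y, theta=0.01):
    """Eq. (16), simplified: treat (index, amplitude) as 2-D points, keep the
    time grid fixed, and use only the new amplitude x*sin(theta) + y*cos(theta)."""
    x = np.arange(len(y), dtype=float)
    return np.sin(theta) * x + np.cos(theta) * y

def add_noise(y, snr_db=20):
    """Add white Gaussian noise at a prescribed SNR in dB."""
    p_noise = np.mean(y ** 2) / 10 ** (snr_db / 10)
    return y + rng.normal(0.0, np.sqrt(p_noise), len(y))

def rescale(y, k=2.0):
    """Eqs. (17)-(20): zoom the signal by factor k; indices wrap around so the
    output keeps the original length."""
    idx = (np.arange(len(y)) / k).astype(int) % len(y)
    return k * y[idx]
```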

Signal Concatenation (SC)

To further increase the variety of the training dataset, signal concatenation is proposed. The main idea is to divide the original signal into small parts, apply the four data augmentation operators introduced above to each part instead of the whole signal, and finally concatenate the augmented parts to form a new signal of the same length as the original one. Since each new signal contains multiple subparts modified by different augmentation operators, its complexity and diversity are necessarily increased. To illustrate this method, the set \({\varvec{Q}}\) is defined as the collection of the four data augmentation operators:

$$\begin{aligned} {\varvec{Q}} = \left\{ \text{shifting},~\text{rotation},~\text{noise addition},~\text{rescaling}\right\} . \end{aligned}$$
(21)

As given in Table 13, each operator has the same value set as defined in the previous section. Algorithm 1 summarizes the implementation. Firstly, the size of the new training set is defined; for example, the number of samples in the new training set is j times that of the original one. Secondly, each original sample is divided into l subparts, and for each subpart an operator is randomly chosen from \({\varvec{Q}}\) together with its parameters. Thirdly, the chosen operators are applied to the l subparts. Finally, the augmented subparts are concatenated to form a new signal with the same length as the original one.

Table 13 Operator and value set
Algorithm 1 Signal concatenation

In this work, j is set to 20 and l to 4. An example is given below to illustrate the concatenation process and its results: the original signal is shown in Fig. 22, and five new signals generated by data augmentation and signal concatenation are presented in Fig. 23. The involved operators and their parameters are summarized in Table 14. A minimal code sketch of this procedure follows.
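Reusing the operator functions from the previous sketch, the procedure of Algorithm 1 can be outlined as follows, with the parameter value sets taken from Table 13:

```python
import numpy as np

# Assumed available from the previous sketch: shift, rotate, add_noise, rescale, rng
OPERATORS = {
    "shifting":  lambda s: shift(s, m=int(rng.choice([-250, 0, 250])),
                                 n=float(rng.choice([-0.2, 0, 0.2]))),
    "rotation":  lambda s: rotate(s, theta=float(rng.choice(
                                 [-0.015, -0.01, -0.005, 0.005, 0.01, 0.015]))),
    "noise":     lambda s: add_noise(s, snr_db=float(rng.choice([10, 15, 20]))),
    "rescaling": lambda s: rescale(s, k=float(rng.choice([1/3, 1/2, 1, 2, 3]))),
}

def signal_concatenation(sample, l=4, j=20):
    """Algorithm 1 sketch: split a sample into l subparts, augment each subpart
    with a randomly chosen operator (random parameters), concatenate the parts
    back into a full-length signal; repeat j times."""
    parts = np.array_split(sample, l)
    new_samples = []
    for _ in range(j):
        augmented = [OPERATORS[rng.choice(list(OPERATORS))](p) for p in parts]
        new_samples.append(np.concatenate(augmented))
    return new_samples
```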

Table 14 Operator series of five generated samples based on signal concatenation
Fig. 22
figure 22

Original signal divided into four parts for inner race fault sample

Fig. 23
figure 23

Five generated samples based on signal concatenation for inner race fault sample

Results of Working Condition Adaption

In this section, the three methods proposed above are applied both individually and collaboratively to improve the CNN*’s performance in working condition adaptation. In total, four modified networks are obtained: CNN* + MCL, CNN* + DA, CNN* + MCL + DA and CNN* + MCL + DA + SC, where the terms behind CNN* stand for the improvement approaches. Each of the four improved networks is trained with the dataset from one load and then tested with the datasets from the other three loads, and the fault classification accuracies are compared with each other and especially with the CNN*. As detailed in Table 16, when CNN* + MCL is trained with the dataset from load 1 and transferred directly to load 0, load 2 and load 3, the accuracy increases from 58.63 to 95.11%, from 83.06 to 100% and from 77.16 to 98.41%, respectively. However, the improvements with load 3 and load 1 as the training datasets are less significant, and when CNN* + MCL is transferred from load 2 to load 0, the accuracy even drops from 86.33 to 78.39%. When DA is applied in addition, the transfer performance under all four working loads improves: the accuracies under load 1, load 2 and load 3 exceed 90%, while the accuracy under load 0 is still somewhat low. CNN* + MCL + DA outperforms CNN*, CNN* + MCL and CNN* + DA on all 12 working condition adaptation tasks, which means that MCL + DA is a better solution than MCL or DA alone. Compared with CNN* + MCL + DA, the last variant CNN* + MCL + DA + SC performs slightly worse but still much better than CNN* + MCL and CNN* + DA. This confirms that all three proposed methods (MCL, DA and SC) have a positive effect on the CNN*’s working condition adaptation, while the combination of different methods should be handled carefully. Figure 24 gives a more intuitive comparison among CNN*, CNN* + MCL + DA and CNN* + MCL + DA + SC on the adaptation performance between every two working loads.

Fig. 24
figure 24

Comparison of condition adaptation results

Analysis of Working Condition Adaption

Table 15 presents the overall average accuracy, which is only 81.55% when the CNN* is directly transferred from one working condition to the other three. When MCL, DA, MCL + DA and MCL + DA + SC are integrated into the CNN* training, they lead to significant improvements. In terms of the overall classification accuracy under the four working loads, combining MCL or DA individually with CNN* already yields accuracy improvements of 9.06% and 13.48%, respectively, and the combination of two or three improvement methods performs even better. The results confirm that the combination MCL + DA achieves the highest accuracy (97.34%), which means that the best way to improve the CNN*’s working condition adaptation is to add more convolutional layers and augment the training data with the data augmentation operators (shifting, rotation, noise addition and rescaling).

Table 15 Average classification accuracy comparison of different methods
Table 16 Transfer results under different working conditions

As for the reason why CNN* + MCL + DA + SC does not outperform CNN* + MCL + DA as expected, two explanations can be given (Table 16). Firstly, in signal concatenation, a complete and continuous acceleration sample is divided into short subparts, and the augmentation operators for each part are randomly selected. This may introduce discontinuities, long zero-padding and even abrupt peaks into the generated signals, as can be seen from the five generated signals in Fig. 23, especially the fifth one. Secondly, in CNN* + MCL + DA + SC the default fully connected layer has been replaced by the SVM, which has a much stronger mapping and classification ability. Consequently, some signals generated by SC, which carry unnecessary and unimportant local characteristics, may also be classified as faults, resulting in an inevitable accuracy reduction. Indeed, Table 17 shows that once SC is combined with MCL and DA for data augmentation, the CNN* without the SVM (denoted CNN* − SVM) performs better than or at least as well as the CNN* in all condition adaptation cases, which supports the above analysis to some extent.

Table 17 Comparison between CNN* + MCL + DA + SC and CNN + MCL + DA + SC
Fig. 25
figure 25

Visualization of fully connected layer

To further explain the obtained results, t-distributed stochastic neighbor embedding (t-SNE), a non-linear dimensionality reduction technique, is applied to visualize the fully connected layer of CNN* (a), CNN* + MCL + DA (b) and CNN* + MCL + DA + SC (c) in Fig. 25. The margins between classes clearly become larger in Fig. 25b and c, indicating that the proposed methods indeed improve the classification performance of the CNN* in working condition adaptation. Additionally, CNN* + MCL + DA + SC performs better than CNN* + MCL + DA at the fully connected layer in terms of separability, namely in-class compactness and between-class separateness, yet the final classification accuracy shows the opposite, which further supports the second explanation above.
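Such a visualization can be reproduced with scikit-learn's t-SNE on the fully-connected-layer activations, as sketched below. The feature matrix and labels are assumed to be extracted beforehand (e.g. with a feature extractor like the one shown earlier); the perplexity value is an illustrative default:

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Assumed: `features` holds the fully-connected-layer activations of the test
# samples and `labels` holds the corresponding fault classes.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)

plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="tab10", s=5)
plt.title("t-SNE of fully connected layer activations")
plt.show()
```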

Last but not least, to validate the proposed methods with data from another source, the experimental data from the Paderborn University (PU) bearing test bench is used. The PU dataset provides bearing measurement data in three categories: normal, inner race failure and outer race failure. According to the damage size, the failure severity can be further divided into two levels, with damage smaller than 2 mm as level 1 and damage between 2 mm and 4.5 mm as level 2 [41]. In addition, to keep the sampling frequency of the acceleration fed into the CNN close to that of the CWRU dataset, a downsampling factor of 4 is applied to the raw acceleration, which is collected at an initial sampling frequency of 64 kHz. With the PU dataset, transfer learning between the two fault severity levels is implemented. The structure and parameters of the CNN* model for this dataset can be found in [42], and the three proposed methods (MCL, DA and SC) from “Methodology for condition adaptation” are used to improve the CNN*’s transfer learning performance. As shown in Fig. 26, under both transfer scenarios, from level 1 to level 2 and from level 2 to level 1, combining the CNN* with the proposed methods enhances its performance in condition transfer learning, which further confirms the effectiveness of the proposed methods. The relatively low accuracy values may be caused by a mismatch between the operator parameters and the measurement data of the PU dataset; nevertheless, the tendency is consistent with the results obtained from the CWRU dataset.

Fig. 26
figure 26

Comparison of condition adaptation results (PU dataset)

Conclusion

CNNs have been widely used in bearing fault diagnosis. To further enhance the performance, the CNN has been improved in three aspects. Firstly, eight hyperparameters are optimized by particle swarm optimization. Secondly, the common original signal input is replaced by the envelope spectrum and a feature vector (157 features extracted from the time and frequency domains). Finally, the fully connected layer is substituted by the SVM, decision tree and random forest to improve the classification ability. The contribution of each modification to the CNN's performance improvement is quantitatively discussed and compared. Furthermore, an overall evaluation method is introduced covering classification accuracy, stability, robustness to noise and computing efficiency. Test results show that training the CNN with the optimized hyperparameters and the envelope spectrum as input improves its performance, and the proposed method sustains high accuracy under noisy conditions. However, when the training and test datasets are obtained from different working conditions, the CNN's performance cannot be maintained. To solve this performance reduction problem, multiple convolutional layers (MCL), data augmentation (DA) and signal concatenation (SC) have been proposed and investigated. Validation results based on experimental data show that DA performs better than MCL, and that combinations of two or three approaches perform better than the individual ones. In further research, the signal change dynamics and physics across working conditions underlying data augmentation will be explored, and data augmentation on the spectrum instead of the time-domain signal will also be addressed.