1 Introduction

Computerized numerical control (CNC) machine tools are typical complex, repairable electromechanical products that are widely used in industrial production. A CNC machine tool experiences many different faults during its service life. In general, the fault rate of a product such as a CNC machine tool follows the bathtub curve shown in Fig. 1 [1, 2]. The life cycle of a CNC machine tool can be divided into three phases: early faults, faults with a constant (or nearly constant) fault rate, and wear-out faults. In the early fault phase, the fault rate λ(t) of the machine tool decreases sharply with time. Early faults have plagued many machine tool companies, especially small and medium-sized enterprises with limited capital. Early faults of CNC machine tools should be eliminated before the machines are sold, but small and medium-sized enterprises often cannot afford the expensive reliability tests needed to excite early faults in the development phase. As a result, early faults occur in the customers’ factories, which not only causes great economic losses for the customers but also increases the maintenance cost borne by the machine tool enterprises. In addition, customers may lose interest in the CNC machine tools and switch to products from other companies. Thus, to improve product reliability and user satisfaction, early fault elimination is necessary for machine tool enterprises.

Fig. 1 A typical bathtub curve of products

In recent years, several works have addressed the early faults of products. Aydin et al. [3] developed a new method for early detection and diagnosis of broken rotor bars in three-phase induction motors. They applied the Hilbert transform to one of the three phase currents to construct a sliding window for detecting early faults and then gave a fault diagnosis method. Niu et al. [4] proposed new statistical moments for early fault detection in rolling element bearings (REBs). Bangalore and Tjernberg [5] presented an early fault detection approach for gearbox bearings and introduced an artificial neural network-based condition monitoring approach. Li et al. [6] used an improved reassigned wavelet scalogram to diagnose early faults of REBs and verified the method with real vibration signals acquired from an REB. Hu [7] and Wan [8] also studied the early faults of REBs. These studies focus mainly on early fault detection in components, and there are few studies on the early faults of whole CNC machine tools. Early faults of components can be detected by component reliability tests at relatively low cost. For small and medium-sized machine tool enterprises, however, reliability tests of the whole machine tool are usually not performed because of their high cost. In this paper, we propose an early fault elimination method that does not require costly long-term reliability tests.

The proposed method mainly combines two existing fault analysis methods. The first is fault mode and effects analysis (FMEA), an inductive, bottom-up method used to find all possible fault modes of the analysis object. It can be extended to a quantitative FMECA (fault mode, effects, and criticality analysis). Many scholars have used the FMEA/FMECA method to analyze the faults of different objects [9,10,11,12,13,14,15,16,17]. The second is fault tree analysis (FTA), a top-down method. Proposed by Watson at Bell Laboratories in 1962, FTA is an effective method for tracing fault causes [18]. A typical fault tree (FT) is a logic diagram composed of events and logic gates, and the analysis object (the top event) is usually an undesired fault or disaster event. As an important tool for safety and reliability analysis, FTA is also widely used in various fields [19,20,21,22,23,24,25,26,27,28]. To combine the strengths of both methods, mixed approaches of FTA and FMECA have attracted the interest of scholars. For instance, Peeters et al. [29] proposed a more efficient approach that combines FTA and FMEA recursively. They used a system-function-component decomposition and conducted the fault analysis by FMEA and FTA at every level. Azadeh et al. [30] considered market feedback in the fault tree analysis to determine product faults and defective components. They then integrated FTA and design fault modes and effects analysis (DFMEA) to improve product configuration and meet customers’ demands. However, both methods are time-consuming to perform thoroughly. In this paper, we propose a mixed method that combines FTA and FMECA. To improve the analysis efficiency, we treat specific early fault events as the top events of the fault trees and obtain the set of possible fault causes (the basic event set) by FTA. Then, we use the FMECA method to analyze the basic event set and help machine tool companies eliminate early faults. The remaining problem is how to recognize the early faults in the after-sale data, that is, how to determine the early fault period shown in Fig. 1.

To determine the early fault period, we need to determine the fault process model from the fault data of CNC machine tools. Most existing fault process models of CNC machine tools assume that a machine tool is as good as new when a repair is completed, as in [31, 32]. In practice, however, only local maintenance is performed to restore operation when a fault happens, so a CNC machine tool is as good as old when a repair is completed; this is termed minimal maintenance [33]. The fault process of a repairable system under minimal maintenance can be described by a non-homogeneous Poisson process (NHPP) [34]. The superposition of two NHPP models can represent a fault process that follows a bathtub curve [35]. Pulcini [36] proposed a superposed power law process (S-PLP) model for repairable products and presented a graphical approach to determine whether the model adequately describes given fault data. Jiang [37] proposed a novel bathtub model with finite support to describe the sharp change of failure rate of nonrepairable products in the wear-out phase. However, the fault information of a single machine tool cannot accurately describe the reliability of a whole series of machine tools produced by an enterprise.

In this paper, we study the bathtub curve model of several machine tools and propose a four-parameter NHPP model to describe the fault process of CNC machine tools. The early fault period can be determined by solving the proposed model, and then the early faults are found. Then, we use a mixed method to perform the fault analysis. In the hybrid method, we take the identified early faults as the top events to construct the fault trees and then use the FMECA method to analyze the fault causes obtained by the FTA method. Finally, fault elimination measures are taken to improve reliability of CNC machine tools.

1.1 Early fault elimination method

The proposed early fault elimination method is shown in Fig. 2. It consists of four steps, shown as the colored blocks in Fig. 2. The first step is to collect fault data, mainly including CNC machine tool information (machine number, type, etc.), fault time, fault mode, and operating conditions. In this paper, the fault data come mainly from after-sale data (customer field data). In the second step, we build a four-parameter NHPP model to determine the early fault period and the early fault modes. In the third step, we apply a mixed fault analysis method that combines FTA and FMECA. The early faults are taken as the top events in the FTA, and the criticality of the basic events is then determined by the risk priority number (RPN) in the FMECA. In the final step, based on the results of the mixed fault analysis, early fault elimination measures such as design improvements and material changes are taken.

Fig. 2 Early fault elimination model of CNC machine tools

1.2 Fault data collection

Reliability data of products are the basis of reliability activities [38]. Manual collection and system-based collection are the two main ways to collect reliability data of CNC machine tools. When a company’s data collection system is immature, manual collection plays an important role. System-based collection is more efficient and accurate because each kind of data follows a specified flow channel with a canonical format in a sound data management system [39]. Reliability data of CNC machine tools can in general be divided into two types: test data and field data. Test data are obtained from reliability experiments carried out in the laboratory. Field data, also called outfield data, are reliability data obtained from customers’ factories.

In this paper, the early fault elimination activities are performed using the after-sale fault data (field data) of CNC machine tool enterprises. The after-sale data of our partner company are collected mainly in two ways: field maintenance records and customers’ feedback. The two ways are described as follows:

  1. Field maintenance records. When the company receives the maintenance request from the customer, maintenance personnel will be dispatched for maintenance activities. After the repair work, they record the repair information and submit it to the after-sale department.

  2. Customers’ feedback. Customers perform some maintenance activities and record the information. Due to good cooperation, they transfer the data to the after-sale department of the machine tool company.

Thus, the data in this article are mainly collected manually. A data management system is also used.

1.3 A four-parameter NHPP model of CNC machine tools

To identify the early faults among the collected fault data of CNC machine tools, we need to establish the fault process model and determine the early fault period. Under minimal maintenance, the NHPP model has been proven effective for describing the fault process of repairable products. In this paper, we treat fault events as random points and use the NHPP model to represent the fault process of CNC machine tools. The fault rate of a CNC machine tool generally has the shape of the bathtub curve, and its trend may therefore be non-monotonic [40]. We therefore extend the NHPP model. The basic parameter of an NHPP is the intensity function λ(t). The superposition of multiple independent non-homogeneous Poisson processes is still an NHPP [41], so we give the following expression for the fault intensity function of a CNC machine tool:

$$ \lambda (t)={\lambda}_1(t)+{\lambda}_2(t) $$
(1)

where λ1(t) is the intensity function of the early fault period and λ2(t) is the intensity function of the accidental fault period in the bathtub curve.

According to [36], the early fault period can be described by the power law process (PLP) model, but the intensity function of that model changes greatly near t = 0. We therefore follow [42] and use the log-linear process (LLP) model to describe the early fault phase of the machine tool. The fault intensity function λ1(t) is given by:

$$ {\lambda}_1(t)=\exp \left({\alpha}_0-\beta t\right)=\alpha {e}^{-\beta t}\kern1.1em \alpha ={e}^{\alpha_0}>0\kern1em \beta >0\kern1.8em t\ge 0 $$
(2)
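
The contrast between the two candidate early-fault models near t = 0 can be made explicit. As a sketch, assume a PLP early-fault intensity with shape parameter 0 < γ1 < 1 and scale parameter η1 > 0 (these two symbols are introduced here only for illustration and are not part of the proposed model):

$$ \underset{t\to {0}^{+}}{\lim}\frac{\gamma_1}{\eta_1}{\left(\frac{t}{\eta_1}\right)}^{\gamma_1-1}=\infty \kern2em \underset{t\to {0}^{+}}{\lim}\alpha {e}^{-\beta t}=\alpha $$

Under this assumption the PLP intensity is unbounded at the origin, whereas the LLP intensity starts from the finite value α and decays exponentially at the rate β.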

According to the Literature [33], we can get the fault intensity function of the accidental fault period:

$$ {\lambda}_2(t)=\frac{\gamma }{\eta }{\left(\frac{t}{\eta}\right)}^{\gamma -1}\kern2.5em \gamma >1\kern1.2em \eta >0\kern1em t\ge 0 $$
(3)

Substituting Eq. (2) and Eq. (3) into Eq. (1), we can get the fault intensity function of the CNC machine tool as follows:

$$ \lambda (t)=\alpha {e}^{-\beta t}+\frac{\gamma }{\eta }{\left(\frac{t}{\eta}\right)}^{\gamma -1}\kern0.70em \alpha, \beta, \eta >0\kern0.5em \gamma >1\kern0.5em t\ge 0\kern0.2em $$
(4)

where α, β, γ, η in the formula are all parameters to be determined.

For the time interval (0, t], the cumulative fault intensity function of the CNC machine tool, i.e., the expected number of faults up to time t, can be calculated by the following equation:

$$ F(t)={\int}_0^t\lambda (x) dx=\frac{\alpha }{\beta}\left(1-{e}^{-\beta t}\right)+{\left(\frac{t}{\eta}\right)}^{\gamma } $$
(5)
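
As a quick check of Eq. (4) and Eq. (5), the closed-form cumulative function can be compared with a numerical integral of the intensity. The following is a minimal sketch; the function names and the parameter values in `PARAMS` are hypothetical and are not estimates from this paper.

```python
import math

def intensity(t, alpha, beta, gamma, eta):
    """Fault intensity of Eq. (4): log-linear term plus power-law term."""
    early = alpha * math.exp(-beta * t)
    later = (gamma / eta) * (t / eta) ** (gamma - 1.0)
    return early + later

def cumulative(t, alpha, beta, gamma, eta):
    """Closed-form cumulative fault intensity of Eq. (5)."""
    return (alpha / beta) * (1.0 - math.exp(-beta * t)) + (t / eta) ** gamma

def cumulative_numeric(t, alpha, beta, gamma, eta, steps=20000):
    """Trapezoidal integral of Eq. (4) over (0, t], used only as a cross-check."""
    h = t / steps
    args = (alpha, beta, gamma, eta)
    total = 0.5 * (intensity(0.0, *args) + intensity(t, *args))
    for k in range(1, steps):
        total += intensity(k * h, *args)
    return total * h

# Hypothetical parameter values, chosen only to exercise the formulas.
PARAMS = dict(alpha=0.004, beta=0.002, gamma=2.0, eta=3000.0)

if __name__ == "__main__":
    closed = cumulative(2000.0, **PARAMS)
    numeric = cumulative_numeric(2000.0, **PARAMS)
    print(f"closed form: {closed:.4f}, numeric: {numeric:.4f}")
```

The two values should agree to within the discretization error of the trapezoidal rule, which supports the integration step behind Eq. (5).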

Suppose that there are m identical CNC machine tools, that Ti is the truncation time of the ith CNC machine tool (1 ≤ i ≤ m), and that ni faults occur in the time interval (0, Ti). The notation tij denotes the time of the jth fault of the ith CNC machine tool (1 ≤ j ≤ ni). Maximum likelihood estimation is used to estimate the unknown parameters of the fault process model of the m machine tools. The joint likelihood function of the m machine tools is as follows:

$$ L=\prod \limits_{i=1}^m\left\{\left[\prod \limits_{j=1}^{n_i}\left(\alpha {e}^{-\beta {t}_{ij}}+\frac{\gamma }{\eta }{\left(\frac{t_{ij}}{\eta}\right)}^{\gamma -1}\right)\right]\times \exp \left[-\frac{\alpha }{\beta}\left(1-{e}^{-\beta {T}_i}\right)-{\left(\frac{T_i}{\eta}\right)}^{\gamma}\right]\right\} $$
(6)

and logarithmic transformation of Eq. (6) is:

$$ l=\ln L=\sum \limits_{i=1}^m\sum \limits_{j=1}^{n_i}\ln \left[\alpha {e}^{-\beta {t}_{ij}}+\frac{\gamma }{\eta }{\left(\frac{t_{ij}}{\eta}\right)}^{\gamma -1}\right]-\sum \limits_{i=1}^m\left[\frac{\alpha }{\beta}\left(1-{e}^{-\beta {T}_i}\right)+{\left(\frac{T_i}{\eta}\right)}^{\gamma}\right] $$
(7)
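
The log-likelihood in Eq. (7) translates directly into code. The sketch below is illustrative only: the function name `log_likelihood` and the data layout (one list of fault times per machine plus one truncation time per machine) are assumptions made for this example.

```python
import math

def log_likelihood(params, fault_times, truncation_times):
    """Log-likelihood of Eq. (7) for m machines.

    params           -- (alpha, beta, gamma, eta)
    fault_times      -- list of lists, fault_times[i][j] = t_ij
    truncation_times -- list, truncation_times[i] = T_i
    """
    alpha, beta, gamma, eta = params
    total = 0.0
    for times, T in zip(fault_times, truncation_times):
        # First term: sum of log intensities at the observed fault times.
        for t in times:
            lam = alpha * math.exp(-beta * t) + (gamma / eta) * (t / eta) ** (gamma - 1.0)
            total += math.log(lam)
        # Second term: cumulative intensity up to the truncation time.
        total -= (alpha / beta) * (1.0 - math.exp(-beta * T)) + (T / eta) ** gamma
    return total

if __name__ == "__main__":
    # Hypothetical data for two machines (hours); not taken from Table 1.
    faults = [[120.0, 340.0, 900.0, 2600.0], [80.0, 700.0, 3300.0]]
    trunc = [3500.0, 3500.0]
    print(log_likelihood((0.004, 0.002, 2.0, 3000.0), faults, trunc))
```

Keeping the two terms of Eq. (7) in separate loops makes it easy to see that the integral term is counted once per machine, not once per fault.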

The total number N of faults of the m CNC machine tools up to their truncation times can be expressed by:

$$ N=\sum \limits_{i=1}^m{n}_i=\sum \limits_{i=1}^m\left[\frac{\alpha }{\beta}\left(1-{e}^{-\beta {T}_i}\right)+{\left(\frac{T_i}{\eta}\right)}^{\gamma}\right] $$
(8)

and then from Eq. (8), we can get:

$$ \alpha =\frac{\sum \limits_{i=1}^m\left[{n}_i-{\left(\frac{T_i}{\eta}\right)}^{\gamma}\right]\beta }{\sum \limits_{i=1}^m\left(1-{e}^{-\beta {T}_i}\right)} $$
(9)

Substituting Eq. (8) and Eq. (9) into Eq. (7), we can get:

$$ l=\ln L=\sum \limits_{i=1}^m\sum \limits_{j=1}^{n_i}\ln \left[\frac{\left[{n}_i-{\left(\frac{T_i}{\eta}\right)}^{\gamma}\right]\beta }{1-{e}^{-\beta {T}_i}}{e}^{-\beta {t}_{ij}}+\frac{\gamma }{\eta }{\left(\frac{t_{ij}}{\eta}\right)}^{\gamma -1}\right]-N $$
(10)

To ensure the validity of the Eq. (10), we should ensure the nonlinear constraint:

$$ \alpha =\left\{\sum \limits_{i=1}^m\left[{n}_i-{\left(\frac{T_i}{\eta}\right)}^{\gamma}\right]\beta \right\}/\sum \limits_{i=1}^m\left(1-{e}^{-\beta {T}_i}\right)>0. $$

According to the value ranges of parameters, the nonlinear constraint can be transformed into \( \eta >{\left(\sum \limits_{i=1}^m{T}_i^{\gamma }/\sum \limits_{i=1}^m{n}_i\right)}^{1/\gamma } \).
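
The transformation can be checked step by step. Because β > 0 and Ti > 0, every term 1 − e^(−βTi) is positive, so the sign of α in Eq. (9) is the sign of the bracketed sum in the numerator. Assuming that at least one fault has been observed, so that the sum of the ni is positive:

$$ \sum \limits_{i=1}^m\left[{n}_i-{\left(\frac{T_i}{\eta}\right)}^{\gamma}\right]>0\kern0.5em \iff \kern0.5em {\eta}^{\gamma}\sum \limits_{i=1}^m{n}_i>\sum \limits_{i=1}^m{T}_i^{\gamma}\kern0.5em \iff \kern0.5em \eta >{\left(\sum \limits_{i=1}^m{T}_i^{\gamma }/\sum \limits_{i=1}^m{n}_i\right)}^{1/\gamma } $$

The last step uses η > 0 and γ > 1, so taking the 1/γ power of both sides preserves the inequality.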

Finally, the problem of the maximum likelihood parameter estimation of Eq. (7) can be transformed into the maximization problem of the Eq. (10) under the nonlinear constraint condition. The problem-solving model is as follows:

$$ {\displaystyle \begin{array}{l}\min -\sum \limits_{i=1}^m\sum \limits_{j=1}^{n_i}\ln \left[\frac{\left[{n}_i-{\left(\frac{T_i}{\eta}\right)}^{\gamma}\right]\beta }{1-{e}^{-\beta {T}_i}}{e}^{-\beta {t}_{ij}}+\frac{\gamma }{\eta }{\left(\frac{t_{ij}}{\eta}\right)}^{\gamma -1}\right]+N\\ {}s.t.\left\{\begin{array}{l}{\left(\sum \limits_{i=1}^m{T}_i^{\gamma }/\sum \limits_{i=1}^m{n}_i\right)}^{1/\gamma }-\eta <0\\ {}\beta >0\\ {}\gamma >1\\ {}\eta >0\end{array}\right.\end{array}} $$
(11)

Solving the nonlinear programming problem in Eq. (11) with MATLAB 2010, we obtain the unknown parameters and hence the fault process model of the CNC machine tools. The turning point t1 of the bathtub curve can then be obtained from the fitted model, and the faults in the time interval (0, t1) are classified as early faults.
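
The structure of the constrained problem can be illustrated without a dedicated solver. The sketch below is not the MATLAB routine used in this paper: it eliminates α through Eq. (9), evaluates the negative of Eq. (7), rejects points that violate the constraints, and uses a coarse grid search as a stand-in optimizer. All function names, the grid values, and the fault data are hypothetical.

```python
import math

def alpha_from_eq9(beta, gamma, eta, counts, trunc):
    """Eq. (9): alpha implied by matching expected and observed fault totals."""
    num = beta * sum(n - (T / eta) ** gamma for n, T in zip(counts, trunc))
    den = sum(1.0 - math.exp(-beta * T) for T in trunc)
    return num / den

def eta_lower_bound(gamma, counts, trunc):
    """Lower bound on eta that keeps alpha positive."""
    return (sum(T ** gamma for T in trunc) / sum(counts)) ** (1.0 / gamma)

def objective(beta, gamma, eta, faults, trunc):
    """Negative log-likelihood of Eq. (7) with alpha taken from Eq. (9)."""
    counts = [len(ts) for ts in faults]
    if beta <= 0 or gamma <= 1 or eta <= eta_lower_bound(gamma, counts, trunc):
        return math.inf  # outside the feasible region
    alpha = alpha_from_eq9(beta, gamma, eta, counts, trunc)
    total = float(sum(counts))  # by Eq. (8) the integral terms sum to N
    for ts in faults:
        for t in ts:
            lam = alpha * math.exp(-beta * t) + (gamma / eta) * (t / eta) ** (gamma - 1.0)
            total -= math.log(lam)
    return total

def grid_search(faults, trunc, betas, gammas, etas):
    """Return (best objective value, (alpha, beta, gamma, eta)) over a grid."""
    counts = [len(ts) for ts in faults]
    best_val, best = math.inf, None
    for b in betas:
        for g in gammas:
            for e in etas:
                val = objective(b, g, e, faults, trunc)
                if val < best_val:
                    best_val = val
                    best = (alpha_from_eq9(b, g, e, counts, trunc), b, g, e)
    return best_val, best

# Hypothetical fault times (hours) for two machines; not taken from Table 1.
FAULTS = [[120.0, 340.0, 610.0, 900.0, 1500.0, 2600.0, 3100.0],
          [80.0, 260.0, 700.0, 1900.0, 2800.0, 3300.0]]
TRUNC = [3500.0, 3500.0]

if __name__ == "__main__":
    print(grid_search(FAULTS, TRUNC,
                      betas=[0.0005, 0.001, 0.002, 0.004],
                      gammas=[1.5, 2.0, 3.0, 4.0],
                      etas=[2000.0, 2500.0, 3000.0, 4000.0, 5000.0]))
```

A grid search is crude, but it makes the role of the constraint visible: any candidate η at or below the bound is discarded before the logarithm is evaluated. In practice a gradient-based constrained solver started from the best grid point would refine the estimate.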

1.4 A mixed fault analysis method

FTA method is a good way to investigate the fault behavior. It takes an outcome event or a fault mode (top event) as the analysis object and investigates the possible fault causes by the deduction graph (see Literature [18] for more details about FTA method). In this paper, we determine the early fault period through the fault process model of the CNC machine tools and take early faults as the top events of the fault trees to find the early fault causes. A typical fault tree is shown in Fig. 3.

Fig. 3 A typical fault tree

The fault tree in Fig. 3 consists of events (top event, intermediate events, and basic events) and standard logic symbols (AND, OR, etc.). Each fault tree has one top event, so each early fault mode has its own fault tree. The intermediate events are the possible faults that may cause the early fault. Early faults are mainly due to weaknesses in materials, components, or production processes [1]. When the analysis reaches the deepest level, the fault mechanism analysis, the real root causes of the early faults are revealed. To help the designers, maintainers, and other members of the company, the basic events should be prioritized by their criticality, so we use the FMECA method to assess it. In FMECA, the risk priority number (RPN) is used to assess the criticality of the analyzed items. RPN is calculated from three indicators: effect severity ranking (S), occurrence probability ranking (O), and detection difficulty ranking (D). The effect severity ranking of a fault refers to the seriousness of its effect on higher-level components, the subsystem, or the system. It ranges from low effect to very high effect. The occurrence probability ranking refers to the probability that the fault happens. It is rated from very unlikely to almost inevitable. The detection difficulty ranking refers to the likelihood that the fault can be detected. It is rated from absolutely uncertain to almost certain. Each of the three indicators is given a ranking level from 1 to 10. For more details, see [44]. The basic RPN calculation is rather simplistic. To alleviate this weakness, some scholars have improved the calculation method. For instance, Xiao et al. [45] extended the definition of RPN and proposed a weighted RPN calculation approach. Wang et al. [46] proposed an FRPN method. They treated the risk factors S, O, and D as fuzzy variables and then assessed them by fuzzy linguistic terms and fuzzy ratings. Multiple-criteria decision-making methods have also been applied to the calculation of RPN. These advanced methods improve the RPN, but they also increase the computational cost when many faults are analyzed. In this paper, we therefore use the simple and proven calculation RPN = S × O × D in FMECA.
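
The RPN calculation and ranking can be sketched in a few lines. The class name `FaultCause`, the helper `rank_by_rpn`, and the S, O, and D scores below are hypothetical and serve only to illustrate RPN = S × O × D; they are not the values in the FMECA table of this paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FaultCause:
    name: str
    severity: int    # S, ranking level from 1 to 10
    occurrence: int  # O, ranking level from 1 to 10
    detection: int   # D, ranking level from 1 to 10

    def __post_init__(self):
        for score in (self.severity, self.occurrence, self.detection):
            if not 1 <= score <= 10:
                raise ValueError("rankings must lie between 1 and 10")

    @property
    def rpn(self) -> int:
        """Risk priority number, RPN = S x O x D."""
        return self.severity * self.occurrence * self.detection

def rank_by_rpn(causes):
    """Sort fault causes from the highest RPN to the lowest."""
    return sorted(causes, key=lambda c: c.rpn, reverse=True)

# Hypothetical scores for three basic events.
CAUSES = [
    FaultCause("X1", severity=6, occurrence=4, detection=3),
    FaultCause("X2", severity=8, occurrence=5, detection=4),
    FaultCause("X3", severity=5, occurrence=3, detection=2),
]

if __name__ == "__main__":
    for cause in rank_by_rpn(CAUSES):
        print(cause.name, cause.rpn)
```

Validating the 1 to 10 range at construction time catches data entry errors before they distort the ranking.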

When we get the basic events of an early fault by the fault tree, the next step is to conduct the FMECA of these basic events. These basic events are mainly fault mechanisms, which are the deepest fault analysis levels. Therefore, the FMECA in this paper formally cannot be called the traditional FMECA because the analysis is based on the fault causes (fault mechanisms) rather than fault modes. However, the analysis process is the same as a traditional FMECA and the RPN of each basic event (fault cause) can also be determined by the above three indicators. Finally, the criticality of these fault causes can be evaluated.

1.5 Early fault elimination measures

After the above fault analysis activities, the company needs to take some measures to eliminate the early faults and improve the reliability of CNC machine tools. Early faults of machine tools may be caused by design defects, manufacturing defects, material defects, etc. Therefore, the early fault elimination activity is a systematic work involving various departments of the company. In FMECA, we can know which department should be responsible for every fault cause. Every department should take actions according to the results of FMECA table. For example, the design department needs to carry out design improvement of CNC machine tools; the manufacturing department needs to optimize the manufacturing process; the purchasing department needs to control purchased components strictly. Systematic elimination activities should be performed in the machine tool company. In addition, the early faults of the machine tool may be caused by the user’s problems such as unreasonable operating environment. Therefore, the machine tool company should help the customer to use the CNC machine tools according to the instructions.

2 Case study

In this paper, the fault data of CNC machine tools are collected from the after-sale data of Baoji Machine Tool Group Co., Ltd. in China. From the viewpoint of mathematical statistics, the more data are collected, the better the analysis result will be. For small and medium-sized machine tool companies, however, the batch size is generally small. Moreover, some collected data may be invalid due to human error. Thus, following the sampling principle for field data in [47], we select the fault data of four CNC lathes of the same type as an example to illustrate the proposed method. The fault data are given in Table 1.

Table 1 Fault data of CNC lathes

The cumulative faults vs. time plot proposed by Literature [43] is used to examine the fault trend of CNC lathes based on the fault data in Table 1. As shown in Fig. 4 and Fig. 5, we find that the fault trends of the CNC lathes are non-monotonic, which indicates that it is feasible to use the bathtub curve model to analyze the reliability of the CNC lathes.

Fig. 4 Cumulative faults vs. time plot of four CNC lathes

Fig. 5 Cumulative fault points vs. time of each CNC lathe

We first assume that the fault processes of the four CNC lathes can be described by Eq. (4) with the same parameters. The parameter estimates obtained from Eq. (11) and the fault data in Table 1 are shown in Table 2. However, due to various random factors and different operating conditions in actual production, the fault processes of the individual CNC lathes are not exactly the same. Therefore, we also describe each CNC lathe by a model with its own parameters. The parameter estimates for each CNC lathe are again obtained by solving Eq. (11) with the fault data in Table 1 and are also shown in Table 2.

Table 2 Parameter estimation values of the proposed model for CNC lathe

Using the data in Table 2, we can get the fitting cumulative fault intensity function of each machine with different parameters by Eq. (5). The comparison between the fitting curve and actual cumulative fault can be shown in Fig. 6.

Fig. 6 Comparison between the fitting curve and actual fault points

In Fig. 6, we can see that the fitting cumulative fault intensity function of each machine with different parameters can fit the actual fault points well. Then, we use the fitness index r in Literature [48] to test the fitness between the fitting curve and the actual data.

$$ r=1-\sqrt{\sum \limits_{i=1}^n{\left({n}_i-{\hat{n}}_i\right)}^2/\sum \limits_{i=1}^n{n}_i^2} $$
(12)

where ni refers to the actual cumulative fault number at time i, \( {\hat{n}}_i \) represents the expected cumulative fault number at time i, and r represents the goodness of fit. The closer r is to 1, the better the model fits the actual data.
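
Eq. (12) is straightforward to compute once the actual and expected cumulative fault numbers are available. The following sketch uses a hypothetical function name and hypothetical numbers; in the case study the expected values would come from Eq. (5).

```python
import math

def fitness_index(actual, expected):
    """Fitness index r of Eq. (12) for paired cumulative fault numbers."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    denominator = sum(n * n for n in actual)
    if denominator == 0:
        raise ValueError("actual values must not all be zero")
    numerator = sum((n - e) ** 2 for n, e in zip(actual, expected))
    return 1.0 - math.sqrt(numerator / denominator)

if __name__ == "__main__":
    # Hypothetical cumulative fault numbers and fitted values.
    actual = [1, 2, 3, 4, 5]
    expected = [1.2, 1.9, 3.1, 3.8, 5.2]
    print(round(fitness_index(actual, expected), 4))
```

A perfect fit gives r = 1, and larger residuals relative to the size of the observed counts pull r down.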

Using the data in Table 1 and Table 2, the goodness of fit r can be calculated by Eq. (5) and Eq. (12), as shown in Table 3. The results indicate that the fitted curves fit the actual fault data well.

Table 3 Goodness of fit r of each CNC lathe

The fault intensity curve of each CNC lathe can be obtained from Eq. (4) with the parameter values in Table 2, as shown in Fig. 7.

Fig. 7 The fault intensity curve of each CNC lathe

The fault intensity curve of each CNC lathe in Fig. 7 has a single valley, and its minimum can be calculated with MATLAB 2010. The resulting turning point t1 of each bathtub curve is shown in Table 4.
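
Because the intensity curve has a single valley, its minimum can be located with any one-dimensional search for unimodal functions. The sketch below uses a golden-section search as a stand-in for the MATLAB calculation; the function names, the search horizon, and the parameter values are hypothetical and are not the estimates in Table 2.

```python
import math

def intensity(t, alpha, beta, gamma, eta):
    """Fault intensity of Eq. (4)."""
    return alpha * math.exp(-beta * t) + (gamma / eta) * (t / eta) ** (gamma - 1.0)

def golden_section_min(f, lo, hi, tol=1e-4):
    """Minimizer of a single-valley function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def turning_point(alpha, beta, gamma, eta, horizon):
    """Time t1 at which the intensity of Eq. (4) is smallest on [0, horizon]."""
    return golden_section_min(lambda t: intensity(t, alpha, beta, gamma, eta), 0.0, horizon)

if __name__ == "__main__":
    # Hypothetical parameters; horizon is the search window in hours.
    t1 = turning_point(alpha=0.004, beta=0.002, gamma=2.0, eta=3000.0, horizon=10000.0)
    print(f"t1 = {t1:.1f} h")
```

For γ = 2 the minimum also has a closed form, t1 = ln(αβη²/2)/β, which offers a convenient check of the numerical search.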

Table 4 Time t1 of bathtub curve

The faults in the time interval (0, t1) can be considered early faults. Table 4 shows that the early fault period differs between CNC lathes. For example, the early fault period of CNC lathe 1 is (0, 2135 h), so the faults in this interval are early faults, and Table 1 shows that CNC lathe 1 has five early faults. However, fault data cannot be collected for every CNC machine tool because of the data collection cost. We therefore use the model built from the collected data to represent a batch of CNC machine tools of the same type. In this case, we use the expectation of the early fault periods of the four CNC lathes to assess CNC lathes of the same type (see Eq. (13)).

$$ E\left({t}_1\right)=\left(\sum \limits_{i=1}^m{t}_1^i\right)/m $$
(13)

where \( {t}_1^i \) is the time t1 of CNC lathe i. So, the early fault period for a CNC lathe of this type with little fault data is (0, 2005.7 h). When we use the individual early fault period of each CNC lathe, the total number of early faults is 15. When we use the time interval (0, 2005.7 h), the total number is also 15. Although the totals are the same, the early faults identified for individual CNC lathes differ. The expectation can therefore be used to assess CNC lathes with little fault data.
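
The comparison between per-machine and pooled early fault periods can be sketched as follows. Only the value 2135 h for lathe 1 comes from the text; the other turning points and all fault times are hypothetical, so the counts produced here do not reproduce the totals reported above.

```python
def mean_turning_point(t1_values):
    """Eq. (13): average of the per-machine turning points."""
    return sum(t1_values) / len(t1_values)

def count_early(fault_times, t1):
    """Number of faults that fall inside the early fault period (0, t1)."""
    return sum(1 for t in fault_times if 0 < t < t1)

# Turning points in hours; only the first value is taken from the text.
T1 = [2135.0, 1800.0, 2200.0, 1900.0]

# Hypothetical fault times (hours) for four lathes.
FAULTS = [
    [300.0, 900.0, 1500.0, 2100.0, 2600.0],
    [200.0, 1700.0, 1950.0, 2500.0],
    [400.0, 2050.0, 2150.0, 3000.0],
    [100.0, 1850.0, 1950.0, 2800.0],
]

if __name__ == "__main__":
    pooled_t1 = mean_turning_point(T1)
    per_machine = [count_early(f, t) for f, t in zip(FAULTS, T1)]
    pooled = [count_early(f, pooled_t1) for f in FAULTS]
    print("pooled t1:", pooled_t1)
    print("per-machine counts:", per_machine)
    print("pooled counts:", pooled)
```

Printing both lists side by side shows how the set of early faults assigned to each lathe can shift when the pooled period replaces the individual one.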

Once the early faults are determined, they become the top events of the fault trees. As an example, the fault tree of one identified early fault is shown in Fig. 8, and the notations in Fig. 8 are explained in Table 5. All logic gates in Fig. 8 are OR gates, so the basic events X1 to X6 are all possible fault causes, and the occurrence of any one of them can cause the top event.
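
The consequence of an OR-only tree can be illustrated with a small evaluator. The nested structure `TREE` below is a hypothetical arrangement of the six basic events under two intermediate events; it does not reproduce the actual layout of Fig. 8.

```python
def basic_events(node):
    """Collect the basic events (leaf names) below a fault tree node."""
    if isinstance(node, str):
        return {node}
    events = set()
    for child in node["children"]:
        events |= basic_events(child)
    return events

def occurs(node, active):
    """Evaluate whether a node occurs given the set of active basic events."""
    if isinstance(node, str):
        return node in active
    results = [occurs(child, active) for child in node["children"]]
    return any(results) if node["gate"] == "OR" else all(results)

# Hypothetical OR-only tree: a top event with two intermediate events and one direct cause.
TREE = {"gate": "OR", "children": [
    {"gate": "OR", "children": ["X1", "X2", "X3"]},
    {"gate": "OR", "children": ["X4", "X5"]},
    "X6",
]}

if __name__ == "__main__":
    print(sorted(basic_events(TREE)))
    print(all(occurs(TREE, {x}) for x in basic_events(TREE)))
```

With only OR gates, every single basic event is enough to trigger the top event, which is why each of X1 to X6 must be carried into the FMECA.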

Fig. 8 The fault tree of an early fault of CNC lathe

Table 5 Notations and events in Fig. 8

We can get the possible fault cause set {X1, X2, …, X6}. Then, the analysis is performed by an FMECA (formally not an FMECA). The partial result is depicted in Table 6. These results show the RPN ranking of these fault causes.

Table 6 FMECA of CNC machine tool (partial)

Table 6 also gives the department that should examine each fault cause and the recommended elimination measures. For instance, the fault cause X1 may result from purchased parts of poor quality. Therefore, the purchasing department should strengthen the inspection of incoming parts and materials.

According to the RPN ranking, the maintenance personnel can conduct troubleshooting more accurately and efficiently. Further inspection and improvement work needs to be carried out in the machine tool company based on the FMECA table. Table 6 shows that the possible causes of the given early fault originate mainly in the machine tool company (X1, X2, X3, X6). In addition, the fault cause X2 is critical. In cooperation with the design department of the partner company, we checked the locking structure of this type of CNC lathe and found that the problem is caused by the design of the lathe bed. The lathe bed of this type of CNC lathe adopts a 45° slant structure, which helps to remove metal chips. However, the slant structure causes the tailstock to tip slightly as it moves along the bed rail, so that metal chips enter the clearance between the tailstock rail and the bed rail. When the tailstock is locked again, the contour between the tailstock axis and the spindle axis of the CNC lathe changes. After discussion with the company’s designers, the locking mechanism was improved as shown in Fig. 9 [49]. Figure 9a shows the previous structure of the locking mechanism, and Fig. 9b shows the improved structure. The improvement successfully solves this problem. More details on the working principle of the improved structure can be found in [49], published by our partner company.

Fig. 9 The previous and improved structure of locking mechanism. a The previous structure of the locking mechanism. b The improved structure of the locking mechanism. ① tailstock, ② tailstock guide, ③ anti-overturning briquetting, ④ tailstock locking platen, ⑤ screw nut, ⑥ disc spring, ⑦ compacting block, ⑧ cylindrical pin, ⑨ bearing, ⑩ binding screw with hexagon socket cylinder head

After the improvement, we again selected after-sale data for four CNC lathes of the same type as before. The fault data of the improved CNC lathes within the time interval (0, 2005.7 h) are shown in Table 7. Comparing the fault data before and after the improvement, we find that the early faults in Table 1 have been eliminated. Although some new faults occur, the number of faults has fallen from 15 to 6.

Table 7 Fault data of the improved CNC lathes

The proposed early fault elimination method has been applied at Baoji Machine Tool Group Co., Ltd. in China, and the company is satisfied with the method. For confidentiality reasons, we present only one early fault elimination measure, which has been published by our partner company, to demonstrate the effectiveness of the proposed method. We believe that the method can also be applied to other CNC machine tools and can help small and medium-sized machine tool companies improve the reliability of their products at lower cost.

3 Conclusion

In this paper, we have proposed an early fault elimination method for CNC machine tools. It has four steps. First, we collect the fault data of CNC machine tools, which is the basis of the fault analysis work. The fault data are collected from after-sale data (i.e., customer field data). Second, we build a four-parameter NHPP model to determine the early fault period. The faults in the time interval (0, t1) in Fig. 1 are the early faults. Third, we propose a mixed fault analysis method that combines FTA and FMECA. The early faults determined in the second step are taken as the top events of fault trees, and the possible fault causes are obtained by FTA. These fault causes are then analyzed by FMECA. Formally, this is not a traditional FMECA, because the traditional FMECA analyzes fault modes, whereas here it analyzes fault causes. In the FMECA, the risk priority number (RPN) is calculated to assess the criticality of the fault causes. Finally, elimination measures are taken based on the analysis results. The proposed method has been applied at Baoji Machine Tool Group Co., Ltd. in China.

Our early fault elimination method is a systematic fault analysis method. It has been applied in a machine tool company and has proven to be an effective method for early troubleshooting. It has three advantages: a wide range of applications, lower cost, and higher efficiency. First, the method is applicable not only to the CNC lathe in the case study but also to other CNC machine tools, and it can be applied in CNC machine tool companies of different sizes. Small and medium-sized companies can use after-sale data to perform the early fault analysis. Although our original intention is to help small and medium-sized companies, companies with strong capital can afford reliability tests of CNC machine tools, and with accurate test data the proposed method becomes more accurate. Second, because reliability tests of CNC machine tools are not essential to the method, the analysis cost is lower. Third, the proposed four-parameter NHPP model helps analysts pick out the early faults from the many collected faults, which saves considerable time and money when performing the FTA and FMECA. Finally, the proposed method combines the advantages of FTA and FMECA. RPN is used to assess the criticality of the fault causes, which helps designers and maintenance personnel focus on the critical ones.

Our early fault elimination method also has some limitations. The biggest limitation is insufficient data. Establishing the four-parameter NHPP model requires a certain amount of fault data of CNC machine tools, and reliability data are insufficient for many reasons. For instance, some companies do not pay much attention to reliability data collection, or the data are collected irregularly. Another limitation is that the proposed method needs professional reliability analysts, while many small and medium-sized companies have few staff with professional reliability knowledge. In future work, we will focus on developing an expert system based on the proposed method. In addition, uncertainty exists in the analysis process, so we will also consider it in future studies.