Introduction

Manufacturing system maintenance plays a critical role in the efficient use of industrial equipment in terms of cost, availability and safety. Many effective system maintenance strategies have been developed (Huynh et al. 2012; Wang et al. 2016; Lu et al. 2015). Generally, system maintenance can be classified into corrective maintenance (CM) and preventive maintenance (PM). Corrective maintenance involves the repair or replacement of failed components (Kenne and Nkeungoue 2008). Preventive maintenance is a schedule of maintenance actions aimed at preventing system breakdowns and failures (Wang et al. 2015; Zhong and Jin 2014). Recently, condition-based maintenance (CBM) has become more desirable in many application domains where the safety, reliability and availability of the system are critical. It has attracted researchers because it aims to balance the maintenance cost, which is high in PM, against the failure cost, which is high in CM. In addition, CBM can increase the productivity, efficiency and availability of systems.

For system maintenance, condition monitoring is becoming popular in industry because of its effectiveness in detecting potential failures. Condition monitoring techniques can improve system availability and reduce downtime. If a hidden defect is already present, condition monitoring may help identify the failure so that maintenance actions can be taken. For effective maintenance, advance prediction of such a failure and its development is very important for ordering spare parts and preparing maintenance personnel, which requires careful planning well before the failure actually occurs.

Predictive maintenance aims to optimize the tradeoff between maintenance costs and performance costs by increasing availability and reliability while eliminating unnecessary maintenance activities. Besides condition monitoring and failure diagnosis, predictive maintenance also estimates the time of a future failure and the residual useful life of the system, which is called prognostics. Prognostics can provide the current or predicted system health status, and this information can be used to initiate maintenance actions. The motivation of this research comes from the complexity of finding optimal predictive maintenance strategies for a system based on diagnostics and prognostics information. This research integrates predictive maintenance with system degradation and resource constraints, and employs a dynamic programming model to find the optimal strategies.

The organization of the paper is as follows:

In the “Literature review” section, existing models and algorithms for maintenance optimization are briefly reviewed and the contribution of this research is clarified. In the “System degradation and maintenance description” section, the system degradation and maintenance are described. In the section “The integrated decision model of system maintenance management”, a dynamic programming maintenance scheme integrating predictive maintenance and resource constraints is proposed, and the corresponding solution algorithm is developed. In the “Case study” section, a case is analyzed and discussed. Finally, conclusions are drawn in the “Conclusions” section.

Literature review

Maintenance optimization problem

Maintenance models are used to find optimal maintenance schedules for a variety of systems and have been studied extensively. Fitouhi and Nourelfath (2012) dealt with the problem of integrating non-cyclical PM and tactical production planning for a single machine. Bartholomew-Biggs et al. (2009) considered optimal PM scheduling and dealt with the problem of scheduling imperfect preventive maintenance for equipment. These studies mainly focus on minimizing system cost and identifying the preventive maintenance period. Moreover, the delay-time concept and its modeling techniques can be applied to the maintenance of various industrial equipment (Christer 1999; Wang 2011). Wang et al. (2000) proposed a stochastic process, called a gamma process, with the hazard rate as its mean for the prediction of residual life. The condition information was treated as expert judgment based on vibration analysis. Marseguerra et al. (2002) studied optimal maintenance solutions for continuously monitored multi-component systems with Markov deteriorating processes. Monte Carlo simulation was used for the optimization and was more efficient than the analytical method. Wu et al. (2010) developed an online adaptive condition-based maintenance method for mechanical systems with a concentration on condition monitoring. A multi-component systems approach for condition-based maintenance optimization was applied by Jafari and Makis (2015) where economic dependence between components existed. However, for PM, prognostics and diagnostics information is not considered; that is, the decisions are based on reliability information obtained from similar systems. CBM, in turn, uses only current component state information. In this paper, the maintenance strategy considers not only the diagnostics information (or health monitoring information), but also the prognostics information (or system degradation information).

Maintenance and resources optimization problem

Predictive maintenance can provide the current or predicted health status of a system and use the prognostics information to initiate maintenance actions. The study of predictive maintenance concerns resource management, maintenance strategy optimization and evaluation. Recently, mathematical models have been established to describe predictive maintenance with consideration of spare parts inventory (Park and Lee 2011; Fitouhi and Nourelfath 2014; Jafari and Makis 2016). Basten et al. (2012) designed an optimal solution algorithm for the joint problem of level of repair analysis (LORA) and spare parts stocking. Wang (2012) presented a joint optimization method for both spare parts inventory control and the preventive maintenance inspection interval. All these studies entail the joint optimization of predictive maintenance and spare parts inventory. Thus, in the current literature, predictive maintenance has mainly focused on the optimization of spare parts inventory, and maintenance strategies are developed only with consideration of system status forecasting. In this paper, predictive maintenance focuses on the optimization of both spare parts inventory and maintenance personnel, and maintenance strategies are developed with consideration of diagnostics and prognostics information.

Maintenance and system degradation problem

Usually, the system condition deteriorates with usage and age over time. The degradation will lead to failures and downtime unless maintenance activities are performed. The reliability evolution of a system depends on its structure and health status. In order to shorten downtime and reduce maintenance costs, mathematical models have been established to describe various degradation processes (Van Horenbeek et al. 2013; Liu et al. 2013; Molavi and Zahiri 2015). Aiming to optimize condition-based maintenance considering maintenance resources, Wang et al. (2008) presented a condition-based order-replacement policy for a single-unit system. Based on residual useful life estimation, Rausch and Liao (2010) addressed a joint production and spare parts inventory control strategy driven by condition-based maintenance. The degradation information was utilized to initiate replacement actions in conjunction with spare parts inventory control under both production lot size and due-date constraints. Some studies address spare parts optimization using condition monitoring information, without maintenance optimization. Li and Ryan (2011) developed a model that incorporates real-time condition monitoring information into inventory management decisions for spare parts, and used the degradation model to derive the life distribution of a functioning part and estimate the demand distribution for spare parts. The above studies illustrate the joint optimization of condition-based maintenance and spare parts inventory with consideration of degradation information. However, system degradation includes both deterioration and aging, and most of the literature does not consider aging information, although it has an important influence on system health status and predictive maintenance strategy. This paper proposes an integrated dynamic maintenance model using prognostics information (both deterioration and aging) with consideration of resource constraints.

The assumptions and features of the studies reviewed can be summarized in Table 1.

Table 1 Major methods used for maintenance optimization

Research motivation and contributions

These studies are of interest and could be applied in a wide variety of industries such as semiconductor manufacturing, transportation and power generation. However, few works address the integration of resources, system degradation and predictive maintenance optimization.

The implementation of maintenance actions may require different resources such as spare parts and maintenance personnel. This paper provides a new method that incorporates prognostics information with available resources to obtain the optimal maintenance strategy. Thus, an integrated decision model for both predictive maintenance and resources is presented. The contributions of the paper can be summarized as follows. First, system degradation information such as deterioration and aging is integrated into the proposed predictive maintenance model. Second, unlike system maintenance with a single failure state, a hidden semi-Markov model (HSMM) is used to classify the different system failure states (prognostic and diagnostic information), obtain the transition probabilities among the failure states, and compute the duration of each health state (Liu et al. 2015). Different maintenance actions can then be developed for each failure state, and each system health state will be optimized to a different expected level. Finally, a dynamic programming model is proposed to solve the integrated decision optimization problem with consideration of multiple failure states and resource constraints, from which the overall optimal maintenance strategy is obtained.

System degradation and maintenance description

Degradation description

System degradation includes deterioration and aging. Without any maintenance action, the system will not automatically move to a better state; it will only move randomly to a worse state. This phenomenon is denoted as deterioration (see Fig. 1).

In this paper, the upper triangular transition probability matrix is used to describe the deterioration as follows:

$$\begin{aligned} P(h_t =j|h_{t-1} =i)=a_{ij} =0,\quad \forall j<i \end{aligned}$$
(1)

where \(a_{ij}\) is the probability that the system is in health state j at time t given that it was in health state i at time \(t-1\).
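
As a minimal sketch of the constraint in Eq. (1), the following Python check tests whether a candidate transition matrix is row-stochastic with zeros below the diagonal. The function name `is_deterioration_matrix` and the three-state matrix are illustrative assumptions, not values taken from this paper.

```python
def is_deterioration_matrix(a, tol=1e-9):
    """Return True if `a` is row-stochastic and a[i][j] == 0 for all j < i (Eq. 1)."""
    n = len(a)
    for i, row in enumerate(a):
        if len(row) != n:
            return False                      # matrix must be square
        if any(p < -tol for p in row):
            return False                      # probabilities must be non-negative
        if abs(sum(row) - 1.0) > tol:
            return False                      # each row must sum to one
        if any(abs(row[j]) > tol for j in range(i)):
            return False                      # no transition to a better state
    return True


# Hypothetical three-state example: healthy, degraded, failed (absorbing).
A = [
    [0.90, 0.08, 0.02],
    [0.00, 0.85, 0.15],
    [0.00, 0.00, 1.00],
]
print(is_deterioration_matrix(A))
```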

As the working time of the system increases, the system ages and its failure risk increases correspondingly, so the system becomes more likely to move to a worse health state. This is denoted as the aging phenomenon.

Previous HSMM-based health management assumes that the transition probabilities are only state-dependent, which means that the probability of a transition to a less healthy state does not increase with age. In order to characterize system aging, an aging factor that discounts the probabilities of staying at the current state while increasing the probabilities of transitions to less healthy states is integrated into the HSMM. With an iterative algorithm, the original transition matrix obtained from the HSMM can be renewed with the aging factor. Although the system health state will be changed by adopting maintenance actions, the health states obtained by the HSMM with the aging factor are still the best match for the real system health states. With the classification information obtained from the HSMM with the aging factor, the current system health states can be identified and transition probabilities can be generated.
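
The exact update rule for the aging factor is not given in this section. The sketch below therefore assumes one simple possibility, purely for illustration: each self-transition probability is scaled by \(1-\delta\), and the removed probability mass is spread over the worse states in proportion to their current probabilities. The function name `apply_aging`, the parameter `delta` and the example matrix are all hypothetical.

```python
def apply_aging(a, delta):
    """Assumed aging update (not the paper's rule): scale a[i][i] by (1 - delta)
    and move the removed mass to worse states j > i in proportion to a[i][j]."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    n = len(a)
    aged = []
    for i, row in enumerate(a):
        worse = sum(row[i + 1:])
        if worse == 0.0:
            aged.append(list(row))            # absorbing row: nothing to redistribute
            continue
        moved = row[i] * delta                # mass taken from the self-transition
        new_row = list(row)
        new_row[i] = row[i] - moved
        for j in range(i + 1, n):
            new_row[j] = row[j] + moved * row[j] / worse
        aged.append(new_row)
    return aged


# Hypothetical three-state matrix, aged once with delta = 0.1.
A = [
    [0.90, 0.08, 0.02],
    [0.00, 0.80, 0.20],
    [0.00, 0.00, 1.00],
]
aged = apply_aging(A, 0.1)
print(aged)
```

Applying the function repeatedly, once per time step, would mimic an iterative renewal of the matrix; whether this matches the authors' iteration is not established by the text.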

Fig. 1 Illustration of the deterioration transition probability

Fig. 2 Transition probability of degradation. a Decrement mode. b Bell mode

Based on system deterioration and aging, common degradation modes can be divided into two kinds: the decrement mode and the bell mode (see Fig. 2). In the decrement mode, the transition probabilities among health states decrease progressively and the system degrades gradually over time. In the bell mode, the transition probabilities among several initial health states increase, while the transition probabilities among the other health states decrease gradually.

Maintenance description

Maintenance actions

Various maintenance actions can be adopted to slow down the system degradation. In this paper, the system existing age is introduced into the model to show the effect of maintenance actions. The system lifetime increases with time, whereas the system existing age decreases when maintenance actions are adopted. Thus, after maintenance, the system existing age is smaller than the system lifetime. Let \(D(h_{i})\) denote the duration of a system staying at state \(h_{i}\). Once a system enters health state \(h_{i}\), its existing age equals the sum of the existing duration of the system in health state \(h_{i}\) and the durations of the system in the health states before \(h_{i}\).

Let \(\bar{D} (h_i ,n)\) denote the existing duration of the system at the n-th stage since it entered state \(h_{i}\). Then,

$$\begin{aligned}&D(h_i )=\mu (h_i )+\rho \sigma ^{2}(h_i ) \end{aligned}$$
(2)
$$\begin{aligned}&\rho =\left( T-\sum _{i=1}^N {\mu (h_i )}\right) /\sum _{i=1}^N {\sigma ^{2}(h_i )} \end{aligned}$$
(3)
$$\begin{aligned}&\bar{D} (h_i ,n)=(1-a_{ii} )D(h_i ) \end{aligned}$$
(4)

where \(\mu (h_{i})\) is the mean of the duration probability of health state \(h_{i}\) and \(\sigma ^{2}(h_{i})\) is the variance of the duration probability of health state \(h_{i}\). T describes the lifetime of a system, and \(T={\sum }_{i=1}^N {D(h_i )} \).

Thus, based on Eqs. (2)–(4), the system existing age \(t_{n}\) at the n-th stage since the system entered state \(h_{i}\) can be computed as follows:

$$\begin{aligned} t_n =\left\{ {{\begin{array}{ll} \bar{D} \left( {h_i ,n} \right) &{}\quad i=1 \\ \bar{D} \left( {h_i ,n} \right) +{\sum }_{j=1}^{i-1} D\left( {h_j } \right) &{}\quad i>1 \\ \end{array} }} \right. \end{aligned}$$
(5)
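
Eqs. (2)–(5) can be traced with a short numerical sketch. The function names and all input values below (means, variances, lifetime and the self-transition probability) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def state_durations(mu, var, lifetime):
    """Eqs. (2)-(3): D(h_i) = mu_i + rho * var_i, with rho set so that sum(D) == lifetime."""
    rho = (lifetime - sum(mu)) / sum(var)        # Eq. (3)
    return [m + rho * v for m, v in zip(mu, var)]  # Eq. (2)


def existing_age(i, a_ii, durations):
    """Eqs. (4)-(5): existing age in health state h_i (i is 1-based)."""
    d_bar = (1.0 - a_ii) * durations[i - 1]      # Eq. (4)
    return d_bar + sum(durations[: i - 1])       # Eq. (5); the sum is empty for i == 1


# Hypothetical inputs: three health states and a lifetime T of 24 time units.
mu = [10.0, 6.0, 4.0]
var = [4.0, 2.0, 2.0]
D = state_durations(mu, var, lifetime=24.0)
age = existing_age(2, a_ii=0.8, durations=D)
print(D, age)
```

By construction, the durations returned by `state_durations` sum to the lifetime T, which is the condition stated after Eq. (4).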

If the system health state is \(h_{i}\), then the available maintenance actions set can be expressed as follows:

$$\begin{aligned} { PM}_{i}= & {} \{{ PM}_{ii}, { PM}_{{ Re}}\}\nonumber \\&\cup \,\{{ PM}_{ij}{\vert }1\le j \le i-1\},\quad i=2, 3, {\ldots }, L-1 \end{aligned}$$
(6)

where

\({ PM}_{ii}\):

Adopt some minor non-replacement maintenance actions such as lubrication, adjustment and cleaning. The purpose is to keep the system staying at the current health state \(h_{i}\) as long as possible.

\({ PM}_{ij}\):

Adopt some imperfect non-replacement maintenance actions such as repairing the internal loss of parts. The maintenance goal is to restore the system from current health state \(h_{i}\) to a better health state \(h_{j}\), and then the system existing age will become smaller.

\({ PM}_{{ Re}}\):

Adopt replacement maintenance actions by directly using new parts to replace the old ones. The system health state will be restored to the initial health state \(h_{1}\).
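
The action set of Eq. (6) can be enumerated with a few lines of Python. Eq. (6) is stated for \(i=2, 3, {\ldots }, L-1\); the replacement-only rule for the failure state L follows the model description given later in the paper, while the handling of state 1 (no imperfect actions) and the string labels are assumptions made for this sketch.

```python
def available_actions(i, L):
    """Return labels of the maintenance actions available in health state i (1-based)."""
    if not 1 <= i <= L:
        raise ValueError("state index out of range")
    if i == L:
        return ["PM_Re"]                                  # failure state: replacement only
    minor = f"PM_{i}_{i}"                                 # PM_ii: stay in h_i
    imperfect = [f"PM_{i}_{j}" for j in range(1, i)]      # PM_ij: restore to h_j, j < i
    return [minor, "PM_Re"] + imperfect


print(available_actions(3, 5))
```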

Maintenance time

For different system health states, various maintenance actions can be adopted to optimize the system to different states and objectives. System maintenance and replacement time cannot be ignored. The maintenance time is illustrated in Fig. 3. Although the maintenance actions are able to ensure high system reliability, the frequent maintenance actions could also decrease the system utilization, which can be expressed as follows:

$$\begin{aligned} Au= & {} \frac{{ Total}\_{ uptime}}{{ Total}\_{ uptime} +{ Total}\_{ downtime}} \nonumber \\= & {} \frac{\sum T_i }{\sum T_i +\sum T_{{ PM}1} +\sum T_{{ PM}2} +\sum T_{{ Re}} } \end{aligned}$$
(7)

where \({ Au}\) represents the system utilization. \(T_i\) is the working time between two maintenance actions. \(T_{{ PM}1} \) is the time required by a minor maintenance action \({ PM}_{ii}\). \(T_{{ PM}2} \) is the time required by an imperfect maintenance action \({ PM}_{ij}\). \(T_{{ Re}} \) is the time required by a replacement maintenance action \({ PM}_{{ Re}}\).
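
A small worked example of Eq. (7) follows. The function name and the time values are hypothetical; they only show how frequent maintenance lowers the utilization.

```python
def utilization(uptimes, minor_times, imperfect_times, replacement_times):
    """Eq. (7): Au = total uptime / (total uptime + total downtime)."""
    up = sum(uptimes)
    down = sum(minor_times) + sum(imperfect_times) + sum(replacement_times)
    if up + down == 0:
        raise ValueError("no operating or maintenance time recorded")
    return up / (up + down)


# Hypothetical history: three working periods, two minor actions,
# one imperfect action and one replacement (all in hours).
au = utilization([100, 80, 120], [2, 2], [6], [10])
print(au)
```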

Fig. 3 Illustration of maintenance time

Maintenance risk

In this paper, different maintenance actions can be adopted to slow down the system degradation, including non-replacement maintenance actions and the replacement maintenance action. Unlike replacement, non-replacement maintenance actions carry the risk of not achieving the expected maintenance level. For example, when the system is in health state \(h_{i}\), the maintenance action \({ PM}_{ij}\) can be adopted with the target of restoring the system to a better state \(h_{j}\). However, the actual health state \(h_{k}\) after performing maintenance action \({ PM}_{ij}\) may not be \(h_{j}\): it may be a worse state than \(h_{j}\), and the action may even cause a failure (i.e. the worst maintenance outcome, \(h_{k}=h_{L}\)).

Moreover, the system ages as its lifetime increases, so the maintenance risk also increases and the actual maintenance effect moves further away from the expected level. In this paper, r is used to denote the maintenance risk.

The integrated decision model of system maintenance management

Notations

The following notation will be used throughout this paper:

\(C_{Sy}\)

System cost

\(C_{M}\)

System maintenance cost

\(C_{F}\)

System failure cost

\(C_{f}\)

Failure independent cost

\(C_{o}\)

Failure dependent cost

\(C_{Av}\)

Resources cost

\(C_{s}\)

Spare parts shortage cost

\(C_{p}\)

Personnel cost

\({ CA}\)

Downtime cost

\(c_{s}\)

Spare parts shortage cost per unit

F

Number of maintenance stages

\(x_{n}\)

Initial inventory level at the n-th maintenance stage

y

Total personnel quantity

\(c_{p}\)

Penalty coefficient when the personnel is missing

\(c_{f,i}\)

Failure independent cost when the system health state is in i

\(c_{o}(i, t_{n})\)

Failure dependent cost when the system health state is in i at time \(t_{n}\)

\(P_{ji}(t_{n})\)

Transition probability from health state j to state i when the existing age is \(t_{n}\)

\(c_{m,i}\)

Maintenance independent cost when the system health state is in i

\(c_{M}(a_{n}{\vert }i)\)

Maintenance dependent cost generated by adopting maintenance action \(a_{n}\) when the health state is in i

\(r(a_{n}{\vert }i)\)

Spare parts demand quantity generated by adopting the maintenance action \(a_{n}\) when the health state is in i

\(p(a_{n}{\vert }i)\)

Personnel demand quantity generated by adopting the maintenance action \(a_{n}\) when the health state is in i

\(C_{T}(n, F-n, t_{n}, a{\vert }i)\)

Expected total cost corresponding to the future \(F-n\) stages by the maintenance action a

\(C(n, F-n, t_{n}, a{\vert }i)\)

Expected total system cost and resources cost corresponding to the future \(F-n\) stages by maintenance action a

\(T(n, F-n, t_{n}, a{\vert }i)\)

Expected total maintenance time corresponding to the future \(F-n\) stages by maintenance action a

\(C_{Av}(y_{n}, a_{n}, b_{n}{\vert }i)\)

Resources cost at the n-th maintenance stage while the health state is in i

b

Transition probability shape factor (i.e. degradation)

d

Downtime factor (i.e. downtime cost coefficient)

r

Maintenance risk factor

Maintenance management

In this paper, a novel model is proposed for system maintenance. The basic assumptions of the maintenance model are as follows.

  1. The system health state \(h_{i} \in [1, 2, {\ldots }, L]\) can be obtained from the HSMM diagnosis and prognosis model (Liu et al. 2015).

  2. The system will transfer to a worse health state with the increase of the system lifetime.

  3. When the system health state is \(h_{i}\), the available maintenance actions are as follows:

     $$\begin{aligned} { PM}_{i}= & {} \{{ PM}_{ii}, { PM}_{{ Re}}\}\\&\cup \, \{{ PM}_{ij}{\vert }1\le j \le i-1\},\quad i=2, 3, {\ldots }, L-1 \end{aligned}$$

     When \(i=L\), non-replacement maintenance actions cannot be adopted. The only available maintenance action is to replace the current component of the system with a new one.

  4. The replacement cost \(C_{{ Re}}\) and replacement time \(T_{{ Re}}\) will not change with the system health state. The cost and time for non-replacement maintenance actions are \(C_{{ PM}}\) and \(T_{{ PM}}\), respectively.

  5. The failure cost \(C_{F}\) is a non-decreasing function of the health state \(h_{i}\) and the system existing age \(t_{n}\).

Cost model

The costs directly related to system maintenance actions consist of the replacement cost (corresponding to \({ PM}_{{ Re}}\)), the minor maintenance cost (corresponding to \({ PM}_{ii}\)) and the imperfect maintenance cost (corresponding to \({ PM}_{ij}\)), which together are denoted as the maintenance cost \((C_{M})\). The costs indirectly related to system maintenance actions include the failure cost \((C_{F})\) and the reserved resources cost \((C_{Av})\). Adopting maintenance actions can slow down system deterioration and aging, so the failure cost caused by system degradation can be reduced. The failure cost is therefore related to system maintenance and is a part of the total cost. On the other hand, different resources are needed to implement the maintenance actions, so the resources cost is also considered in the total cost of the maintenance model. The failure cost is related to the system health state, while the resources cost is related to the maintenance actions.

Thus, the total objective \(C_{{ Total}}\) includes the system cost \(C_{Sy}\) (comprising the maintenance cost \(C_{M}\) and the failure cost \(C_{F}\)), the resources cost \(C_{Av}\) and the downtime cost \({ CA}\). Downtime cost is the production or application loss of the system as a result of performing maintenance. Moreover, the downtime cost can increase due to insufficient spares, unavailable personnel, etc. \({ CA}\) is related to the maintenance action time \(T_{M}\) and the downtime cost coefficient d. Hence, the total objective function of the maintenance model considering both cost and system utilization can be obtained as follows.

$$\begin{aligned} \min C_{{ Total}}= & {} C_{Sy}+C_{Av}+{ CA}=C_{M}+C_{F}+C_{Av}\nonumber \\&+\,T_{M}\times d \end{aligned}$$
(8)

System cost

Reducing the operating-support cost and increasing safety are major interests of industry. The maintenance of a system directly affects its availability and production rate. It causes two kinds of costs: failure cost and maintenance cost. The failure cost is often larger than the maintenance cost because it may involve failure identification, insufficient spare parts at the time of failure, and unavailable personnel due to the unplanned nature of the failure. However, performing too many maintenance actions increases the maintenance cost. Hence, the system cost is defined as the sum of the maintenance cost and the failure cost as follows.

$$\begin{aligned} C_{Sy}=C_{F}+C_{M } \end{aligned}$$
(9)

Failure cost consists of failure independent cost \((C_{f})\) and failure dependent cost (i.e., operation cost \(C_{o})\). Failure independent cost can be defined as the fixed cost due to any failure (cool down, diagnostics, disassemble, re-assemble, start-up, and warm up), and failure dependent cost is the repairing and replacement cost of failed components. It can be expressed as follows:

$$\begin{aligned} C_F =C_f +C_o =c_{f,i} +\sum \limits _{j=1}^i {P_{ji} (t_n )c_o (i,t_n )} \end{aligned}$$
(10)

where \(c_{f,i}\) denotes the failure independent cost when the system health state is in i. \(c_{o}(i, t_{n})\) is the failure dependent cost when the system health state is in i at \(t_{n}\). \(P_{ji}(t_{n})\) is the transition probability from health state j to state i when the system existing age is \(t_{n}\).

Similar to failure cost, maintenance cost consists of maintenance independent cost and maintenance dependent cost. Maintenance independent cost can be defined as the fixed cost due to any maintenance (stop the system, cool down, disassemble, re-assemble, start-up, and warm up), and maintenance dependent cost is the cost of the specific maintenance actions. It can be expressed as follows:

$$\begin{aligned} C_M =c_{m,i} \delta \left( {a_n } \right) +c_M \left( {a_n |i} \right) \end{aligned}$$
(11)

where \(c_{m,i}\) denotes the maintenance independent cost when the system health state is in i. If maintenance action \(a_{n}\) is adopted, then \(\delta (a_{n})\) is 1. Otherwise, \(\delta (a_{n})\) is 0. \(c_{M}(a_{n}{\vert }i)\) describes the maintenance dependent cost generated by adopting maintenance action \(a_{n}\) when the system health state is in i.

Based on Eqs. (9)–(11), the total system cost can be obtained as follows.

$$\begin{aligned} C_{Sy}= & {} C_M +C_F =c_{m,i} \delta \left( {a_n } \right) +c_M \left( {a_n |i} \right) +c_{f,i}\nonumber \\&+\sum \limits _{j=1}^i P_{ji} \left( {t_n } \right) c_o \left( {i, t_n } \right) \end{aligned}$$
(12)
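
Eq. (12) can be evaluated directly once its ingredients are known. In the sketch below, the function name, the argument names and all numbers are hypothetical; `p_into_i` stands for the list of \(P_{ji}(t_{n})\) for \(j=1, {\ldots }, i\).

```python
def system_cost(c_m_i, action_taken, c_M, c_f_i, p_into_i, c_o):
    """Eq. (12): C_Sy = c_{m,i} * delta(a_n) + c_M(a_n|i) + c_{f,i} + sum_j P_ji(t_n) * c_o(i, t_n)."""
    delta = 1.0 if action_taken else 0.0
    maintenance = c_m_i * delta + c_M                    # Eq. (11)
    failure = c_f_i + sum(p * c_o for p in p_into_i)     # Eq. (10)
    return maintenance + failure


# Hypothetical values for a system in health state i = 2.
cost = system_cost(c_m_i=50.0, action_taken=True, c_M=120.0,
                   c_f_i=30.0, p_into_i=[0.1, 0.6], c_o=200.0)
print(cost)
```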

Resources cost

In this paper, resources for maintenance implementation include spare parts and personnel. The maintenance strategy is applicable only if the required resources are available at the time of the maintenance. If the maintenance cannot be performed due to a lack of resources, more downtime cost is incurred. It is therefore important to incorporate available resource information into the maintenance model. The resources cost consists of the spare parts shortage cost \((C_{s})\) and the personnel cost \((C_{p})\), and can be written as follows:

$$\begin{aligned} C_{Av}=C_{s} +C_{p} \end{aligned}$$
(13)

The spare parts shortage can affect the implementation of maintenance actions. Furthermore, production processes might be affected because the system cannot be maintained in a timely manner due to the shortage. Therefore, the spare parts shortage cost is included in the total cost model in order to avoid a high shortage penalty. Let \(c_{s}\) be the spare parts shortage cost per unit; then the shortage cost can be obtained as follows:

$$\begin{aligned} C_s =\sum \limits _{j=1}^i P_{ji} \left( {t_n } \right) { max}\left\{ {0,c_s \left( {r\left( {a_n |i} \right) -x_n } \right) } \right\} \end{aligned}$$
(14)

where, \(x_{n}\) denotes the initial inventory level at the n-th maintenance stage. \(r(a_{n}{\vert }i)\) describes the spare parts demand quantity generated by adopting the maintenance action \(a_{n}\) when the system health state is in i.

In the proposed model, if the required personnel constraint is not satisfied, then a distinct penalty cost will be defined for each health state, which is calculated as the product of the missing personnel number and a penalty coefficient. Let \(c_{p}\) be the penalty coefficient, y denote the total personnel quantity and \(p(a_{n}{\vert }i)\) describe the personnel demand quantity generated by adopting the maintenance action \(a_{n}\) when the system health state is in i. Then the personnel cost can be written as follows:

$$\begin{aligned} C_p =\sum \limits _{j=1}^i {P_{ji} (t_n )\max \{0,c_p (p(a_n|i)-y)\}} \end{aligned}$$
(15)

Based on Eqs. (13)–(15), the available resources cost can be obtained.

$$\begin{aligned} C_{Av} =C_s +C_p= & {} \sum \limits _{j=1}^i P_{ji} \left( {t_n } \right) \left[ { max}\left\{ {0,c_s \left( {r\left( {a_n |i} \right) -x_n } \right) } \right\} \right. \nonumber \\&\quad \left. +\,{ max}\left\{ {0,c_p \left( {p\left( {a_n |i} \right) -y} \right) } \right\} \right] \end{aligned}$$
(16)
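
The resources cost of Eq. (16) can be sketched as follows. Because the bracketed shortage and personnel terms do not depend on the summation index j, the sum over \(P_{ji}(t_{n})\) factors out. The function name, argument names and numbers are hypothetical.

```python
def resources_cost(p_into_i, c_s, parts_demand, inventory, c_p, staff_demand, staff):
    """Eq. (16): expected spare parts shortage cost plus personnel penalty."""
    shortage = max(0.0, c_s * (parts_demand - inventory))   # bracket term of Eq. (14)
    penalty = max(0.0, c_p * (staff_demand - staff))        # bracket term of Eq. (15)
    return sum(p_into_i) * (shortage + penalty)


# Hypothetical case: 3 parts needed with 1 in stock, 2 technicians needed with 2 available.
c_av = resources_cost(p_into_i=[0.25, 0.5], c_s=40.0, parts_demand=3, inventory=1,
                      c_p=100.0, staff_demand=2, staff=2)
print(c_av)
```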

The dynamic programming model

In this paper, we assume that the system maintenance has F maintenance stages. The model aims to obtain the optimal maintenance strategies in the future F maintenance stages. So it can be transformed into an F-stage decision problem. For each stage, the optimal maintenance strategy is obtained and the optimal maintenance action is chosen. Thus, the optimal strategy of each stage composes the overall optimal strategy.

Based on Eq. (6), the optimal decision is to choose the optimal maintenance action. The results of the available maintenance action set are as follows (Liu et al. 2013):

\({ PM}_{ii}\):

The system stays at the current state after the minor non-replacement maintenance, and the system existing age is reduced to \({\sum }_{k=1,2...,i-1} {t_{(k)} } +E\left\lfloor \lambda (t_n -{\sum }_{k=1,2...,i-1} {t_{(k)} } )+0.5 \right\rfloor \).

\({ PM}_{ij}\):

The system will restore to a better health state after imperfect maintenance, and the system existing age is reduced to \({\sum }_{k=1,2...,j-1} {t_{(k)} } +E\left\lfloor \lambda (t_n -{\sum }_{k=1,2...,j-1} {t_{(k)} } )+0.5 \right\rfloor \).

\({ PM}_{{ Re}}\):

The system health state will restore to \(h_{1}\) after replacing the old parts by new ones, and the system existing age is reduced to 0.

Here, \(t_{(k)}\) represents the duration of a system staying at health state \(h_{k}\), and its value can be obtained based on Eqs. (2) and (3). \(E\left\lfloor x \right\rfloor \) is the floor function and \(E\left\lfloor {x+0.5} \right\rfloor \) rounds x to the nearest integer. \(\lambda \) is the maintenance utility function and \(\lambda \in [0, 1]\). When \(\lambda \) equals 0, the maintenance action obtains the best result. When \(\lambda \) equals 1, the maintenance action has no effect, and the system existing age remains unchanged.
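
The age-reduction rules for \({ PM}_{ii}\), \({ PM}_{ij}\) and \({ PM}_{{ Re}}\) can be written as one small function. This is a sketch under stated assumptions: the function name, the `replace` flag and the duration values are hypothetical, and `target_state` is the 1-based index of the state the action aims at (i for minor maintenance, j for imperfect maintenance).

```python
import math


def age_after_maintenance(t_n, durations, target_state, lam, replace=False):
    """Existing age after maintenance; `durations` holds t_(1), t_(2), ... in order."""
    if replace:
        return 0                                    # PM_Re resets the existing age
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must lie in [0, 1]")
    base = sum(durations[: target_state - 1])       # time spent before the target state
    return base + math.floor(lam * (t_n - base) + 0.5)


# Hypothetical durations and a system in state 2 with existing age 16.
durations = [12, 7, 5]
minor = age_after_maintenance(16, durations, target_state=2, lam=0.5)      # PM_22
imperfect = age_after_maintenance(16, durations, target_state=1, lam=0.5)  # PM_21
print(minor, imperfect)
```

With \(\lambda =1\) the function returns the original age, and with `replace=True` it returns 0, matching the two limiting cases described in the text.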

Thus, the objective function for F maintenance stages can be obtained as follows.

$$\begin{aligned}&{} { Min} C_{T}(n, F-n, t_{n}, a{\vert }i)\nonumber \\&\quad =C(n, F-n, t_{n}, a{\vert }i)+T(n, F-n, t_{n}, a{\vert }i)\times d \end{aligned}$$
(17)

where a is the maintenance action adopted at the n-th maintenance stage. \(t_{n}\) is the system existing age at the n-th maintenance stage. \(C_{T}(n, F-n, t_{n}, a{\vert }i)\) is the total expected cost corresponding to the future \(F-n\) stages by the maintenance action a. \(C(n, F-n, t_{n}, a{\vert }i)\) is the total expected system cost and resources cost corresponding to the future \(F-n\) stages by maintenance action a. \(T(n, F-n, t_{n}, a{\vert }i)\) is the total expected maintenance time corresponding to the future \(F-n\) stages by maintenance action a.

Corresponding to the F maintenance stages, let \(C_{Av}(y_{n}, a_{n}, b_{n}{\vert }i)\) denote the resources cost at the n-th maintenance stage while the system health state is in i. Then, after n stages, the expected total resources cost can be obtained as follows:

$$\begin{aligned}&C_{Av} \left( {n,F-n,y,a_n ,x_n |i} \right) =C_{Av} \left( {y,a_n ,x_n |i} \right) \nonumber \\&\qquad +\sum \limits _{j=i,i+1,\ldots ,L} P_{ij} \left( {t_n } \right) \nonumber \\&\qquad \left\{ {C_{Av} \left( {n+1,F-n-1,y,a_{n+1} ,x_{n+1} |j} \right) ^{*}} \right\} \end{aligned}$$
(18)

Based on Eqs. (16) and (18), the total expected resources cost at the n-th maintenance stage can be obtained as follows.

$$\begin{aligned}&C_{Av} \left( {n,F-n,y,a_n ,x_n |i} \right) \nonumber \\&\quad =\sum \limits _{j=1}^i P_{ji} \left( {t_n } \right) \left[ { max}\left\{ {0,c_s \left( {r\left( {a_n |i} \right) -x_n } \right) } \right\} \right. \nonumber \\&\left. \qquad +\,{ max}\left\{ {0,c_p \left( {p\left( {a_n |i} \right) -y} \right) } \right\} \right] \nonumber \\&\qquad +\sum \limits _{j=i,i+1,\ldots ,L} P_{ij} \left( {t_n } \right) \left\{ C_{Av} ( n+1,F-n-1, x_n \right. \nonumber \\&\qquad \left. -\,r\left( {a_n |i} \right) ,y,a_{n+1} ,x_{n+1} |j)^{*} \right\} \end{aligned}$$
(19)
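The bracketed shortage term in Eq. (19) can be illustrated with a minimal Python sketch. It only covers the two penalty terms \(\max\{0, c_s(r(a_n{\vert }i)-x_n)\}\) and \(\max\{0, c_p(p(a_n{\vert }i)-y)\}\), not the probability weighting or the recursion. The function and argument names are illustrative assumptions; the default values of \(c_s\) and \(c_p\) are those reported later in the case study.

```python
def shortage_cost(spares_needed, spares_on_hand, staff_needed, staff_on_hand,
                  c_s=1200, c_p=1300):
    """Penalty for missing spare parts and personnel at one maintenance stage.

    spares_needed ~ r(a_n | i), spares_on_hand ~ x_n,
    staff_needed  ~ p(a_n | i), staff_on_hand  ~ y   (names are illustrative).
    """
    spare_penalty = max(0, c_s * (spares_needed - spares_on_hand))
    staff_penalty = max(0, c_p * (staff_needed - staff_on_hand))
    return spare_penalty + staff_penalty


# Enough resources: no penalty. Two spares and one technician short: both terms apply.
print(shortage_cost(3, 8, 1, 2))
print(shortage_cost(10, 8, 3, 2))
```

A penalty is charged only when the demand of the chosen action exceeds what is on hand, which is how the resource constraints enter the cost.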

For maintenance action a, the expected total cost in the future \(F-n\) stages can be described as follows:

$$\begin{aligned}&C(n,F-n,t_n ,a | i)\nonumber \\&\quad =C_M (a)+C_F (i,t_n )+C_{Av} (i,a,t_n ) \nonumber \\&\qquad +\sum \limits _{j=i,i+1,\ldots ,L} P_{ij} (t_n )\left\{ C(n+1,F-n-1,t_n +1,a^{\prime } |j)^{*}\right\} \end{aligned}$$
(20)

The total expected cost at the n-th maintenance stage consists of two parts: the sum of the failure cost, maintenance cost and resources cost at the n-th stage, and the total expected cost of all stages from the \((n+1)\)-th maintenance stage onward.

For maintenance action a, the total expected downtime in the future \(F-n\) stages can be described as follows:

$$\begin{aligned}&T(n,F-n,t_n ,a | i)=t(a)\nonumber \\&\quad +\sum \limits _{j=i,i+1,\ldots ,L} P_{ij} (t_n )\left\{ T(n+1,F-n-1,t_n +1,a^{\prime } |j)^{*}\right\} \end{aligned}$$
(21)

The total expected downtime at the n-th maintenance stage consists of two parts: the time of the maintenance action at the n-th maintenance stage, and the total expected maintenance time of all stages from the \((n+1)\)-th maintenance stage onward.

For \(a={ PM}_{ii}\) (i.e. minor maintenance), the total expected cost and downtime can be obtained as follows.

$$\begin{aligned}&C\left( {n,F - n,{t_n},a/i} \right) = {C_M}\left( {P{M_{ii}}} \right) + {C_F}\left( {i,t} \right) \nonumber \\&\qquad +\,{C_{Av}}\left( {n,F - n,P{M_{ii}},{x_n}/i} \right) \nonumber \\&\qquad + \sum \limits _{j = i}^L {r^{j - i}}\left( {1 - r} \right) \left\{ {C{{\left( {n,F - n,\,\,{t^{\prime }} + 1,{a^{\prime }}/j} \right) }^*}} \right\} \nonumber \\&\qquad +\,{r^{L - i}}\left\{ {C{{\left( {n,F - n,t^{\prime } + 1,P{M_{jL}}/L} \right) }^*}} \right\} \end{aligned}$$
(22)
$$\begin{aligned}&T\left( {n,F - n,{t_n},a/i} \right) = t\left( {P{M_{ii}}} \right) \nonumber \\&\qquad + \sum \limits _{j = i}^L {r^{j - i}}\left( {1 - r} \right) \left\{ {T{{\left( {n,F - n,t^{\prime } + 1,a^{\prime }/j} \right) }^*}} \right\} \nonumber \\&\qquad +\,{r^{L - i}}\left\{ {T{{\left( {n,F - n,t^{\prime } + 1,P{M_{jL}}/L} \right) }^*}} \right\} \end{aligned}$$
(23)

For \(a={ PM}_{ij}\) (i.e. imperfect maintenance):

$$\begin{aligned}&C\left( {n,F - n,{t_n},a/i} \right) = {C_M}\left( {P{M_{ij}}} \right) + {C_F}\left( {j,{t^*}} \right) \nonumber \\&\qquad +\, {C_{Av}}\left( {n,F - n,P{M_{ij}},{x_n}/i} \right) \nonumber \\&\qquad + \sum \limits _{q = j}^L {r^{q - j}}\left( {1 - r} \right) \left\{ {C{{\left( {n,F - n,{t^*} + 1,a^{\prime }/q} \right) }^*}} \right\} \nonumber \\&\qquad +\,{r^{L - i}}\left\{ {C{{\left( {n,F - n,{t^*} + 1,P{M_{qL}}/L} \right) }^*}} \right\} \end{aligned}$$
(24)
$$\begin{aligned}&T\left( {n,F - n,{t_n},a/i} \right) = t\left( {P{M_{ij}}} \right) \nonumber \\&\qquad + \sum \limits _{q = j}^L {r^{q - j}}\left( {1 - r} \right) \left\{ {T{{\left( {n,F - n,{t^*} + 1,a^{\prime }/q} \right) }^*}} \right\} \nonumber \\&\qquad +\,{r^{L - i}}\left\{ {T{{\left( {n,F - n,{t^*} + 1,P{M_{qL}}/L} \right) }^*}} \right\} \end{aligned}$$
(25)

For \(a={ PM}_{{ Re}}\) (i.e. replacement):

$$\begin{aligned}&C\left( {n,F - n,{t_n},a/i} \right) = {C_M}\left( {P{M_{Re}}} \right) + {C_F}\left( {1,0} \right) \nonumber \\&\qquad +\,{C_{Av}}\left( {n,F - n,P{M_{Re}},{x_n}/i} \right) \nonumber \\&\qquad + \sum \limits _{j = i}^L {P_{1j}}\left( 0 \right) \left\{ {C{{\left( {n + 1,F - n - 1,0,a^{\prime }/j} \right) }^*}} \right\} \end{aligned}$$
(26)
$$\begin{aligned}&T\left( {n,F - n,{t_n},a/i} \right) = t\left( {P{M_{Re}}} \right) \nonumber \\&\qquad + \sum \limits _{j \in H} {P_{1j}}\left( 0 \right) \left\{ {T{{\left( {n + 1,F - n - 1,0,a^{\prime }/j} \right) }^*}} \right\} \end{aligned}$$
(27)

where,

$$\begin{aligned} t^{\prime }= & {} \sum \limits _{k = 1,2, \ldots ,i - 1} {t_{\left( k \right) }} +\,E\left( {\lambda \left( {{t_n} - \sum \limits _{k = 1,2, \ldots ,i - 1} {t_{\left( k \right) }}} \right) + 0.5} \right) \\ {t^*}= & {} \sum \limits _{k = 1,2, \ldots ,j - 1} {t_{\left( k \right) }}+\,E\left( {\lambda \left( {{t_n} - \sum \limits _{k = 1,2, \ldots ,j - 1} {t_{\left( k \right) }}} \right) + 0.5} \right) \end{aligned}$$
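As a rough illustration of the two age expressions above, the following Python sketch computes the system age after a non-replacement action. It assumes that \(E(\cdot )\) denotes the integer part, so that \(E(x+0.5)\) rounds x to the nearest integer; this reading is an assumption, as are the function name and the duration values used in the example (Table 3 is not reproduced here). The value \(\lambda =0.6\) is the one given in the case study.

```python
import math


def age_after_maintenance(t_n, durations, target_state, lam=0.6):
    """Age after maintenance that leaves the system in `target_state`.

    durations[k - 1] holds t_(k), the expected duration of health state k.
    Use target_state = i for t' (minor maintenance) and j for t* (imperfect).
    Assumption: E(.) is the integer part, so E(x + 0.5) rounds to nearest.
    """
    base = sum(durations[:target_state - 1])        # time spent in earlier states
    return base + math.floor(lam * (t_n - base) + 0.5)


hypothetical_durations = [20, 15, 10]               # placeholder values, not Table 3
print(age_after_maintenance(30, hypothetical_durations, target_state=2))
print(age_after_maintenance(30, hypothetical_durations, target_state=1))
```

Under this reading, maintenance scales the time already spent in the target state by \(\lambda \) and keeps the time spent in the earlier states unchanged.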

The overall framework of the dynamic programming maintenance model is shown in Fig. 4.

Fig. 4 The framework of the integrated dynamic maintenance model

Fig. 5 Schematic diagram of the experiment setup

A heuristic algorithm can be used to obtain the optimal maintenance strategy \(a(n, F-n, t_{n}{\vert }i)^{*}\) when the system existing age is \(t_{n}\) and its health state is i at the n-th maintenance stage. Solving the maintenance model in this way yields the optimal maintenance strategy set. The detailed algorithm is described as follows:

figure a

Here, \(a(F-m, m, t_{F-m}{\vert }i)^{*}\) denotes the optimal maintenance strategy corresponding to the system existing age \(t_{F-m}\) and health state i. \(A(F-m, m, t_{F-m})^{*}\) describes the optimal maintenance strategy set corresponding to the system existing age \(t_{F-m}\) for all health states. \(A(F-m, m)^{*}\) is the optimal maintenance strategy set at the \((F-m)\)-th maintenance stage.
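The recursive structure of Eqs. (17), (20) and (21) can be sketched in Python as a memoised finite-horizon recursion, which is equivalent to backward induction over the stages. This is not the authors' heuristic algorithm (figure a is not reproduced here); it is a generic sketch under stated assumptions. The hooks `actions` and `step`, and all numbers in the toy example, are hypothetical.

```python
from functools import lru_cache


def make_solver(F, actions, step, d):
    """Return best(n, i, t) -> (minimal expected C + d*T over stages n..F, action).

    actions(i)    -> actions allowed in health state i (hypothetical hook)
    step(i, t, a) -> (stage cost, stage downtime,
                      [(probability, next state, next age), ...])
    """
    @lru_cache(maxsize=None)
    def best(n, i, t):
        if n > F:                                   # no stages left to pay for
            return 0.0, None
        best_value, best_action = float("inf"), None
        for a in actions(i):
            cost, downtime, successors = step(i, t, a)
            future = sum(p * best(n + 1, j, t_next)[0]
                         for p, j, t_next in successors)
            value = cost + d * downtime + future    # Eq. (17): C + T x d
            if value < best_value:
                best_value, best_action = value, a
        return best_value, best_action

    return best


# Toy example with made-up numbers: state 1 = healthy, state 2 = failed.
def toy_actions(i):
    return ("keep", "replace")


def toy_step(i, t, a):
    if a == "replace":
        return 30.0, 1.0, [(1.0, 1, 0)]             # renew: back to state 1, age 0
    if i == 1:
        return 0.0, 0.0, [(0.5, 1, t + 1), (0.5, 2, t + 1)]
    return 100.0, 0.0, [(1.0, 2, t + 1)]            # running failed is costly


best = make_solver(F=2, actions=toy_actions, step=toy_step, d=10.0)
print(best(1, 1, 0))    # value and first action when starting healthy
print(best(1, 2, 0))    # value and first action when starting failed
```

Because the objective is linear in C and T, minimising the combined value \(C+T\times d\) stage by stage gives the same decisions as tracking the two recursions separately and combining them afterwards.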

Case study

In the following, a case is studied to validate the proposed integrated dynamic maintenance model with system prognostics information and resources constraints.

Experimental setup and data acquisition

In this case study, long-term wear tests were conducted at a research laboratory facility (Shanghai Pangyuan Machinery Co.). Three pumps (A, B and C) were worn by running them on oil containing dust. Each pump passed through four states: Level 1 (normal state), Level 2 (degradation state), Level 3 (degradation state), and Level 4 (failure state). The degradation stages in this hydraulic pump wear test correspond to different stages of flow loss in the pumps. Because the flow rate of a pump clearly indicates its health state, the degradation stages corresponding to different degrees of flow loss were defined as the health states of the pump in the test (Liu et al. 2015).

The vibration signals were collected from a pump accelerometer positioned parallel to the swash plate swivel axis, and data were sampled continuously. Figure 5 shows the schematic diagram of the experimental setup. The pump used for testing was a backhoe loader pump: a 74 cc/rev variable displacement pump. The data were collected at a sample rate of 60 kHz with anti-aliasing filters from accelerometers designed to have a usable range of 10 kHz. In many cases, the most discriminating information is hidden in the frequency content of the signals, so a time-frequency representation is needed. In this case study, the signals were processed using wavelet packet decomposition with the Daubechies 10 (db10) wavelet and five decomposition levels, because the db10 wavelet provided the most effective way to capture the fault information in the pump vibration data. The coefficients obtained from the wavelet packet decomposition were used as the inputs.

Data preparation and processing

(1) Health state transition probability

In this case study, the matrix of pump health state transition probability can be obtained from the previous research work (Liu et al. 2015). There are four health states in this case study, including one normal state, two degradation states and one failure state, which are defined as Level 1, Level 2, Level 3 and Level 4 (L), respectively. The matrix of health state transition probability is shown as follows.

$$\begin{aligned} \Phi _t =\left[ {{\begin{array}{ccccc} {1-b^{^{\prime }}(1+\beta _2 )^{t}{\sum }_{j=2}^n {P_{1j}^0 } }&{} {P_{12}^0 (1+\beta _2 )^{t}b^{^{\prime }}}&{} \ldots &{} \ldots &{} {P_{1F}^0 (1+\beta _2 )^{t}b^{^{\prime }}} \\ 0&{} {1-b^{^{\prime }}(1+\beta _2 )^{t}{\sum }_{j=3}^n {P_{2j}^0 } }&{} \ldots &{} \ldots &{} {P_{2F}^0 (1+\beta _2 )^{t}b^{^{\prime }}} \\ \vdots &{} \ldots &{} \ldots &{} \ddots &{} \vdots \\ 0&{} &{} \ldots &{} {1-b^{^{\prime }}(1+\beta _2 )^{t}P_{n-1,F}^0 }&{} {P_{n-1,F}^0 (1+\beta _2 )^{t}b^{^{\prime }}} \\ 0&{} \ldots &{} \ldots &{} 0&{} {P_{FF} } \\ \end{array} }} \right] \end{aligned}$$

where \(\beta _{2}\) is the aging factor (here, \(\beta _{2}=0.18\)) and b is the transition probability shape factor \((b^{^{\prime }}=1-b, 0\le b <1)\). When b equals 0, the system degradation is in the basic decrement mode. The initial transition probabilities among the four health states are given in Table 2. The duration \(t_{(k)}\) for which the system stays at health state \(h_{k}\) is given in Table 3.
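One way to read the matrix \(\Phi _t\) is sketched below in Python: each upper off-diagonal entry is the initial probability scaled by \(b^{\prime }(1+\beta _2)^{t}\), and each diagonal entry is one minus the sum of the scaled entries to its right. The sketch rests on several assumptions: the failure state is absorbing (\(P_{FF}=1\)), the initial matrix used in the example is a placeholder rather than Table 2, and the function name is hypothetical. The printed formula does not say what happens when the scaled entries exceed 1, so the sketch simply raises an error in that case.

```python
def transition_matrix(p0, t, beta2=0.18, b=0.0):
    """Age-dependent transition matrix built from initial probabilities p0.

    Off-diagonal: p0[i][j] * (1 - b) * (1 + beta2) ** t for j > i.
    Diagonal:     1 minus the sum of the scaled off-diagonal entries in the row.
    Last row:     absorbing failure state (assumption: P_FF = 1).
    """
    size = len(p0)
    scale = (1.0 - b) * (1.0 + beta2) ** t
    phi = [[0.0] * size for _ in range(size)]
    for i in range(size - 1):
        for j in range(i + 1, size):
            phi[i][j] = p0[i][j] * scale
        phi[i][i] = 1.0 - sum(phi[i][i + 1:])
        if phi[i][i] < 0.0:
            raise ValueError("scaled transition probabilities exceed 1")
    phi[size - 1][size - 1] = 1.0
    return phi


# Placeholder initial matrix for four health states (not the values of Table 2).
hypothetical_p0 = [
    [0.70, 0.20, 0.07, 0.03],
    [0.00, 0.70, 0.20, 0.10],
    [0.00, 0.00, 0.80, 0.20],
    [0.00, 0.00, 0.00, 1.00],
]
for row in transition_matrix(hypothetical_p0, t=2):
    print([round(p, 4) for p in row])
```

With \(\beta _2>0\) the probability of leaving the current state grows with age t, while a larger b shrinks all outgoing probabilities by the factor \(1-b\).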

Table 2 The initial health state transition matrix
Table 3 Expected duration for each health state
(2) Maintenance action cost and time

Table 4 Maintenance actions cost
Table 5 Maintenance actions time

The maintenance cost and time of various maintenance actions are given in Tables 4 and 5, respectively.

(3) Failure cost

The failure cost depends on the system health state and its lifetime. The failure-independent cost is given as follows:

$$\begin{aligned} c_{f,i} =\left\{ {{\begin{array}{ll} 150 &{}\quad i=1 \\ 200 &{}\quad i=2 \\ 250 &{}\quad i=3 \\ 300 &{}\quad i=4 \\ \end{array} }} \right. \end{aligned}$$

The failure-dependent cost function can be described as follows:

$$\begin{aligned} c_o (i,t)=\left\{ {{\begin{array}{ll} 120i+90t&{} \quad i=1 \\ 80i+93t&{} \quad i=2 \\ 70i+95t&{} \quad i=3 \\ \infty &{} \quad i=4 \\ \end{array} }} \right. \end{aligned}$$
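The two cost definitions above translate directly into a small lookup and a piecewise function. The Python sketch below encodes them as printed; the names are illustrative, and how the two parts are combined into \(C_F(i,t)\) is not restated here, so the sketch keeps them separate.

```python
import math

# Failure-independent cost c_{f,i} per health state i, as listed above.
FAILURE_INDEPENDENT_COST = {1: 150, 2: 200, 3: 250, 4: 300}

# Coefficients (per-state term, per-time term) of c_o(i, t) for states 1 to 3.
_DEPENDENT_COEFFS = {1: (120, 90), 2: (80, 93), 3: (70, 95)}


def failure_dependent_cost(i, t):
    """c_o(i, t): grows linearly with age t; infinite in the failure state."""
    if i == 4:
        return math.inf
    state_coeff, time_coeff = _DEPENDENT_COEFFS[i]
    return state_coeff * i + time_coeff * t


for state in (1, 2, 3, 4):
    print(state, FAILURE_INDEPENDENT_COST[state], failure_dependent_cost(state, 10))
```

The infinite cost in state 4 makes any policy that leaves the system running in the failure state strictly worse than one that intervenes.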
(4) Resource cost

Table 6 Demand quantities of spare parts
Table 7 Demand quantities of personnel
Table 8 Periodical maintenance strategies

The demand quantities of spare parts and personnel are determined by the current system health state and the adopted maintenance actions, which are shown in Tables 6 and 7, respectively.

(5) Maintenance risk

In this paper, the maintenance risk factor r is used to describe the maintenance action risk. As r grows, the maintenance action risk increases. Two conditions are considered: a smaller maintenance risk and a larger maintenance risk.

When the maintenance risk is smaller, it can be shown as follows:

$$\begin{aligned} r(t)=\left\{ {{\begin{array}{ll} {0.05}&{} \quad {1\le t<20} \\ {0.05+0.02(t-20)}&{} \quad {20\le t<45} \\ {0.5}&{} \quad {45\le t} \\ \end{array} }} \right. \end{aligned}$$

When the maintenance risk is larger, it can be shown as follows:

$$\begin{aligned} r(t)=\left\{ {{\begin{array}{ll} {0.2}&{} \quad {1\le t<20} \\ {0.2+0.03(t-20)}&{} \quad {20\le t<45} \\ {0.6}&{} \quad {45\le t} \\ \end{array} }} \right. \end{aligned}$$
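Both risk profiles share the same three-segment shape, so they can be encoded by one function with two parameter sets. The Python sketch below reproduces the piecewise definitions exactly as printed; the function name and the "small"/"large" labels are illustrative assumptions.

```python
# (constant before t = 20, slope on 20 <= t < 45, constant from t = 45 on)
_RISK_PROFILES = {
    "small": (0.05, 0.02, 0.5),
    "large": (0.2, 0.03, 0.6),
}


def maintenance_risk(t, level="small"):
    """Maintenance risk factor r(t) for system life t >= 1, as printed above."""
    if t < 1:
        raise ValueError("r(t) is only defined for t >= 1")
    base, slope, late_value = _RISK_PROFILES[level]
    if t < 20:
        return base
    if t < 45:
        return base + slope * (t - 20)
    return late_value


for t in (10, 30, 44, 45):
    print(t, maintenance_risk(t, "small"), maintenance_risk(t, "large"))
```

Evaluating the middle segment just below t = 45 gives values above the final constant (for example 0.53 versus 0.5 in the smaller-risk case), so the functions as printed are not continuous at t = 45; the sketch keeps them as given.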

The values of parameters related to resources and other parameters are given as follows:

$$\begin{aligned} c_{s}= & {} 1200, c_{p}=1300, x_{1}=8, y=2, \\ \lambda= & {} 0.6, d=1000, F=12 \end{aligned}$$

Maintenance strategy comparisons

Periodic maintenance, defined as significant activities carried out regularly to maintain the condition or operational status of the system, is a common maintenance strategy. It includes periodic inspections, periodic repairs and preventive maintenance. In this paper, periodic maintenance aims to obtain the optimal maintenance strategies in one life-cycle of the system. One life-cycle of the system is divided into four stages according to the system health states obtained in (Liu et al. 2015). For each stage, there is a time interval between two maintenance actions, in which the optimal maintenance strategy can be adopted.

When the system is at health state Level 1, minor non-replacement maintenance actions are adopted to keep the system at Level 1 as long as possible. When the system is at health state Level 2 or Level 3, imperfect non-replacement maintenance actions are adopted to restore the system from Level 2 or Level 3 to a better health state. When the system is at health state Level 4, replacement maintenance actions are adopted, in which old parts are directly replaced with new ones. This restores the system from Level 4 to a better health state. The detailed periodic maintenance strategies are shown in Table 8.

To simulate different conditions, three major parameters are chosen: the shape factor b, the downtime factor d and the maintenance risk factor r. Two values are chosen for each parameter, and the values are combined in different ways to illustrate the performance of the proposed methods. They are listed in Table 9.

Table 9 Parameter condition

Based on Table 9, eight combination strategies can be simulated; they are listed in Table 10.
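The eight combinations are the full factorial of two levels for each of the three parameters. A minimal Python sketch of the enumeration follows. The level labels are placeholders because the values of Table 9 are not reproduced here, and the ordering is an assumption chosen so that combinations 1 and 2 differ only in b, combinations 1 and 3 only in d, and combinations 1 and 5 only in r, as the later discussion states.

```python
from itertools import product

# Placeholder level labels; the actual values are those of Table 9.
levels = {
    "r": ("r_small", "r_large"),
    "d": ("d_level_1", "d_level_2"),
    "b": ("b_level_1", "b_level_2"),
}

# product() varies the last factor fastest, so b changes first, then d, then r.
combinations = [dict(zip(levels, values)) for values in product(*levels.values())]

for number, combo in enumerate(combinations, start=1):
    print(number, combo["b"], combo["d"], combo["r"])
```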

Table 10 Parameter combination strategy
Table 11 Comparison of simulation results for different maintenance strategies

Discussion and performance evaluation

Result analysis

In this case study, for each combination, both the periodical maintenance strategy and the integrated maintenance strategy are simulated 10 times. In each simulation, each of the two maintenance strategies runs 600 monitoring cycles to obtain the optimal solution. The maintenance strategy evaluation criteria include the average unit cost c and the average system life t. The results are provided in Table 11.

Table 12 Comparison of simulation results for different maintenance strategies

It can be seen from the 8 combinations in Table 11 that the integrated dynamic maintenance strategy performs better than the periodical maintenance strategy in terms of extending system life, increasing system utilization and decreasing unit cost. Among all combinations of the integrated dynamic maintenance strategy, combination 1 is the best one. Because of the active maintenance actions of the integrated dynamic maintenance strategy, the system stays in a better state longer than under the periodical maintenance strategy. Thus, the average life cycle of a system adopting the dynamic programming maintenance strategy is also longer than that under the periodical maintenance strategy.

Table 12 shows that the periodical maintenance increases the average system life, but it has a long downtime and a very high maintenance cost. The target of the integrated dynamic maintenance strategy is not only to increase system life, but also to choose the optimal maintenance strategy based on the system health states so that the system utilization is improved and the average unit cost is decreased. Compared with the no-maintenance method, the integrated maintenance strategy increases the utilization by 1.39% and decreases the average unit cost by 26.91%. Compared with the periodical maintenance, the integrated dynamic maintenance strategy increases the system utilization by 5.91% and decreases the average unit cost by 17.88%. Compared with the pure dynamic maintenance strategy without considering the constraints, it increases the system utilization by 3.6% and decreases the average unit cost by 8.6%.

Figures 6 and 7 describe the periodical maintenance activities and the integrated maintenance activities within one life-cycle of the system, respectively. In Figs. 6, 7 and 8, the y-axis denotes the maintenance actions, where \(1={ PM}_{11}\), \(2={ PM}_{21}\), \(3={ PM}_{22}\), \(4={ PM}_{31}\), \(5={ PM}_{32}\), \(6={ PM}_{33}\), \(7={ PM}_{{ Re}}\), and the x-axis represents the system life. From Fig. 6, it can be seen that the periodical maintenance still implements a large number of maintenance actions even when the system health state enters Level 3 and the system life exceeds 45. Although these maintenance actions increase the system life, they increase both downtime cost and maintenance cost.

Fig. 6 The periodical maintenance strategy in one life-cycle of the system

Fig. 7 The integrated maintenance strategy in one life-cycle of the system

Fig. 8 The relationships between the integrated maintenance and resource

From Fig. 7, it can be seen that the integrated maintenance strategy implements many maintenance actions such as \({ PM}_{11}\) and \({ PM}_{21}\) in the initial working period of the system. This keeps the system at a better health state for a long time and generates a lower failure cost. As working time increases, once the system enters health state Level 3, its health state quickly returns to Level 3 even if the maintenance action \({ PM}_{31}\) is adopted to restore it from Level 3 to Level 1. Figure 7 indicates that the integrated maintenance does not implement a large number of maintenance actions once the system enters Level 3. Instead, a replacement maintenance action is adopted directly.

Fig. 9 Maintenance actions corresponding to different shape factor b

Resource analysis

The relationships between the integrated dynamic maintenance strategy and resources within one system life-cycle also need to be analyzed. For combination 1 of the integrated dynamic maintenance strategy, the results are shown in Fig. 8. Figure 8 shows that, because of the limited spare parts level, when the system health state enters Level 2 the optimal maintenance strategy does not immediately implement the maintenance action \({ PM}_{21}\) to restore the health state from Level 2 to Level 1. As working time grows, the maintenance actions \({ PM}_{21}\) and \({ PM}_{31}\) are adopted by the optimal maintenance strategy, and this has no effect on system utilization. Moreover, the required resources, including spare parts and personnel, are always available during the optimal maintenance process.

Strategy dynamic analysis

The maintenance strategy can be affected by three main parameters: transition probability shape factor b, downtime factor d and maintenance risk factor r.

(1) The impact of degradation on the integrated dynamic maintenance decision

Combinations 1 and 2 have the same downtime but different degradation. The maintenance actions corresponding to the different values of b are shown in Fig. 9.

As b increases, the degradation mode gradually changes from the decrement mode to the bell mode. In the decrement mode, the system stays at the current state with the greatest probability. In the bell mode, the probability of the system remaining at the current state becomes small, and the system is more inclined to move to a less healthy state. Thus, replacement maintenance actions may be needed to slow down the system aging.

The different degradation modes generate different maintenance strategies. From Fig. 9, it can be seen that the integrated dynamic maintenance strategy can be automatically adjusted to accommodate different degradation modes.

(2) The impact of downtime on the integrated dynamic maintenance decision

Combinations 1 and 3 have the same degradation but different downtime. The maintenance actions corresponding to the different values of d are shown in Fig. 10.

Fig. 10 Maintenance actions corresponding to different downtime factor d

Fig. 11 Maintenance actions corresponding to different maintenance risk r

With different values of the downtime factor d, the system has different utilization and downtime losses. Compared with other maintenance actions, the replacement action requires a longer time and may cause greater downtime losses. As d increases, more non-replacement maintenance actions need to be adopted to reduce the number of replacement maintenance actions and to prevent the system from moving into a worse state.

(3) The impact of maintenance risk on the integrated dynamic maintenance decision

Non-replacement maintenance actions carry a certain amount of maintenance risk. As the maintenance risk increases, the actual maintenance effect moves further away from the target result. Combination 1 and combination 5 have the same degradation and downtime but different maintenance risk. The maintenance actions corresponding to the different values of r are shown in Fig. 11.

Table 13 The results of sensitivity analysis

As the maintenance risk increases, maintenance actions with a larger improvement effect are needed to offset the risk introduced by the maintenance actions themselves, so the selected maintenance actions change. It can be seen from Fig. 11 that the integrated dynamic maintenance strategy can adjust the maintenance actions according to the maintenance risk.

Based on the above analysis, the three main parameters are linked through the maintenance actions. As the parameters change, minor maintenance actions \(({ PM}_{ii})\) and imperfect maintenance actions \(({ PM}_{ij})\) often need to be adopted to reduce the number of replacements, the degradation mode can change, and the system needs to switch maintenance actions. Thus, for different parameter values, the integrated model adopts different maintenance actions to keep the system in a better state and to adapt to the parameter changes. It can be concluded that the integrated dynamic maintenance strategy has good dynamic performance with respect to b, d and r, and that it can obtain the optimal maintenance strategy and maintenance actions for different parameter values.

Sensitivity analysis

In the experiment described above, the proposed models are solved by a heuristic algorithm. To analyze the sensitivity of the proposed methods, convergence, dynamic performance and timeliness stability are used as performance evaluation indexes, and the downtime factor, maintenance stages, health states, transition probability shape factor and maintenance risk factor are selected as the major factors. The orthogonal test method is used to carry out this test, and the range analysis method is used to determine the impact of the factors on the performance of the heuristic algorithm (Annibale et al. 2015). The results in Table 13 show that the ranges of the transition probability shape factor, downtime factor and maintenance risk factor always occupy the top three positions; thus, they are the most sensitive factors affecting the overall performance of the proposed models. The numbers of health states and maintenance stages have lower sensitivity and do not influence the efficiency of the algorithm or the proposed models. Moreover, the performance of the heuristic algorithm is more sensitive to these three parameters.
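Range analysis of an orthogonal test can be sketched in a few lines of Python: for each factor, the responses are averaged per level, and the range is the difference between the largest and smallest level mean. A larger range indicates a more influential factor. The design below is a standard \(L_4(2^3)\) orthogonal array, and the response values are made up for illustration; neither is taken from Table 13.

```python
def range_analysis(design, responses):
    """Return {factor: range of level means} for an orthogonal test.

    design    -- list of runs, each a dict mapping factor name to level
    responses -- measured performance index for each run (same order)
    """
    ranges = {}
    for factor in design[0]:
        by_level = {}
        for run, y in zip(design, responses):
            by_level.setdefault(run[factor], []).append(y)
        level_means = [sum(ys) / len(ys) for ys in by_level.values()]
        ranges[factor] = max(level_means) - min(level_means)
    return ranges


# L4(2^3) orthogonal array for three two-level factors, hypothetical responses.
design = [
    {"b": 1, "d": 1, "r": 1},
    {"b": 1, "d": 2, "r": 2},
    {"b": 2, "d": 1, "r": 2},
    {"b": 2, "d": 2, "r": 1},
]
responses = [10.0, 14.0, 20.0, 26.0]

ranges = range_analysis(design, responses)
ranking = sorted(ranges, key=ranges.get, reverse=True)
print(ranges, ranking)
```

Sorting the factors by range gives the kind of sensitivity ordering that the text reads from Table 13.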

Based on the sensitivity analysis, the proposed method and the heuristic algorithm have good sensitivity, and three parameters are fully considered to solve the integrated maintenance problem, including the transition probability shape factor, downtime factor and maintenance risk factor.

Conclusions

This paper emphasizes the need for a maintenance optimization method that uses prognostic information and resource planning. For many industries, the unavailability of resources is a major problem. At the same time, the system deterioration and aging information from prognosis models is not well utilized in the current literature. Therefore, an effective framework for this problem is the integrated optimization of predictive maintenance and resource planning. From the managerial point of view, the model integrates actual system deterioration and aging information, and also considers the spare parts and the maintenance personnel actually available. In this paper, a dynamic programming model is proposed for this integrated maintenance optimization problem, and the corresponding solution algorithm is developed. Finally, a case is studied to validate the proposed methods. The experimental comparison between the proposed model and the common periodical maintenance strategy shows several benefits of the integrated dynamic maintenance strategy, and indicates that the proposed method is effective for system maintenance using prognostic information.

The long-term wear test experiments were conducted at a research laboratory facility of Shanghai Pangyuan Machinery Co. Industrial implementation and demonstration of the proposed methods in a real factory environment remain to be done; so far only preliminary effects have been obtained. Further application is left for future work.

Furthermore, a number of interesting directions for further research follow from the ideas proposed in this study. Prognostics with degradation data that are missing because of data transmission failures or manipulation errors could be considered. Based on the proposed methods, maintenance scheduling models with prognostic information could also be developed. Finally, extending the methods proposed in this paper to service systems is another opportunity for further enhancing the benefits of maintenance operations through the use of predictive condition information.