1 Introduction

Manufacturing is the foundation of industry. Both conventional and high-tech manufacturers rely on large quantities of equipment to produce goods. Under intense competition on productivity and utilization, most equipment runs 24 h a day. The mechanical wear and tear from incessant use inevitably leads to equipment failures. Mild failures cause defective products, while serious failures result in enormous economic losses, loss of life, and environmental disasters.

In recent years, a series of industrial accidents has highlighted the importance of equipment maintenance. Figure 1 shows a fire at a petroleum refinery plant. Although investigations and improvements have followed each accident, accidents keep happening even though inspection periods have been continually shortened. Traditional time-based maintenance appears to have reached its limit in preventing further accidents [1]. To address the problem, condition-based maintenance has been developed, using techniques such as the shock pulse method (SPM) and vibration detection [1, 2]. Condition-based maintenance has shown great potential to further reduce accidents. However, since enterprises are under pressure to cut costs [1, 3], decision makers hesitate to adopt the new maintenance strategy.

Fig. 1 A petroleum refinery plant fire accident

This paper uses system simulation to evaluate various maintenance strategies and compare their effectiveness. Simulation is well known for its "what-if analysis" capability, and is used here to provide performance assessments. In addition, statistical procedures are used to design experiments and compare maintenance strategies so that the results can be interpreted. The objective of the paper is to provide industries with a convenient way to envision the differences among maintenance strategies and select an appropriate one. We summarize maintenance strategies in Sect. 2. Section 3 presents how to build models for reliability simulation, using a blast furnace belt conveyor as an example. Section 4 demonstrates the experiment design and the analysis of simulated results using multiple comparison procedures. Conclusions are given in Sect. 5.

2 Maintenance Strategies

Because equipment failures can cause serious damage, equipment maintenance is a critical issue in almost all industries. Over the years, the mind-set of maintenance has evolved from reactive to planned/scheduled, and then to condition-based maintenance [4]. The four major maintenance strategies are summarized in the following:

2.1 Reactive or Corrective Maintenance [5]

Reactive maintenance takes a run-to-failure approach. No maintenance is scheduled and no condition assessment is conducted. Correction is done upon failure. The strategy is usually adopted for non-critical equipment, and the reliability can hardly be controlled.

2.2 Time-Based or Preventive Maintenance [3, 6]

This strategy pre-schedules periodic maintenance to perform repairs or replacements before failures occur. The maintenance is usually time-based, and its frequency is based on historical data. Since failures take place stochastically, time-based maintenance cannot prevent every failure; it can only prolong the time spans between failures [7]. That is, failures still occur under this strategy. In response, the maintenance interval tends to become shorter and shorter in order to prolong the time between failures. Hence, the cost of the strategy can be high.

2.3 Predictive Maintenance [6]

Many failures can be monitored and predicted with respect to certain parameters, which usually exhibit statistical patterns. Predictive maintenance monitors these parameters periodically to issue early warnings of degradation or failure. For instance, predictive maintenance can be based on bearing heat signature, lubricant condition, or rotating vibration. Since the parameters are only monitored periodically, failures can happen between monitoring periods, resulting in a dilemma similar to that of preventive maintenance.

2.4 Condition-Based or Just-In-Time Maintenance [3, 6]

Similar to predictive maintenance, condition-based maintenance (CBM) monitors the system conditions that are critical to failures. The monitoring is done continuously and in real time, so that equipment problems can be detected just in time [1]. This strategy is relatively new. Most of the equipment that could use condition-based maintenance is not instrumented with sensors, which limits the adoption of CBM.

The strategies above present different implementation difficulties [8]. Under reactive maintenance, reliability can hardly be controlled. Both preventive and predictive maintenance are limited in their ability to prevent failures. In contrast, CBM is effective in preventing failures, but its implementation is difficult and expensive. While facing reliability and safety requirements, enterprises are also increasingly pressured to reduce their costs. How to evaluate maintenance strategies while considering the tradeoff between reliability and cost has been a dilemma for enterprise decision makers [8]. In the next section, we present a scheme of reliability simulation for evaluating maintenance strategies.

3 Reliability Simulation

System simulation is a technique that abstracts the features of a dynamic system by experimenting on its process models. The models mimic system operations and statistical characteristics to reproduce system behavior as realistically as possible. Therefore, simulation is widely used as a tool to analyze what-if decisions. This section demonstrates how to use simulation to evaluate maintenance strategies. To begin with, we introduce a blast furnace belt conveyor for demonstration purposes.

3.1 Blast Furnace Belt Conveyor

A blast furnace is the equipment used to smelt ore into iron. In a blast furnace, raw materials such as ore and flux are continuously supplied through a belt conveyor. The conveyor is nearly 8,000 m long and about four or five stories high, as shown in Fig. 2. Its repair is extremely difficult, especially at the top above the furnace, where combustion and high temperature result in a hypoxic state. Once the conveyor fails to deliver materials, the furnace has to shut down. The resulting cost is as high as 10,000 NT dollars per minute, and the loss from supply shortage to downstream industries is immeasurable. The maintenance strategy of the belt conveyor is therefore a critical decision for the steel smelting industry. We take the belt conveyor as an example to demonstrate how to build its reliability simulation model, and how to experiment with the model to evaluate different maintenance strategies. Figure 3 presents the structure of a belt conveyor, which consists of a number of subsystems, each of which consists of a number of components.

Fig. 2 Belt conveyor in a blast furnace

Fig. 3 The hierarchical structure of a belt conveyor

3.2 Simulation Model and Input Factors

To build simulation models for the belt conveyor, we use SIMPROCESS [9], a hierarchical and integrated process simulator. The model has three hierarchical layers, as shown in Fig. 4, corresponding to the structure in Fig. 3: the main system layer, the subsystem layer, and the component layer. We describe each layer from the bottom up.

Fig. 4 The hierarchical simulation model of blast furnace belt conveyor

The bottom layer is the component layer, which models the life cycles of a component. A life cycle starts from a normal condition. After a random period of time, the component enters an initial damage state as a result of operational wear. This random period can be sampled from various probability distributions, with an average known as the MTBF (mean time between failures). This paper uses normal distributions to sample the random times. After a second random period of time, the component enters a failure state. This period models the time span from initial damage to breakdown if no preventive or predictive maintenance is performed. When a component fails, the cycle goes into a repair state. A third period of time is needed to fix the component; it models the interval consumed by materials preparation and repairs. Once the component is fixed, it returns to a normal condition and a new cycle starts.

In addition to the time perspective, the resources consumed in the process and their costs are also built into the model. The resources include raw materials and maintenance technicians. The crude simulation consists of iterated life cycles. Performance statistics are collected during the simulation and used to estimate the component reliability and associated costs. More importantly, different maintenance strategies can be embedded in the model to compare their effectiveness. Two specific strategies are constructed in the model, time-based maintenance and condition-based maintenance, which are described in the following.
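Before any maintenance is added, the crude life cycle above can be sketched in a few lines of Python. This is only an illustration, not the SIMPROCESS model: the function and parameter names, the coefficient of variation `cv`, the truncation of normal samples at a small positive value, and all numeric values are assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class CycleStats:
    failures: int = 0
    uptime: float = 0.0
    downtime: float = 0.0
    repair_cost: float = 0.0


def positive_normal(rng, mean, sd):
    """Sample a normal duration, truncated below at a small positive value."""
    return max(rng.gauss(mean, sd), 1e-6)


def simulate_run_to_failure(horizon, mean_to_damage, mean_damage_to_failure,
                            mean_repair, cost_per_repair, cv=0.1, seed=0):
    """Repeat normal -> initial damage -> failure -> repair until the horizon."""
    rng = random.Random(seed)
    stats = CycleStats()
    remaining = horizon
    while remaining > 0:
        # The component keeps running in the normal and initial-damage states.
        run = (positive_normal(rng, mean_to_damage, cv * mean_to_damage)
               + positive_normal(rng, mean_damage_to_failure,
                                 cv * mean_damage_to_failure))
        if run >= remaining:
            stats.uptime += remaining
            break
        stats.uptime += run
        remaining -= run
        # Failure: the component is down while it is being repaired.
        stats.failures += 1
        stats.repair_cost += cost_per_repair
        repair = min(positive_normal(rng, mean_repair, cv * mean_repair), remaining)
        stats.downtime += repair
        remaining -= repair
    return stats


# Hypothetical inputs: a 2-year horizon in hours and made-up means and cost.
stats = simulate_run_to_failure(2 * 365 * 24, 800.0, 200.0, 24.0, 100000.0)
print(stats.failures, round(stats.uptime / (2 * 365 * 24), 4))
```

Repeating the call with different `seed` values gives independent replications, from which failure counts, uptime percentage, and cost can be averaged.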

Time-based maintenance is built into the model as repeated processes running in parallel with the component life cycles. The two cycles interact in such a way that the maintenance affects the component life cycles: if a regularly scheduled maintenance encounters a component in the initial damage state, a repair is prescribed to restore the component to a normal condition, thus skipping the failure and repair stages. Since a regular maintenance may or may not detect early damage, an input parameter is provided to control the probability of successful detection. The maintenance frequency is also an input parameter, which can be optimized using an optimization module of SIMPROCESS.

In contrast, condition-based maintenance is built within the component life cycle. Since CBM monitors a component continuously, an initial damage can be detected quickly. A just-in-time replacement or repair can then restore the component to a normal condition and skip the failure stage. Since CBM may not detect every initial damage, an input parameter is also provided to control the detection probability.

Each component in the model can selectively use the CBM strategy in conjunction with regular time-based maintenance. Various statistics are summarized during the simulation to give decision makers multiple evaluation perspectives.
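The interaction between the two maintenance strategies and the life cycle can be sketched as follows. The sketch is a simplified assumption-laden illustration rather than the paper's model: inspections are placed on a fixed calendar grid, the durations of inspections and early fixes are ignored, and every name and number (`p_inspect`, `p_cbm`, the costs) is hypothetical.

```python
import math
import random


def simulate_component(horizon, mean_to_damage, mean_damage_to_failure,
                       inspect_interval=None, p_inspect=0.0, p_cbm=0.0,
                       cost_failure=100.0, cost_fix=20.0, cost_inspect=1.0,
                       repair_time=24.0, cv=0.1, seed=0):
    """Life cycle of one component under time-based and/or CBM maintenance."""
    rng = random.Random(seed)

    def draw(mean):
        # Normal sample truncated at a small positive value (an assumption).
        return max(rng.gauss(mean, cv * mean), 1e-6)

    counts = {"failures": 0, "early_fixes": 0, "inspections": 0}
    clock = 0.0
    while True:
        damage_at = clock + draw(mean_to_damage)
        if damage_at >= horizon:
            break
        fail_at = damage_at + draw(mean_damage_to_failure)
        # CBM: continuous monitoring may catch the damage as soon as it appears.
        if rng.random() < p_cbm:
            counts["early_fixes"] += 1
            clock = damage_at
            continue
        # Time-based: inspections on the calendar grid within [damage_at, fail_at).
        detected_at = None
        if inspect_interval:
            t = math.ceil(damage_at / inspect_interval) * inspect_interval
            while t < min(fail_at, horizon):
                if rng.random() < p_inspect:
                    detected_at = t
                    break
                t += inspect_interval
        if detected_at is not None:
            counts["early_fixes"] += 1   # failure and repair stages are skipped
            clock = detected_at
        elif fail_at < horizon:
            counts["failures"] += 1
            clock = fail_at + repair_time
        else:
            break
    if inspect_interval:
        counts["inspections"] = int(horizon // inspect_interval)
    cost = (counts["failures"] * cost_failure
            + counts["early_fixes"] * cost_fix
            + counts["inspections"] * cost_inspect)
    return counts, cost


horizon = 2 * 365 * 24  # 2 years in hours
print(simulate_component(horizon, 800.0, 200.0))                      # reactive
print(simulate_component(horizon, 800.0, 200.0, inspect_interval=168,
                         p_inspect=0.7))                              # time-based
print(simulate_component(horizon, 800.0, 200.0, p_cbm=0.95))          # CBM
```

Setting both `inspect_interval` and `p_cbm` for the same component mimics the mixed case in which CBM is used in conjunction with regular maintenance.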

The middle layer of the simulation model is the subsystem layer, as shown in the middle part of Fig. 4. This layer models the interactions among the components of a subsystem. For instance, there are five components in a pulley subsystem: head pulley, bend pulley, drive pulley, head snub pulley, and tail pulley. When a component fails, notices are sent to all other components and subsystems to hold and wait until the failing component is repaired. This interaction is implemented using preemption; that is, the cycles of all working components are preempted by a failing component. Similarly, once a failing component is fixed, notices are sent to resume the processes of all preempted components.

The top layer of the simulation model is the main system layer, as shown in the top part of Fig. 4. This layer models the interactions among subsystems. There are six subsystems in the model: a belt subsystem, a feeder subsystem, a roller subsystem, a pulley subsystem, a drive subsystem, and a weighing subsystem. The top layer can drill down to each of the six subsystems, and each subsystem model can drill down to its components. Preemption is again used to implement the interactions.
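The hold-and-resume behavior can be illustrated with a small sketch in which a failing component stops the whole line and the other components do not age while they are held. It is a minimal sketch under stated assumptions, not SIMPROCESS preemption itself: only run-to-failure is modeled, and the function name, the mean lives, and the repair times are hypothetical.

```python
import random


def simulate_series_system(horizon, mean_life, mean_repair, cv=0.1, seed=0):
    """mean_life and mean_repair are dicts keyed by component name (hours)."""
    rng = random.Random(seed)

    def draw(mean):
        return max(rng.gauss(mean, cv * mean), 1e-6)

    # Remaining operating time until each component fails.
    remaining_life = {name: draw(m) for name, m in mean_life.items()}
    failures = {name: 0 for name in mean_life}
    holds = {name: 0 for name in mean_life}
    clock = 0.0
    uptime = 0.0
    while True:
        failing = min(remaining_life, key=remaining_life.get)
        run = remaining_life[failing]
        if clock + run >= horizon:
            uptime += horizon - clock
            break
        # Everything runs until the first failure; all operating clocks advance.
        for name in remaining_life:
            remaining_life[name] -= run
        clock += run
        uptime += run
        # The failing component preempts the others, which hold without ageing.
        failures[failing] += 1
        for name in holds:
            if name != failing:
                holds[name] += 1
        clock += draw(mean_repair[failing])
        if clock >= horizon:
            break
        # Repair done: the held components resume and the fixed one starts anew.
        remaining_life[failing] = draw(mean_life[failing])
    return {"uptime_fraction": uptime / horizon,
            "failures": failures, "holds": holds}


# Hypothetical mean lives and repair times for the pulley subsystem (hours).
lives = {"head pulley": 4000.0, "bend pulley": 5000.0, "drive pulley": 3000.0,
         "head snub pulley": 6000.0, "tail pulley": 4500.0}
repairs = {name: 24.0 for name in lives}
print(simulate_series_system(2 * 365 * 24, lives, repairs))
```

The same pattern applies one level up: treating each subsystem as a unit that fails whenever one of its components fails gives the hold counts of the main system layer.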

All three layers work together to simulate the operation states under time-based or condition-based maintenance strategies. The input factors of the model are imported from an external file for flexibility. There are 12 factors for each component, as listed in the following:

  1. mean time from normal state to initial damage
  2. mean time from initial damage to failure
  3. regular maintenance time period
  4. mean time of a regular maintenance
  5. mean time of a maintenance if an initial damage is detected
  6. probability that an initial damage can be detected in a regular maintenance
  7. mean time of a failure repair
  8. cost of a failure repair
  9. cost of an initial damage fix
  10. cost of a regular maintenance
  11. probability that an initial damage can be detected by a CBM sensor
  12. average lead time to repair a failure
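One way to hold the 12 factors in code is a small record type loaded from a delimited file. The source only states that the factors come from an external file, so the CSV layout, the `component` column, the field names, and the sample values below are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass, fields


@dataclass
class ComponentFactors:
    # Field order follows the 12 factors listed above; names are hypothetical.
    mean_time_to_damage: float
    mean_damage_to_failure: float
    maintenance_period: float
    mean_maintenance_time: float
    mean_damage_fix_time: float
    p_detect_regular: float
    mean_failure_repair_time: float
    cost_failure_repair: float
    cost_damage_fix: float
    cost_regular_maintenance: float
    p_detect_cbm: float
    repair_lead_time: float

    def __post_init__(self):
        # The two detection factors are probabilities and must lie in [0, 1].
        for name in ("p_detect_regular", "p_detect_cbm"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must be a probability in [0, 1]")


def load_factors(text_stream):
    """Read one row per component; return {component name: ComponentFactors}."""
    result = {}
    for row in csv.DictReader(text_stream):
        values = {f.name: float(row[f.name]) for f in fields(ComponentFactors)}
        result[row["component"]] = ComponentFactors(**values)
    return result


# A made-up one-row file, built in memory so the example is self-contained.
header = "component," + ",".join(f.name for f in fields(ComponentFactors))
sample = header + "\nhead pulley,800,200,168,2,6,0.7,24,100000,20000,1000,0.95,48\n"
factors = load_factors(io.StringIO(sample))
print(factors["head pulley"])
```

Keeping the factors outside the model in this way lets the decision variables (for example the maintenance period or the use of a CBM sensor) be changed between runs without touching the simulation logic.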

Some of the factors are basic data, while the others can be treated as decision variables and optimized. In the next section, we introduce the output of the simulation.

3.3 Key Performance Output

The simulation model is run with different combinations of input factors to collect performance statistics. In each replication, the run length is 2 years. Different performance measures are collected for each layer of the model, from the four perspectives shown in Fig. 5. The first perspective is count statistics, such as failure counts, hold counts, and maintenance counts. The second is cost statistics, such as failure cost, regular maintenance cost, and initial damage maintenance cost. The third is productivity statistics, such as uptime percentage and average uptime. The fourth is resource utilization, such as the utilization of maintenance engineers and repair technicians. Each layer of the model collects the statistics associated with that layer, so that the performance can be evaluated from both micro and overall angles. In this way, the simulation provides decision makers with quantitative statistics for comparing the performance of different strategies.

Fig. 5 Key performance output

4 Experiment Design and Multiple Comparisons

The purpose of the simulation is to evaluate maintenance strategies. To obtain meaningful interpretations, statistical procedures need to be used in the experiment design and the analysis of results. We construct six maintenance strategies, listed in Table 1, to demonstrate the comparisons. Dummy data are used, since we only intend to demonstrate the evaluation procedure. Strategy A is purely time-based maintenance; only periodic maintenance is scheduled for each component and subsystem. The other five strategies are various mixtures of time-based and condition-based maintenance and differ in the number of components using CBM. Strategy B has only one CBM component, while strategy F has five CBM components. The strategies are compared in terms of cost and productivity.

Table 1 Maintenance strategies for comparisons

Because of the random nature of simulation, each strategy is run for 30 independent replications to obtain random samples. A conventional one-way model is assumed for the simulated results:

$$ {Y}_{ij}={\theta}_i+{\varepsilon}_{ij},i=1,2,\dots, 6,j=1,2,\dots, 30 $$

where $Y_{ij}$ is the $j$-th simulated performance of strategy $i$; $\theta_i$ is the expected performance of strategy $i$, which can be any performance measure listed in Fig. 5; and $\varepsilon_{ij}$ is random noise with expected value 0 and a common variance. Experiments are conducted according to this design. Part of the numerical results, together with bar charts for different performance measures, is presented in Fig. 6.
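Under the one-way model, each expected performance is naturally estimated by the sample mean of its 30 replications, and the common variance by the pooled sample variance with 6 × 29 = 174 degrees of freedom. The sketch below illustrates this on synthetic data; the "true" means, the noise level, and the function names are hypothetical and are not taken from the paper's experiments.

```python
import random
import statistics


def simulate_design(true_means, replications=30, noise_sd=1.0, seed=0):
    """Generate Y_ij = theta_i + eps_ij with normal noise (dummy data)."""
    rng = random.Random(seed)
    return {name: [theta + rng.gauss(0.0, noise_sd) for _ in range(replications)]
            for name, theta in true_means.items()}


def one_way_estimates(samples):
    """Return per-strategy sample means, the pooled variance, and its dof."""
    means = {name: statistics.fmean(ys) for name, ys in samples.items()}
    dof = sum(len(ys) - 1 for ys in samples.values())
    pooled_var = sum((len(ys) - 1) * statistics.variance(ys)
                     for ys in samples.values()) / dof
    return means, pooled_var, dof


# Made-up expected costs for strategies A to F.
true_means = {"A": 100.0, "B": 90.0, "C": 80.0, "D": 70.0, "E": 60.0, "F": 50.0}
samples = simulate_design(true_means)
means, pooled_var, dof = one_way_estimates(samples)
print({k: round(v, 2) for k, v in means.items()}, round(pooled_var, 3), dof)
```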

Fig. 6 Example simulation result

To analyze whether there are significant differences among the six strategies, three multiple comparison procedures are used: all-pairwise comparisons, multiple comparisons with the best, and multiple comparisons with a control, which are described in the following.

All-pairwise comparison (MCA) [10] constructs simultaneous confidence intervals for $\theta_i - \theta_j$, $\forall i \neq j$, with the overall error rate controlled. Multiple comparisons with the best (MCB) [11] constructs simultaneous confidence intervals for $\theta_i - \theta^*$, $\forall i$, where $\theta^*$ refers to the performance of the best but unknown strategy. Multiple comparisons with a control (MCC) [12] constructs simultaneous confidence intervals for $\theta_i - \theta_A$, $\forall i \neq A$, with strategy A as the control. Figures 7, 8, and 9 show the example results, respectively.
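To give a feel for what simultaneous intervals look like, the sketch below builds all-pairwise and versus-control intervals with a Bonferroni adjustment and a normal approximation to the critical value. This is a deliberately simpler, more conservative stand-in and not the MCA, MCB, or MCC procedures of [10–12], which use their own critical values; MCB is left out because its critical values are not available in the Python standard library. The data and function names are hypothetical.

```python
import math
import statistics
from itertools import combinations


def _summary(samples):
    """Sample means and pooled variance under the one-way model."""
    means = {k: statistics.fmean(v) for k, v in samples.items()}
    dof = sum(len(v) - 1 for v in samples.values())
    pooled_var = sum((len(v) - 1) * statistics.variance(v)
                     for v in samples.values()) / dof
    return means, pooled_var


def bonferroni_intervals(samples, pairs, alpha=0.05):
    """Simultaneous intervals for theta_a - theta_b over the given pairs."""
    means, pooled_var = _summary(samples)
    # Bonferroni: split alpha over the pairs; normal quantile as an approximation.
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / (2 * len(pairs)))
    out = {}
    for a, b in pairs:
        half = z * math.sqrt(pooled_var * (1 / len(samples[a]) + 1 / len(samples[b])))
        diff = means[a] - means[b]
        out[(a, b)] = (diff - half, diff + half)
    return out


def all_pairwise(samples, alpha=0.05):
    return bonferroni_intervals(samples, list(combinations(samples, 2)), alpha)


def versus_control(samples, control, alpha=0.05):
    pairs = [(name, control) for name in samples if name != control]
    return bonferroni_intervals(samples, pairs, alpha)


# Made-up cost samples: 30 replications per strategy with a small fixed spread.
base = {"A": 100.0, "B": 90.0, "C": 80.0, "D": 70.0, "E": 60.0, "F": 50.0}
samples = {k: [m + (j % 3 - 1) for j in range(30)] for k, m in base.items()}
for pair, (lo, hi) in versus_control(samples, "A").items():
    print(pair, round(lo, 2), round(hi, 2))
```

An interval for a cost difference that lies entirely below zero indicates that the first strategy of the pair is significantly cheaper than the second at the chosen overall error rate.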

Fig. 7 MCA example results

Fig. 8 MCB example results

Fig. 9 MCC example results

By examining the confidence intervals, decision makers can identify either a unique best strategy or a group of best strategies among the six. For instance, MCA constructs $\binom{6}{2} = 15$ pairwise intervals, as shown in Fig. 7. Strategy F is clearly the best in that the cost difference between strategy F and every other strategy has a negative upper confidence limit, implying that the cost of strategy F is significantly lower. MCB constructs six confidence intervals, as shown in Fig. 8. Strategy F can be identified as the best in that the cost difference between strategy F and the best strategy has an upper limit of 0. MCC constructs five confidence intervals, as shown in Fig. 9. Clearly, the cost of each of the five strategies that use CBM is significantly lower than that of purely time-based maintenance. Decision makers can conveniently select a strategy based on these statistics.

5 Conclusion

Equipment maintenance is a critical issue for manufacturing industries, yet it is difficult for industries to choose an appropriate maintenance strategy. This paper presents a scheme for building reliability simulation models to compare maintenance strategies from cost and productivity perspectives. A blast furnace belt conveyor is used as a demonstration example. In addition, experiment design and multiple comparison procedures are used to provide statistical analysis. It is expected that this research can help industries evaluate and select an appropriate maintenance strategy to reduce future equipment failures and the associated losses.