A review of diagnostic and prognostic capabilities and best practices for manufacturing

Vogl, Gregory W.; Weiss, Brian A.; Helu, Moneer

doi:10.1007/s10845-016-1228-8

Published: 09 June 2016

Volume 30, pages 79–95, (2019)

Journal of Intelligent Manufacturing

Gregory W. Vogl¹,
Brian A. Weiss¹ &
Moneer Helu¹

Abstract

Prognostics and health management (PHM) technologies reduce time and costs for maintenance of products or processes through efficient and cost-effective diagnostic and prognostic activities. PHM systems use real-time and historical state information of subsystems and components to provide actionable information, enabling intelligent decision-making for improved performance, safety, reliability, and maintainability. However, PHM is still an emerging field, and much of the published work has been either too exploratory or too limited in scope. Future smart manufacturing systems will require PHM capabilities that overcome current challenges, while meeting future needs based on best practices, for implementation of diagnostics and prognostics. This paper reviews the challenges, needs, methods, and best practices for PHM within manufacturing systems. This includes PHM system development of numerous areas highlighted by diagnostics, prognostics, dependability analysis, data management, and business. Based on current capabilities, PHM systems are shown to benefit from open-system architectures, cost-benefit analyses, method verification and validation, and standards.

Introduction

The future of manufacturing is full of possibilities to utilize real-time and historical data to comprehensively manage maintenance, in order to decrease product lifecycle costs while increasing system availability. Currently, U.S. manufacturers spend more than $7B per year recalling and renewing over 2000 defective products, and the associated costs are only increasing (Venkatasubramanian 2005). Besides increasing costs of maintenance, manufacturing systems can also become more complicated to manage due to the increasing breadth of system information. A typical manufacturing system yields a vast amount of data produced by thousands of sensors that can record position, velocity, flow, temperature, and other physical quantities multiple times every minute (Moyne and Tilbury 2007). Non-physical information related to part specifications, parts ordering, and maintenance schedules for each machine also feeds the information stream. In addition to frequently collected and/or shared data, large amounts of diagnostic data, such as spindle current data collected at 1 kHz, are also sent infrequently over manufacturing networks and used for high-level control, such as tool replacement (Moyne and Tilbury 2007). Manufacturing processes are becoming more complex and dynamic, so ensuring the reliability of such systems is likewise becoming more challenging (Lee et al. 2011).

The goal of maintenance is to preserve system and product functions throughout their lifecycles. Most product maintenance is either completely reactive or blindly preventative (Djurdjanovic et al. 2003). The oldest maintenance strategy is to “fix it when it breaks” (reactive maintenance), which has problems including unscheduled downtime, possible serious safety violations, and potentially significant damage to manufacturing equipment and the products being fabricated or assembled. The next natural step is to monitor and maintain a system in pre-established time intervals (preventative maintenance), which tends to be cost prohibitive (Kothamasu et al. 2006). The development of reliability engineering in the 1950s led to the introduction of time-based maintenance (TBM) based on the increase of failure with time (Takata et al. 2004). Then, the development of machine diagnostic techniques in the 1970s led to the concept of condition-based maintenance (CBM), in which preventive action is based upon detected symptoms of failures.

The only way to minimize the probability of failure, downtime, and maintenance costs is with CBM, motivating the use of prognostics (Kothamasu et al. 2006). The cost of unplanned downtime can be up to $250K per hour in the process industry, so CBM can enhance profitability by eliminating unpredicted failures (Koochaki et al. 2011). Yet, sometimes neither TBM nor CBM is the optimal maintenance strategy; allowing an element to break down may be the best option (Takata et al. 2004).

Figure 1 outlines the historical maintenance paradigms along with the evolution of production paradigms summarized by Jovane et al. (2003). Considerable overlap occurs between the two paradigm evolutions because of their interconnectedness; maintenance is a critical part of production. The revenue-based motivations for maintenance improvements are linked to the consumer-driven motivations for production, with all paradigms enabled via technological advancements. As seen in Fig. 1, a possible next step in the evolution of maintenance may be “self-maintenance” or “proactive maintenance”, in which systems monitor themselves, being driven by the fast-paced and hyper-flexible production of the future.

Prognostics and health management (PHM)

Production systems must be easily upgradeable so new technologies can be integrated to meet highly dynamic market demands (Pereira and Carro 2007). Traditional manufacturing should be reexamined to meet the current and future needs for efficient and reconfigurable production. One enabler of this vision is the use of real-time state information of subsystems and components to support maintenance decisions within manufacturing systems. In general, such a vision is the core of prognostics and health management (PHM). The goal of PHM technology is to provide decision support; that is, actionable information to support decision making (Kalgren et al. 2007).

PHM goes beyond CBM because correct predictions of the future may allow one to avoid failure and large disturbances (Lee et al. 2011). PHM incorporates aspects of logistics, safety, reliability, mission criticality, and economic viability among others (Saxena et al. 2010). PHM of components or systems involves both diagnostics and prognostics: Diagnostics is the process of detection and isolation of faults or failures, while prognostics is the process of prediction of the future state or remaining useful life (RUL) based on current or historic conditions (Ly et al. 2009). Prognostics is based on the understanding that equipment fails after a period of degradation, which if measured, can be used to prevent system breakdown and minimize operation costs (Tian et al. 2012). Essentially, PHM is a methodology for the evaluation of the reliability of a system in order to predict and mitigate failures (Sun et al. 2010). Prognostics also enables the reduction of the lead time for procurement and planning for maintenance while furthering the possibility of autonomic logistics (Banks and Merenich 2007). Improvements in maintenance efficiency from system-wide PHM could reduce maintenance-related labor costs by more than 10 % compared to costs for reactive maintenance (Barajas and Srinivasa 2008).

Figure 2 shows a flowchart of the general process of PHM system development. The process begins with cost and dependability analyses to determine the components to monitor. The data management system is then initialized for collection, processing, visualization, and archiving of the maintenance data. Once the measurement techniques are established, the diagnostic and prognostic approaches are developed and tested to ensure that the desired goals are achieved. Finally, personnel are trained during the iterative process of system validation and verification before final system deployment.

Consequently, PHM has emerged as a key enabler for efficient system-level maintenance and lower life-cycle costs (Ly et al. 2009). Towards this end, this paper discusses the challenges, needs, methods, system examples, frameworks, and best practices for PHM within manufacturing systems. The key aspects of the PHM development process for both products and processes are touched upon; the main focus of this discussion is on diagnostics and prognostics. Conclusions are then drawn to aid the growth and effectiveness of PHM within manufacturing systems.

Current PHM challenges and future needs

An important step to improve PHM is to understand the challenges and needs related to the process in Fig. 2, as discussed in this section.

Diagnostics

Sensors enable the simple collection of data, but these devices must still provide the right information at the right time for fault detection and avoidance. In general, a fault is defined as the departure of an observed variable from an acceptable range, and the fundamental cause of this abnormality is called the basic event or the root cause (Venkatasubramanian 2005). Diagnosis of faults requires system (or component) usage information and a diagnostic search strategy that can match the observed symptoms and the known set of possible failures. Furthermore, robust diagnostics are needed for incipient fault detection so that plant breakdowns are avoided.
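
The fault definition above (an observed variable leaving its acceptable range) can be illustrated with a minimal Python sketch. The variable names, limits, and readings are hypothetical and chosen only for illustration; a real diagnostic system would also need the search strategy that maps such symptoms to root causes.

```python
from dataclasses import dataclass


@dataclass
class Limit:
    """Acceptable range for one observed variable (hypothetical values)."""
    low: float
    high: float


def detect_faults(readings, limits):
    """Return the names of variables whose reading lies outside its range."""
    return sorted(
        name
        for name, value in readings.items()
        if not (limits[name].low <= value <= limits[name].high)
    )


# Hypothetical limits and one snapshot of sensor readings.
limits = {
    "spindle_temp_C": Limit(10.0, 70.0),
    "vibration_mm_s": Limit(0.0, 4.5),
}
readings = {"spindle_temp_C": 82.3, "vibration_mm_s": 2.1}

faults = detect_faults(readings, limits)
print(faults)
```

Such a check only flags the symptom; noise and operating-condition dependence mean that a fixed range is rarely sufficient for incipient faults.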

The ability to diagnose component faults in their infancy is currently limited, due in part to high sensitivity to signal noise, dependence on environmental and operating conditions, lack of fault detection (Patrick et al. 2009), and uncertainties in maintenance schedules. Consequently, most manufacturing operations are reactive, so maintenance diagnosis is mainly a specialized process, as in the automotive manufacturing industry (Barajas and Srinivasa 2008).

Diagnostics are vital for successful prognostics because an acceptable prognostic method starts with robust diagnostics, since the uncertainties of the estimated system condition affect any future prediction (Hess et al. 2005; Patrick et al. 2009). Diagnostic challenges exist due to problems with verification and validation. Examples of diagnostic failures include the F/A-18C and V-22 (two military aircraft), which had false alarm rates of about 90 % (Shannon and Knecht 2010). The reason for such high false alarm rates is one major challenge for PHM systems: systems may be complex and simple models are inadequate. Consequently, diagnostic and prognostic methods must be capable of dealing with uncertainties. If left unaddressed, these uncertainties can lead to high false alarm rates, inaccurate predictions, and hence incorrect decisions (Hess et al. 2005).

Prognostics

Prognostics is even more challenging than diagnostics, which is one main reason why prognosis is an underdeveloped element of PHM systems (Patrick et al. 2009). Some failures are intermittent and hence difficult to predict (Sun et al. 2010). Hence, there is still no universally accepted methodology for prognostics (Lee et al. 2011). Despite being a very challenging part of PHM, prognostics is also one of its most beneficial aspects (Hess 2002).

Prognostics is still an emerging field, and much of the published work has been exploratory in nature. Current prognostics technology is considered to be immature due to the lack of uncertainty calculations, validation and verification methods, and risk assessment for PHM system development (Saxena et al. 2010). The lack of standards is due in part to varied end-user requirements, time scales, available information, and system dynamics (Saxena et al. 2010). To help remedy this situation, Saxena et al. (2010) presented new evaluation metrics for the evaluation of prognostic algorithms. Methods are needed to quantify the accuracy of prognostic assessment technologies (Sun et al. 2010). Design tools are also needed to aid the selection of approaches for monitoring mechanical or electrical systems (United States Department of Defense 2008).

As previously stated, prognostics is essentially a condition-based estimation of RUL to make better informed maintenance decisions. The RUL is a prediction of the time or cycles before the functioning of a product or process reaches an unacceptable threshold. Without a corresponding measure of uncertainty, the RUL has little value (Engel et al. 2000; Sandborn and Wilkinson 2007). Hence, prognosis is the “recognized Achilles’ heel” of PHM (Ly et al. 2009). In fact, few PHM methods produce continuous real-time estimation of the RUL (Patrick et al. 2009) and improved methodologies are needed for RUL prediction based on physical and other measurements (United States Department of Defense 2008). For manufacturing and other systems, being complex systems with perhaps thousands of sub-systems and various operational conditions, predicting the reliability and performance is even more difficult (Lee et al. 2011). A large process plant may track as many as 1500 process variables that can be recorded every second, leading to information overload.

Because the prediction of an unknown future naturally involves uncertainty, prognostics must be treated as a probabilistic process in which the predicted RUL is represented by a probability density function (PDF). This PDF is then used to inform a maintainer based on the desired lead time for maintenance operations. However, tools are needed for PHM designers to know how PHM systems impact the total logistic system (Hess et al. 2005); the U.S. Department of Defense (DoD) demands more integrated diagnostic and prognostic capabilities to support maintenance and logistic decisions (Kalgren et al. 2007).
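
As a rough illustration of treating the RUL as a distribution rather than a single number, the following Python sketch represents the RUL PDF by random samples and asks two questions a maintainer might pose: how likely is failure within the desired lead time, and what RUL can be relied on at a given risk level? The lognormal shape, its parameters, and the 72-hour lead time are assumptions made only for this example, not values from the cited works.

```python
import random
import statistics


def prob_failure_within(rul_samples, lead_time):
    """Fraction of sampled RUL values that fall at or below the lead time."""
    return sum(1 for r in rul_samples if r <= lead_time) / len(rul_samples)


def conservative_rul(rul_samples, risk=0.05):
    """Empirical RUL quantile: roughly a fraction `risk` of samples lie below it."""
    ordered = sorted(rul_samples)
    index = min(int(risk * len(ordered)), len(ordered) - 1)
    return ordered[index]


# Hypothetical RUL distribution in operating hours (assumed lognormal).
rng = random.Random(0)
rul_samples = [rng.lognormvariate(5.0, 0.4) for _ in range(10000)]

lead_time = 72.0  # assumed time needed to plan and procure for maintenance
p_fail = prob_failure_within(rul_samples, lead_time)

print("median RUL (h):", round(statistics.median(rul_samples), 1))
print("P(failure within lead time):", round(p_fail, 3))
print("RUL at 5 % risk (h):", round(conservative_rul(rul_samples), 1))
```

A point estimate such as the median hides the spread; the lead-time probability and the low quantile are what make the estimate actionable.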

Perhaps the main challenge for prognostics is that there will always be a limit to the accuracy and precision of condition-based estimation of RUL due to the inherent uncertainty of predicting the future. Failure mechanisms have a certain amount of physical randomness, which adds to the inherent error in the prognostics process due to imperfections of sensor data, preprocessing, and feature extraction and failure prognostic methods (Engel et al. 2000). In fact, prognostics may not be feasible due to the highly unpredictable nature of a failure mode (Roemer et al. 2001). Multiple failures may also complicate the RUL prediction (Engel et al. 2000).

Due to the challenges of prognostics within systems, the greatest need for maintenance is for there to be ‘no surprises.’ As long as there is adequate foresight to potential problems, those problems can be mitigated with some planning (Hess et al. 2004).

Components and PHM architecture

Of course, diagnostics and prognostics would be impossible without data from sources such as sensors and programmable logic controllers (PLCs) of machine tools and robotic elements. Data is essential for PHM systems: acquisition and communication of consistent, clean, and reliable data commonly entails about 90 % of PHM system development (Barajas and Srinivasa 2008). In fact, major challenges for the development of the F-35 (a military fighter jet), also known as the Joint Strike Fighter (JSF), were due largely to “limited capability in the software when delivered” and the “need to fix problems and retest multiple software versions” (Charette 2014). The lack of consistent data, communication, and security is a bottleneck for the full realization of PHM across plant-floor operations (Barajas and Srinivasa 2008).

Many manufacturing systems are controlled by PLCs, which are, by nature, discrete-event systems characterized by event-driven inputs and outputs. A critical need for these applications in manufacturing is the integration of PLC information and PHM capabilities (Wu and Hsieh 2012). Most modern manufacturing machines are ‘smart’ in the sense that they have sensors and computerized components that create much data, but most data is not used due in part to limited access or knowledge for usage (Djurdjanovic et al. 2003). For electronics, most outcomes are binary (e.g., pass/fail) that can be used for detection of incipient faults and prediction of failure (Kalgren et al. 2007).

Of particular concern is that real systems often have inconsistent fault messages, making automated fault diagnostics ineffective (Wu and Hsieh 2012). In the late 1970s, “Cannot Duplicate” (CND) problems with electronics could account for more than 85 % of failures in avionics and, consequently, more than 90 % of all maintenance costs. Yet PHM can help with addressing CND and “Retest OK” (RTOK) problems through prognostic models (Sun et al. 2010). In the auto industry, many product malfunctions are due to unanticipated interactions from repeated use, or misuse, of components (Venkatasubramanian 2005).

A significant motivation exists to develop PHM systems to deal with abnormal events in complex systems, considered by some product industry members to be the next major challenge in control systems research (Venkatasubramanian 2005). This challenge for future PHM systems requires the prediction of system-wide functional failures instead of just isolated component-level failures (Roemer et al. 2001). The need exists for manufacturing data to be systematically integrated, managed, and analyzed during the entire life cycle for increased availability within the manufacturing industry. Manufacturing interoperability standards, such as MTConnect (MTConnect Institute 2015), can help record PLC signals for such data management purposes (Lee et al. 2013).

In general, current limitations of PHM do not appear to include sensor type; sensors usually exist to measure needed physical states. For example, General Motors successfully applied diagnostics at their manufacturing plants through the use of motor monitoring, high speed video, infrared thermography, laser alignment, lubrication and oil analysis, ultrasound, and vibration spectrum analysis (Barajas and Srinivasa 2008). Nonetheless, challenges exist for electronic components of aircraft, automobiles, and other products as well as manufacturing systems. PHM has not been traditionally applied to electronic systems as it has to mechanical systems because the time to failure (TTF) for electronic systems was assumed to be non-quantifiable or longer than the needed period for system usage (Sandborn and Wilkinson 2007). Degradation in electronics is more difficult to detect due to its scale and complexity (Janasak and Beshears 2007). In fact, electronic prognostics “is still in its infancy” due to component complexity and limited knowledge about failure precursors (Sun et al. 2010). Fundamental research into incipient fault detection, physics of failure modeling, and fault to failure progression for electronics is yielding results for the DoD (Kalgren et al. 2007). Nonetheless, PHM needs for electronics include improved integration of sensing and processing modules for in situ PHM through the use of built-in tests, fuses and canaries, monitoring and reasoning of failure precursors, the correlation of in situ loads with physics-based stress and damage models (Vichare and Pecht 2006), and verification and validation (V&V); that is, tests and measurements to prove the performance of prognostics systems (Kalgren et al. 2007).

Table 1 PHM challenges

Business level

In a manufacturing plant, preventive diagnostics is common and occurs when at least one key performance indicator (KPI) degrades below an acceptable threshold. However, preventative maintenance is estimated to be applied unnecessarily up to 50 % of the time in manufacturing. In fact, in automotive manufacturing, yearly maintenance costs are on the order of several billion dollars, primarily due to labor costs (Barajas and Srinivasa 2008). PHM could help reduce these costs by using historical and real-time data to provide decision support before, during, or after KPI degradation.

One business-related challenge for PHM is that maintenance is usually regarded as a net cost, not a net benefit (Takata et al. 2004). One reason for this is the difficulty of quantifying the cost savings due to PHM, that is, the return on investment (ROI) (Sun et al. 2010). Hence, even though PHM has the potential for creating a paradigm shift in industries like manufacturing, little consideration has been given to PHM as a significant enabler for business (Grubic et al. 2009).

The PHM customer is particularly concerned with the return on investment (ROI) for instituting a PHM system (Banks and Merenich 2007). However, the typical engineer is not trained to address this concern (Banks and Merenich 2007). Nonetheless, an ROI analysis can be used to optimize maintenance, in order to choose between prognostic or more traditional maintenance approaches (Feldman et al. 2008). Veldman et al. (2011) noted that CBM decision-making can be more effective with an increase in the use of failure data, for algorithm optimization, and an improvement in the relationship between controls engineering and maintenance engineering.

Human factors

In addition to the technical and business challenges related to PHM, the human element is also a challenge. Organizational barriers include resistance due to culture, norms, expertise, and customer and supplier relationships (Grubic et al. 2009). Similarly, the DoD stated that a main challenge for PHM relates to “resistance that is often found in an organization” (United States Department of Defense 2008). Such resistance can stem from an employee’s lack of context and direction and emotional reaction (United States Department of Defense 2008).

Another important, but less addressed, challenge is the creation of user-friendly PHM applications (Ahmad and Kamaruddin 2012). PHM monitoring systems tend to be complex in volume and substance, easily overwhelming the user and leading to mistrust in the PHM application in the event of a false alarm or missed hit (Kothamasu et al. 2006). PHM systems are created and/or trained by humans and are hence imperfect; humans may be aware of important information related to system degradation that is invisible to the PHM system.

Consequently, another challenge is how to incorporate the vast knowledge of industrial faults into PHM systems because such knowledge is not necessarily consistent or precise with respect to uncertainties (Wu and Hsieh 2012). Future PHM systems should be developed with the ability to incorporate other sources of data or information, such as subjective information from personnel with sufficient expertise. Technicians and engineers learn from previous mistakes during maintenance and can detect abnormal conditions of machines by sense. This knowledge is a valuable asset to the company and therefore should be leveraged (Ahmad and Kamaruddin 2012). Data is useless unless it is processed and understood with context by the right personnel (Lee et al. 2013).

Summary of challenges

Table 1 summarizes several of the key challenges to be overcome to enable the future of PHM within smart manufacturing systems. As seen in Table 1, challenges include real-time diagnostic and prognostic methods, standards for PHM system evaluation, and the integration of data (from sensors, PLC, experts, etc.) within user-friendly PHM systems.

PHM methods

PHM research has focused on the analysis of sensor data for fault diagnosis and failure prognosis, the establishment of condition metrics, seeded fault testing, and incipient failure detection (Ly et al. 2009). Reviews exist containing details of such methods for diagnostics (Kothamasu et al. 2006), machine prognostics (Peng et al. 2010), and data-driven prognostics (Schwabacher 2005). This paper does not seek to elaborate upon such technical details, but rather seeks to focus upon the capabilities and best practices of PHM for manufacturing.

Diagnostic and prognostic methods

PHM approaches based on experience, physics or models, statistics, or data all have pros and cons. Experience-based PHM uses human expertise for analysis and is the least complex, but remains highly labor intensive and expensive (Barajas and Srinivasa 2008). Data-driven approaches create non-linear relationships between inputs and outputs without physical models, but are not necessarily convergent. One advantage of data-driven methods is that they can be applied at any level: system, subsystem, or component level (Sun et al. 2010). Prognostics algorithms that use a data-driven approach learn models directly from the data, rather than use hand-built models based on human expertise (Schwabacher 2005). Data-driven methods are based on machine learning and statistical pattern recognition (Sun et al. 2010). Machine-learning techniques include artificial neural networks (ANNs), fuzzy logic, support vector machine (SVM), and hidden Markov models (Sun et al. 2010), while statistical techniques are based on parametric or non-parametric methods. However, statistics-based approaches (reliability-centered maintenance, Bayesian, etc.) generally ignore correlations among various data (Barajas and Srinivasa 2008). Data-driven methods often yield a fault “model” based on neural networks, expert systems, etc. that must be trained with data representing anticipated faults, which may be difficult to validate.

Physics-based prognostics is the most comprehensive modeling approach, utilizing various inputs, and can yield an RUL distribution as a function of component or usage uncertainties (Roemer et al. 2001). However, physics- or model-based approaches (involving the solution of ordinary differential equations) usually do not account for the combination of analog and discrete processes (Barajas and Srinivasa 2008). Similarly, physics-of-failure-based (PoF-based) methods combine actual operational conditions with PoF models to calculate the accumulated damage as well as predict the RUL of the product (Sun et al. 2010). The advantage of PoF-based methods is the ability to isolate the root cause and failure mechanisms. However, skilled personnel require sufficient information about a product (e.g., operational conditions) and its failure mechanisms to apply such methods. Another disadvantage is that PoF-based methods are not well suited to the system and subsystem levels (Sun et al. 2010).

Hybrid/fusion methods

Different methods should be used for diagnostics or prognostics based on their effectiveness. One novel prognostics approach is to fuse various methods into a composite solution. Fusion methods, such as Bayesian or “best of breed” methods, may be effective in many cases to yield reliable just-in-time RUL predictions (Engel et al. 2000). Venkatasubramanian (2005) concluded that no single method is adequate to handle all the requirements for a desirable diagnostic system. Consequently, a framework was developed called Dkit, in which various diagnostic methods analyze the same problem and a scheduler uses these results in a hybrid fashion for better decision-making (Venkatasubramanian 2005).

Other hybrid or fusion methods exist for leveraging the various diagnostics and prognostics methods. Ly et al. (2009) utilized a hybrid method that combines both physics-based and data-driven techniques for prognosis. In that instance, principal component analysis (PCA) was used to fuse multiple features to create a single condition metric for system health. Model-based reasoning (MBR) can also be used to combine advanced diagnostic methods with prognostic analyses. MBR algorithms can analyze multiple failure conditions while differentiating between normal and detrimental changes in system condition, thus enabling more robust prognostics (Engel et al. 2000).
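
The idea of fusing several features into one condition metric with PCA can be sketched in plain Python as follows. This is not the implementation of Ly et al. (2009); it is a minimal illustration that standardizes the features and projects them onto the first principal component, found here by power iteration. The three features and their values are hypothetical.

```python
import math


def standardize(rows):
    """Scale each feature column to zero mean and unit sample variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) or 1.0
        for j in range(d)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def first_principal_component(z, iters=200):
    """Dominant eigenvector of the sample covariance matrix via power iteration."""
    n, d = len(z), len(z[0])
    cov = [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(d)] for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def condition_metric(rows):
    """Project standardized features onto the first principal component."""
    z = standardize(rows)
    weights = first_principal_component(z)
    scores = [sum(r[j] * weights[j] for j in range(len(weights))) for r in z]
    return scores, weights


# Hypothetical features per inspection: vibration RMS, temperature, motor current.
features = [
    [0.10, 40.0, 5.0],
    [0.12, 41.0, 5.1],
    [0.15, 43.0, 5.3],
    [0.21, 46.0, 5.6],
    [0.30, 50.0, 6.0],
    [0.45, 57.0, 6.7],
]
metric, weights = condition_metric(features)
print([round(m, 2) for m in metric])
```

When the features degrade together, as in this made-up data, the single fused metric rises with them and can be trended against a threshold instead of tracking each feature separately.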

Cost and dependability methods

Other methods help PHM developers to determine the optimum strategy based on costs, benefits, and risks. The common method for justifying PHM is usually based on reliability centered maintenance, which is based on the failure modes and risks, the technical feasibility of incipient failure detection, and the financial justification of PHM (Koochaki et al. 2011). Failure mode, effects, and criticality analysis (FMECA), fault tree analysis (FTA), and other dependability methods can be used to assess the safety, availability, and other metrics for PHM justification. Based on a high-level analysis for the automotive industry, Barajas and Srinivasa (2008) stated that the best return on investment (ROI) is achieved through predictive maintenance, in contrast to reactive or preventive maintenance.

The potential cost savings for prognostic algorithms are usually not easy to obtain. Consequently, Drummond and Yang (2008) developed a simple method to determine the range of costs (false alarms and missed failures) and failure rates over which a prognostic algorithm would be useful. Thus, even without exact cost estimates, PHM implementers would have confidence in the use of a prognostic algorithm. Many critical inputs are uncertain, so accommodating the cost uncertainties of PHM is needed for realistic ROI calculations. For example, Feldman et al. (2008) conducted a stochastic, Monte Carlo simulation for socket maintenance of a Boeing 737, yielding an average ROI of about 3.5. The two major factors for ROI analysis are the implementation costs (recurring, non-recurring, or infrastructural) and avoided costs (the changes to availability, reliability, maintainability, and failure avoidance) due to PHM application (Feldman et al. 2008).
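
To show why uncertain inputs call for a stochastic ROI calculation, here is a minimal Monte Carlo sketch in Python. It assumes the simple form ROI = (avoided cost − implementation cost) / implementation cost; the distributions and dollar figures are invented for illustration and do not reproduce the model or results of Feldman et al. (2008).

```python
import random
import statistics


def roi(avoided_cost, implementation_cost):
    """Return on investment: net benefit divided by the investment (assumed form)."""
    return (avoided_cost - implementation_cost) / implementation_cost


def simulate_roi(n_trials, seed=0):
    """Sample uncertain cost inputs and return one ROI value per trial."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        # Hypothetical implementation cost: triangular(low, high, mode).
        implementation = rng.triangular(80_000, 150_000, 100_000)
        # Hypothetical: 20 failure opportunities, each avoided with probability 0.3.
        failures_avoided = sum(1 for _ in range(20) if rng.random() < 0.3)
        # Hypothetical cost of one unplanned failure.
        cost_per_failure = rng.uniform(20_000, 60_000)
        results.append(roi(failures_avoided * cost_per_failure, implementation))
    return results


samples = simulate_roi(5000)
print("mean ROI:", round(statistics.mean(samples), 2))
print("share of trials with ROI < 0:", round(sum(s < 0 for s in samples) / len(samples), 3))
```

Reporting the share of trials with negative ROI alongside the mean gives a decision-maker a sense of the downside risk that a single average would hide.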

Tian et al. (2012) developed an optimization method for PHM based on physical programming to deal with the two main optimization objectives in PHM, namely the cost objective and the reliability objective, to determine the optimal replacement policy. This optimal risk methodology is used in the CBM software EXAKT, which has been used successfully in manufacturing and other industries to reduce maintenance costs up to 49 % per failure mode (Oliver Interactive Inc. 2014).

Banks and Merenich (2007) developed a general guideline for conducting a cost-benefit analysis (CBA) for PHM. Researchers at the Applied Research Laboratory (ARL) at The Pennsylvania State University created a software tool called the ARL Trade Space Visualizer (ATSV), which allows a user to explore a multi-dimensional space for complex system optimization (Banks and Merenich 2007). The user can evaluate the data through ‘what if’ scenarios, e.g., ‘What is the minimum failure rate to achieve an ROI of 8?’. Banks and Merenich (2007) applied the ATSV for a CBA of battery prognostics for military ground combat vehicles.

Another method is based on a change in business perspective: To help establish PHM as a revenue generator, Grubic et al. (2009) proposed that the true business reason for adopting PHM is as a product-service system (PSS), in which the emphasis shifts from selling a product to selling the use of a product. A machine tool life cycle simulator called MATHS was used to show an almost 13 % increase in availability for a PSS-based factory (Grubic et al. 2009).

Figure 3 summarizes the relationship of the PHM methods mentioned in this section. As seen in the figure, the key abilities of these methods should influence the future of PHM within discrete manufacturing systems.

Methods for discrete manufacturing systems

Many diagnostic and prognostic methods deal with continuous data or digital data but not the combination thereof. However, manufacturing systems are described as dynamic systems with states that change by discrete events (e.g., parameter changes) or continuous events (e.g., gradual performance degradation). In general, faults affect systems through both continuous and discrete dynamics as well as their interactions (Koutsoukos et al. 2001).

Some diagnostic methods have been created for manufacturing systems. Petri nets focus on structure modeling of discrete event systems, making Petri nets a good candidate for PHM with PLC manufacturing applications. Accordingly, Wu and Hsieh (2012) developed a real-time fuzzy Petri net approach to diagnose progressive faults in discrete manufacturing systems, which are usually caused by deterioration or aging. The prototype diagnoser was shown to have a 93 % accuracy for one implementation of a dual robot arm, being able to handle uncertainties and perform multiple fault diagnosis with a maximum diagnosis delay of eight steps (Wu and Hsieh 2012).

To aid the diagnosis of real-time embedded systems, Koutsoukos et al. (2001) presented a framework for modeling faults in hybrid systems. The diagnostic system is composed mainly of a system model, mode estimator, and decision-tree diagnosis. The model uses timed Petri nets, which can describe multiple simultaneous faults and stochastic fluctuations of physical activities. The hybrid model automatically generates the fault symptom table, which is used to produce a decision tree for diagnostic purposes. Koutsoukos et al. (2001) demonstrated the methodology for a laser printer with faults such as a broken belt and a worn roller. Similarly, Philippot et al. (2012) proposed a decentralized approach to diagnose discrete event systems (DESs).

To counteract the challenges associated with the amount of data and variables for process control, Yang and Lee (2012) developed a Bayesian Belief Network (BBN) for diagnostics and prognostics of semiconductor manufacturing systems. The BBN is a statistical model that quantifies probabilistic causal relationships among random variables, whether discrete or continuous (Yang and Lee 2012).
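As a minimal illustration of the kind of probabilistic causal reasoning a BBN supports, the sketch below applies Bayes' rule to a two-node network (a hidden fault node that influences an observed alarm node). The node names and all probabilities are hypothetical and are not taken from Yang and Lee (2012); a practical BBN would contain many more variables.

```python
def posterior_fault(p_fault, p_alarm_given_fault, p_alarm_given_ok):
    """Return P(fault | alarm) for a two-node network: fault -> alarm."""
    # Marginal probability of the alarm, summing over both states of the fault node.
    p_alarm = p_alarm_given_fault * p_fault + p_alarm_given_ok * (1.0 - p_fault)
    # Bayes' rule gives the updated belief in the fault after the alarm is observed.
    return p_alarm_given_fault * p_fault / p_alarm


# Hypothetical numbers: 2 % prior fault probability, 90 % detection, 5 % false alarms.
p = posterior_fault(p_fault=0.02, p_alarm_given_fault=0.90, p_alarm_given_ok=0.05)
print(f"P(fault | alarm) = {p:.3f}")
```

With these assumed numbers, a single alarm raises the belief in a fault from 2 % to roughly 27 %, which shows why low prior fault rates and false alarms must both be considered when interpreting a diagnostic signal.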

PHM system examples

Application of PHM for various mechanical and electrical components exists, but fewer examples exist for large systems, especially within manufacturing. This section focuses on notable examples of PHM-based systems for large systems, to help motivate future applications of PHM within manufacturing.

PHM-based software has been developed for maintenance applications of discrete systems. The Centre for Maintenance Optimization and Reliability Engineering (C-MORE) at the University of Toronto developed a PHM methodology that uses equipment age data, condition monitoring data, and data concerning the effects of failure and preventative replacement to produce optimal maintenance decisions and reliability functions for equipment (Montgomery et al. 2012). The basic model is composed of a continuous-time non-homogeneous discrete Markov process (Montgomery et al. 2012). The EXAKT software utilizes maintenance records, sensor data, and financial and environmental impacts to select the appropriate maintenance strategy (Oliver Interactive Inc. 2014).

Another example is a multi-symptom-domains behavior assessment methodology developed by the Center for Intelligent Maintenance Systems (IMS) at the University of Cincinnati for diagnosis and prognosis of various products and machines. The method uses behavior-based cerebellum computation, rather than a model-based computation, and is now part of a Watchdog Agent$^{\mathrm{TM}}$ “digital doctor” (Lee and Kramer 1993). This process overlaps the most recent data signatures with those observed during healthy processes or behavior; neither faulty data nor expert knowledge is needed, which is an advantage over other systems. Another advantage is that the Watchdog Agent$^{\mathrm{TM}}$ structure follows the Open Systems Architecture for Condition-Based Maintenance (OSA-CBM), which is a standard structure for PHM information management (MIMOSA 2013b). The Watchdog Agent$^{\mathrm{TM}}$ structure uses both stationary and non-stationary signal processing methods (autoregressive moving-average (ARMA) modeling, wavelets, principal component analysis, etc.) for quantitative health assessment (Djurdjanovic et al. 2003). Health information (condition, RUL, failure modes, etc.) can be conveyed in a radar chart and made accessible to existing management systems (enterprise, manufacturing, etc.), such as for performance prediction of Siemens rotary machinery (Lee et al. 2011).

Another system incorporates historical and human inputs for logistics. Camci et al. (2007) developed a PHM software tool for the U.S. Air Force that integrates PHM information (RUL, failure modes) and maintenance data (parts, personnel, tools, etc.) for real-time maintenance. The PHM algorithms learn some parameters based on feedback inputted from the human maintainer in response to questions. The evolving PHM algorithms then use historical and real-time data to recommend maintenance actions (Camci et al. 2007). Maintenance effectiveness, equipment availability, reliability, and costs are considered. Thus, the intelligent software develops maintenance solutions within open, dynamic, complex, and distributed environments.

The U.S. DoD has also devoted much effort to develop and advance PHM within major defense systems. The DoD has implemented, or is currently implementing, CBM$^{+}$ (see ‘PHM Frameworks’ section) technologies, processes, and procedures for various systems including Stryker (Army) armored vehicles, AH-64 Apache (Army) helicopters, the Integrated Condition Assessment System (Navy) on ships, the Light Armored Vehicle (Marine Corps), and the Joint Strike Fighter (United States Department of Defense 2008). Furthermore, CBM$^{+}$ is the maintenance component of the Common Logistics Operating Environment (CLOE) for the integration of information across the U.S. Army (United States Army 2014). Such systems will warn operators and field commanders of possible impending failures and assist in maintenance optimization (Ly et al. 2009).

In particular, PHM is a significant part of the JSF. The Autonomic Logistics Information System (ALIS) is its information infrastructure that includes PHM but is even broader; ALIS includes operations, maintenance, PHM, supply chain, customer support, training, and technical data (Lockheed Martin Corporation 2014). ALIS receives health information while the JSF is still in flight, enabling maintainers to prepare for parts before landing; downtime is minimized and efficiency is increased (Lockheed Martin Corporation 2014). The PHM system predicts faults, prognoses failures, tracks part usage, and recommends action to the pilot when necessary (Hess et al. 2004). Operations and maintenance costs for the JSF are estimated to be up to $1T over the next 50 years, so ALIS is vital for sustainment of the JSF fleet (Charette 2014). Similar to the human body in which functions occur autonomically, the JSF Autonomic Logistics system will operate nominally without human intervention to trigger maintenance actions for predicted or unpredicted failures (Hess et al. 2004).

Finally, some manufacturers have created PHM systems for product life cycle management of critical systems. For example, GE Aircraft Engines (GEAE) has characterized the performance, physics, and primary failure modes of their engines and can monitor engines (via 300 operating parameters) for faults and exceedances and then provide trend alerts to customers (Janasak and Beshears 2007). Another critical system is an automobile, and General Motors (GM) serves more than 6 million drivers with PHM for their vehicles through the OnStar service program (Holland et al. 2010). Each GM vehicle is equipped with sensors and an on-board processor, which can send information wirelessly for remote physics- and data-driven diagnostics and prognostics for battery life, tire pressure, and oil life.

Figure 4 summarizes these examples of PHM-based systems for large systems. These examples may help motivate future applications of PHM within manufacturing.

PHM frameworks

Currently, there is no standard maintenance strategy because the optimum maintenance strategy is unique for every company. To help enable the use of PHM concepts for system development, several frameworks have been created. These frameworks are distinguished from the previously discussed PHM examples in the sense that the frameworks are broad methodologies for application in the development of various systems within manufacturing and other industries.

Many industrial practitioners of maintenance models are overwhelmed due to the lack of time or knowledge to study and use these models. Consequently, Waeyenbergh and Pintelon (2009) developed the CIBOCOF framework (in English, the Centre for Industrial Management Maintenance Concept Development Framework), which combines traditional maintenance concepts into a decision support model. The CIBOCOF framework optimizes the maintenance system within various industries (e.g., tobacco, automotive, and electric power) through incremental and understandable steps such as flowcharts (Waeyenbergh and Pintelon 2009). The CIBOCOF framework focuses on various goals (not just cost) and aids in choosing the appropriate maintenance policy and the correct optimization model.

Another framework, more proposed than realized, is that of Lee et al. (2011) for “engineering immune systems,” which integrates PHM approaches and reliability concepts to achieve production with near-zero breakdowns and minimal human intervention. An engineering immune system (EIS) is reliable in function, robust against failure due to damage, invulnerable to threats, and resilient to disorders. Such a system reacts to disturbances to return the system to a stable state; an engineering immune system detects and adapts itself to anomalies. However, EIS is a new idea that is “still raw” and needs “further research” (Lee et al. 2011). Such a vision for EISs is in the spirit of that for autonomic computing presented by Paul Horn, a former Senior Vice President and Director of Research at IBM Corporation (Horn 2001). Autonomic systems are “self-managing” systems that are self-configuring, self-healing, self-optimizing, and self-protecting; such systems are needed to overcome the increasing complexity in the “next era of computing” and improve performance as well as the “total-cost-of-ownership equation” (Ganek and Corbi 2003). Future advances in diagnostic and prognostic algorithms and system-level “self” capabilities will help to achieve the vision of EISs.

In contrast to use of a theoretical framework, the military is taking steps towards realizing the implementation of its own PHM framework. The CBM$^{+}$ maintenance component mentioned in the ‘PHM System Examples’ section is actually more of a PHM framework. Since December 2007, the U.S. DoD has been shifting its maintenance process to Condition Based Maintenance Plus (CBM$^{+}$), an approach in which maintenance is scheduled based upon the “evidence of need” (United States Department of Defense 2013). Through the use of CBM$^{+}$, traditional “time before overhaul” (TBO)-based maintenance transitions to CBM with a smaller overall maintenance requirement (more predictive, less preventive and corrective maintenance). Asset readiness, safety, and maintainability should be improved through CBM$^{+}$ (United States Department of Defense 2008).

Similar in scope to the DoD effort, a Systems Analysis and Optimization (SA&O) process was developed by NASA Ames Research Center to determine the integrated vehicle health management (IVHM) framework of the Reusable Launch Vehicle (RLV); the process is general enough to be adaptable for non-IVHM related systems and designs (Datta et al. 2004). The SA&O process incorporates several modular discrete models, such as those for cost, operations, safety, reliability, false alarm rate, performance, and testability. The modular models relate to each other via inputs and outputs, allowing for any available software tools to provide the appropriate interfaces (Datta et al. 2004). For example, FMECA was obtained from domain experts as input for one RLV application. The IVHM SA&O process provides designers with a toolset to assess the impact of design decisions on the overall system requirements based on a set of 24 desired metrics that quantify the vehicle system safety, cost, and performance for an RLV mission. Consequently, NASA Ames’ SA&O process helps to optimize the system from a system-wide rather than a local perspective (Datta et al. 2004).

Ly et al. (2009) developed an integrated systems-based PHM framework for engineering systems. The systematic methodology integrates PHM elements into a single platform for application in various environments. The enabling technologies are based on health-monitoring software, data-processing methods for feature extraction, diagnostic and prognostic algorithms based on Bayesian estimation theory [specifically particle filtering (Orchard and Vachtsevanos 2007)], fatigue or degradation modeling, and real-time measurements (Ly et al. 2009). As soon as a fault is detected, the PDF for that time is used as an initial condition for the prognostic routines utilizing a nonlinear dynamic state-space model (Patrick et al. 2009).

For electronic systems, Sandborn and Wilkinson (2007) developed a model that utilizes discrete event simulations to determine the optimal PHM approach while taking into account the reliability (time-to-failure, operational hours per year, etc.) and business aspects (unscheduled versus scheduled repair cost, time to repair, etc.) of electronics. All inputs to the discrete event simulation are treated as probability distributions via stochastic analysis implemented as a Monte Carlo simulation.
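The trade-off between scheduled and unscheduled repair can be illustrated with a much-simplified Monte Carlo sketch. This is not the model of Sandborn and Wilkinson (2007): the Weibull time-to-failure distribution, its parameters, the cost values, and the fixed replacement interval are all assumptions chosen only to show how sampled inputs propagate to a cost estimate.

```python
import random


def mean_cost_per_hour(interval, n_trials=20000, seed=0,
                       cost_scheduled=1.0, cost_unscheduled=5.0):
    """Estimate maintenance cost per operating hour for a fixed replacement interval.

    Each trial is one life cycle of a part: it either fails before the
    scheduled replacement (unscheduled repair) or is replaced on schedule.
    """
    rng = random.Random(seed)  # seeded for repeatable results
    total_cost = 0.0
    total_hours = 0.0
    for _ in range(n_trials):
        # Assumed time to failure: Weibull with scale 1000 h and shape 3.
        time_to_failure = rng.weibullvariate(1000.0, 3.0)
        if time_to_failure < interval:
            total_cost += cost_unscheduled
            total_hours += time_to_failure
        else:
            total_cost += cost_scheduled
            total_hours += interval
    return total_cost / total_hours


for hours in (400.0, 600.0, 1000.0, 1e9):  # 1e9 h approximates run-to-failure
    print(f"interval {hours:>12.0f} h: cost rate {mean_cost_per_hour(hours):.5f}")
```

Replacing too early wastes useful life and replacing too late incurs the higher unscheduled cost, so the cost rate is typically minimized at an intermediate interval; a fuller model would also sample repair times, operating hours per year, and prognostic accuracy.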

Other frameworks approach PHM system development from a product lifecycle perspective. Raytheon developed a closed-loop health management systems (HMS) methodology to develop and field HMS capable products (Janasak and Beshears 2007). Raytheon’s HMS methodology contains five phases that span the product life cycle and follow the DoD 5000.2 Instruction: concept refinement, technology development, system development and demonstration, production and deployment, and operation and support (Beshears and Butler 2005). Key tasks for successful implementation of the HMS process were identified as trade studies to set up the HMS architecture, clear HMS requirements, product characterization through analysis, strategically-placed sensors, and data collection/maturation (Beshears and Butler 2005).

Another product lifecycle-based PHM methodology was developed by Shannon et al. (2011), who outlined an overall process for developing a product via four high-level tasks involving diagnostic testing and verification: design product, produce/manufacture product, operate product, and support product. Shannon et al. (2011) recommended the use of standards, whether existing or needing development, for testing and evaluation at various “break points” in the process. Some standards, such as the U.S. Army’s ADS-79D-HDBK, address aspects of testing for PHM systems (Vogl et al. 2014).

Future standards may include a formal notational framework for prognostics, like that developed by Saxena et al. (2010). This framework is composed of four new evaluation metrics: prognostic horizon, $\alpha $-$\lambda $ performance, relative accuracy, and convergence. Saxena et al. (2010) recommend that, for situations in which the normality of the end-of-life distributions is not proven, the median be used as a measure of location and quartiles be used as a measure of spread. The framework can be extended to include risk and cost-benefit analyses, effects of scheduled maintenance, and connections to KPIs through development of uncertainty representation and management (URM) methods (Saxena et al. 2010).
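Two of these ideas can be sketched briefly. The code below uses a commonly stated point-prediction form of relative accuracy and of the $\alpha$ bounds, together with the median and interquartile range as robust measures of location and spread. The function names, the sample values, and the reduction of a predicted distribution to a single point are simplifying assumptions; the formal definitions in Saxena et al. (2010) should be consulted for actual use.

```python
import statistics


def relative_accuracy(true_rul, predicted_rul):
    """1 minus the relative error of a remaining-useful-life prediction."""
    return 1.0 - abs(true_rul - predicted_rul) / true_rul


def within_alpha_bounds(true_rul, predicted_rul, alpha):
    """Check whether a prediction lies within +/- alpha of the true RUL."""
    return (1.0 - alpha) * true_rul <= predicted_rul <= (1.0 + alpha) * true_rul


def location_and_spread(samples):
    """Median and interquartile range of end-of-life samples."""
    q1, _, q3 = statistics.quantiles(samples, n=4)
    return statistics.median(samples), q3 - q1


# Hypothetical end-of-life predictions (hours) from repeated prognostic runs.
eol_samples = [80, 95, 100, 110, 150]
median, iqr = location_and_spread(eol_samples)
print(median, iqr)
print(relative_accuracy(true_rul=100.0, predicted_rul=90.0))
print(within_alpha_bounds(true_rul=100.0, predicted_rul=90.0, alpha=0.2))
```

The median and quartiles are less sensitive than the mean and standard deviation to the long tail in the sample (the 150 h value), which is the motivation for preferring them when normality is not established.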

Figure 5 summarizes some of the PHM frameworks that may become useful for smart manufacturing systems and factories of the future.

PHM best practices

PHM should be used for a paradigm shift in maintenance towards product life-cycle management in which products are continuously assessed (Djurdjanovic et al. 2003). Aspects of PHM should be based on a plant-wide basis, because PHM is fundamentally for optimization of a total plant (Koochaki et al. 2011). Questions remain as to how the ‘best,’ or most appropriate, PHM system is achieved, considering the vast differences among systems.

To help answer these open questions, this section presents many best practices that have gained some measure of acceptance by practitioners of PHM for manufacturing and defense applications, e.g., General Motors (Barajas and Srinivasa 2008) and the DoD (United States Department of Defense 2008). The best practices for the future of PHM systems are organized into the following subsections based on the categories seen in Fig. 2.

Cost benefit analyses and dependability analyses

Perhaps the most difficult aspect of PHM development is the determination of what system components should be monitored for prognostics, a step that significantly affects the system design. This step involves cost-benefit and/or dependability analyses (Barajas and Srinivasa 2008).

Cost-benefit analysis determines where PHM makes economic sense (United States Department of Defense 2008). The economic benefit of PHM systems is important to determine based on the return on investment (ROI), yet such a metric may be difficult to calculate (Ly et al. 2009). “Fear of initial investment” must be overcome, even though long-term PHM benefits vastly outweigh the startup costs with ROIs on the order of 10:1 (Barajas and Srinivasa 2008). Nonetheless, a business case is needed for optimization of PHM system implementation. The Army’s CBA considers development, procurement, operation, and maintenance costs, as well as monetary and non-monetary benefits (Greitzer et al. 2001). Options with the highest ROI and non-monetary benefits are generally implemented first.

Dependability analysis is the determination of failure mechanisms of critical components and their effects on the system. Such cause-effect relationships need to be identified and understood, e.g., via Failure Mode and Effects Analysis (FMEA) or Failure Mode, Effects, and Criticality Analysis (FMECA) (Janasak and Beshears 2007). FMEA is the most popular method for deterioration and failure analysis, yet is not used extensively in industry. Thus, software should be considered to support the use of FMEA (Takata et al. 2004). Reliability analysis should be used to determine the optimum maintenance tasks based on failure modes and consequences (safety, economic, etc.) (United States Department of Defense 2008).

Data requirements and management

For groups of machines or a general manufacturing system, PHM should be addressed with logistics support (Ly et al. 2009). To this end, reliable PHM data should be efficiently integrated into the company business process; information that is difficult to access or visualize will likely be ignored (Barajas and Srinivasa 2008). Such an integration requires sufficient data management architectures in which the data are easily usable at every needed manufacturing level. For real-time manufacturing plant controls, flexible “middleware” software should minimize hardware and software infrastructure dependencies and allow the management of computer resources (Pereira and Carro 2007).

One best practice of data management that manufacturing has been trending towards is the use of networks at all levels, providing increased reliability, safety, and diagnosability (Moyne and Tilbury 2007). Networking should allow PHM solutions to be deployable with existing advanced process control (APC) systems, but also be flexible for possible additions of health monitoring functions (Moyne et al. 2013).

Another best practice of PHM data management is the use of open-system architectures, which should be incorporated when designing hardware, software, and business processes for maximum interoperability, portability, and scalability (United States Department of Defense 2008; Ly et al. 2009). The next generation of dependable manufacturing systems requires good architectures, such as modular object-oriented systems (Pereira and Carro 2007). One example is a diagnostic system for process hazards analysis called PHASuite, which uses comprehensive knowledge via object-oriented methodologies and unified modeling language within an engineering framework (Venkatasubramanian 2005).

PHM architectures should be open to allow easy updates with knowledge bases and algorithms, providing a major advantage over legacy platforms (Hess et al. 2004). For aircraft, one best practice for open architectures is to develop the on-board and off-board system software together. Otherwise, when developed separately, diagnostic algorithms and analysis techniques have been shown to not reach their full potentials (Hess 2002). For manufacturing, such an information infrastructure is important for life-cycle management of maintenance data (Takata et al. 2004).

Towards the use of open system architectures, PHM developers should apply government and industry standards across an organization. Various standards for open systems architecture were developed by the International Organization for Standardization (ISO) and the Machinery Information Management Open Standards Alliance (MIMOSA), including ISO 13374, ISO 18435, OSA-CBM, and MIMOSA’s Open Systems Architecture for Enterprise Application Integration (OSA-EAI) (MIMOSA 2013a; United States Department of Defense 2008).

Best practices for data requirements serve those for data management. For example, PHM systems should generate an operating history of each component that is monitored, with the history used within the entire logistics system (Greitzer et al. 2001). Data fields should utilize shared databases and be populated automatically (United States Department of Defense 2008), requiring human intervention only when needed and enabling life cycle managers to characterize failures. Parts for maintenance should be ordered automatically, which enables just-in-time inventory management by eliminating excess inventory (Greitzer et al. 2001).

Measurement techniques

Best practices also exist for data collection, which is vital for PHM system maturation. Data should be collected early in the life cycle to help improve current and future programs (Janasak and Beshears 2007). Yet, PHM developers should invest prudently in sensor, data collection, and analytic capabilities to minimize errors (United States Department of Defense 2008). The Electronic Prognostics and Health Management Research Center at the University of Maryland categorized the main approaches for PHM implementation as the use of built-in-test (BIT); expendable devices, such as “canaries” and fuses; monitoring of parameters that are precursors to impending failure; and modeling of stress and damage due to exposure conditions (e.g., usage, temperature, and vibration) (Vichare and Pecht 2006).

With respect to measurement techniques, direct condition measurement is usually not possible, so PHM designers must be flexible enough to infer the condition from sensors and parameters that are in place for other functional purposes (Hess et al. 2005). Commercial off-the-shelf (COTS) applications should be used to promote the integration of maintenance and logistics information systems (United States Department of Defense 2008). For the data collected for PHM purposes, pre-processing routines should be utilized: sensor health validation, to reduce false alarms and increase PHM fidelity, and data de-noising, to improve signal-to-noise ratios (Ly et al. 2009). Finally, to increase the effectiveness of maintenance, maintenance technologies should be more extensively connected with design aspects such as product modeling and digital engineering (Takata et al. 2004).

Table 2 Best practices for PHM systems

Diagnostics and prognostics

Various best practices exist for the diagnostic and prognostic processes of PHM. As an essential part of PHM, data should be collected by sensors, de-noised, validated to handle missing and abnormal data, normalized between 0 and 1, and correlated (Das et al. 2011). Features should then be extracted as condition indicators in the time domain (mean, standard deviation, root mean square, kurtosis, etc.) or in the frequency or time-frequency domains through various methods (band filters, power spectral density analysis, cepstrum analysis, wavelet analysis, etc.). The metrics are then used with various machine/statistical learning methods (e.g., decision tree learning, neural networks, support vector machines, Bayesian networks) to build a model that correlates the metrics to the system behavior (Das et al. 2011).
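The normalization and time-domain feature steps can be sketched in a few lines. The code below is a minimal illustration using only the Python standard library; the function names are hypothetical, the variance and kurtosis use population (divide-by-n) definitions, and the kurtosis is the non-excess form (approximately 3 for Gaussian data).

```python
import math


def min_max_normalize(signal):
    """Scale a signal linearly so that its values lie between 0 and 1."""
    lo, hi = min(signal), max(signal)
    if hi == lo:  # constant signal: avoid division by zero
        return [0.0 for _ in signal]
    return [(v - lo) / (hi - lo) for v in signal]


def time_domain_features(signal):
    """Return common time-domain condition indicators for one signal window."""
    n = len(signal)
    mean = sum(signal) / n
    variance = sum((v - mean) ** 2 for v in signal) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    fourth_moment = sum((v - mean) ** 4 for v in signal) / n
    kurtosis = fourth_moment / variance ** 2 if variance > 0 else float("nan")
    return {"mean": mean, "std": math.sqrt(variance), "rms": rms, "kurtosis": kurtosis}


window = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical sensor samples
print(min_max_normalize(window))
print(time_domain_features(window))
```

In practice these indicators would be computed over successive windows of de-noised, validated data, and the resulting feature vectors would serve as inputs to the learning methods listed above.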

Fault detection, isolation, and prediction capabilities should be designed for PHM systems (United States Department of Defense 2008). PHM practitioners should continue to develop new methods and methodologies, yet leverage the existing methods into reusable modules that can be broadly applied to future PHM systems (Janasak and Beshears 2007). Physics-based modeling should be used for critical elements for a better understanding of failure mechanisms and propagation times, and a methodology should be utilized for incipient failure detection with a specified degree of confidence and given false alarm rate (Ly et al. 2009). Expert system software should be considered, as well, for accurate condition-based monitoring (United States Department of Defense 2008).

Another best practice is quick prognostics based on comparison of monitored operating conditions and a look-up table, generated a priori, which relates specific operating conditions to accumulated damage, e.g., stress rupture due to creep (Roemer et al. 2001). Alternatively, reduced-order models may be used for on-line PHM, which is based on off-line, and often time-consuming, computational analysis. RUL predictions can also be based on various data sources, so prognostic methods should be designed with flexibility for the use of data from multiple sources (United States Department of Defense 2008).
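The look-up table approach can be sketched as follows. The temperature bins, damage rates, linear damage accumulation, and failure threshold of 1.0 are all hypothetical assumptions for illustration and are not taken from Roemer et al. (2001); in practice the table would come from off-line physics-based analysis.

```python
import bisect

# Hypothetical table generated a priori: temperature bins (deg C) and the
# fraction of life consumed per operating hour within each bin.
BIN_EDGES = [500.0, 600.0, 700.0]            # bins: <=500, (500, 600], (600, 700], >700
DAMAGE_PER_HOUR = [1e-5, 5e-5, 2e-4, 1e-3]   # one rate per bin


def damage_rate(temperature):
    """Look up the damage rate for a monitored operating temperature."""
    return DAMAGE_PER_HOUR[bisect.bisect_left(BIN_EDGES, temperature)]


def accumulated_damage(history):
    """Sum damage over (temperature, hours) pairs, assuming linear accumulation."""
    return sum(damage_rate(temp) * hours for temp, hours in history)


def remaining_hours(history, current_temperature):
    """Estimate remaining life if the current operating condition persists."""
    remaining_fraction = max(0.0, 1.0 - accumulated_damage(history))
    return remaining_fraction / damage_rate(current_temperature)


usage = [(450.0, 10000.0), (650.0, 2000.0)]  # hypothetical operating history
print(accumulated_damage(usage))
print(remaining_hours(usage, current_temperature=550.0))
```

Because each update is only a table look-up and a multiplication, this kind of estimate is cheap enough to run on-line, at the cost of the accuracy lost by binning the operating conditions.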

Finally, PHM metrics should be based on CBM information for performance assessments (United States Department of Defense 2008). Specifically, feature analysis and condition metrics should be selected and extracted for accurate and reliable fault diagnosis and failure prognosis (Ly et al. 2009). PHM will not work well when performance metrics are ill-defined or inconsistent; the use of predicted KPIs is the basis for PHM success (Barajas and Srinivasa 2008).

Testing and training

Testing of PHM systems appears to be a very specialized, yet critical, process for PHM system development. Future standards could be developed to address verification and validation (V&V) without relying heavily on current examples. Verification and validation is needed for PHM systems, and one such approach is seeded fault testing, if possible, to test the robustness of algorithms. A prognostics model must be trained and then tested on different data. Techniques such as ensemble learning and cross validation can then be used to choose the best prediction method based on performance (Das et al. 2011).
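Training and testing on different data, and choosing a prediction method by cross validation, can be illustrated with a small standard-library sketch. The two candidate models (a constant mean and a least-squares line), the synthetic degradation data, and the interleaved fold assignment are assumptions made for illustration only.

```python
def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least-squares straight line through the training data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def cross_val_mse(fit, xs, ys, k=5):
    """Mean squared error over k folds; each sample is held out exactly once."""
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample forms the test fold
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)      # train only on the remaining folds
        squared_errors.extend((model(xs[i]) - ys[i]) ** 2 for i in test_idx)
    return sum(squared_errors) / len(squared_errors)


# Synthetic degradation trend with a small deterministic disturbance.
xs = list(range(20))
ys = [2.0 * x + 1.0 + (0.5 if x % 2 else -0.5) for x in xs]

scores = {"mean": cross_val_mse(fit_mean, xs, ys), "line": cross_val_mse(fit_line, xs, ys)}
best = min(scores, key=scores.get)
print(scores, best)
```

The key point is that each model is scored only on samples it did not see during fitting, so the comparison reflects predictive performance rather than goodness of fit to the training data.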

Another important best practice that appears to be lacking is the incorporation of the “human factor” in PHM systems; human-driven predictive diagnostics by the right people with the right information is required (Barajas and Srinivasa 2008). Also, any PHM system must be accepted and utilized by trained personnel. Hence, Bird et al. (2014) extended the generic taxonomy for IVHM developed by Jennions (2011) to include a comprehensive list of skills and capabilities needed by workers in the PHM field. The proposed taxonomy can be used by employers and practitioners in PHM for hiring and training (Bird et al. 2014).

Summary of best practices

Table 2 summarizes some of the key best practices of PHM systems that could be generally adopted to enable the future of PHM within smart manufacturing systems. As seen in Table 2, the best practices include the adoption of networking and data management for open system architectures as well as flexible diagnostic and prognostic methods that can be verified and validated.

PHM enablers for manufacturing

While the research described in the previous sections summarizes the work that has occurred to advance prognostics and health management and highlights the promise of PHM, the field remains an emerging discipline. There are challenges and needs that must still be overcome to encourage wide adoption of PHM, especially in the manufacturing domain. Some of the most critical remaining challenges include real-time diagnostic and prognostic methods, standards for PHM system evaluation, and data integration within user-friendly PHM systems. Despite these challenges, manufacturers have a strong interest in PHM since they desire improved diagnostic capabilities and view prediction as the logical goal of digital technologies in manufacturing (Helu and Weiss 2016).

The recognized need for PHM as part of the larger theme of advanced manufacturing has motivated much of the current research aimed at enabling PHM solutions for manufacturing systems. For example, the National Institute of Standards and Technology (NIST) has been focused on developing methods, protocols, and tools to enable robust sensing, monitoring, diagnosis, prognosis, and control for PHM in manufacturing (National Institute of Standards and Technology 2015, 2016). A significant part of this research is the design and use of test beds to support the development of PHM across multiple control levels in a manufacturing system (Vogl et al. 2015). The goal of these test beds is to generate:

Data and information requirements to enable interoperability between various heterogeneous manufacturing systems
Reference architectures and implementations to aid the design and execution of PHM systems and applications
Best practices and guidelines to use PHM systems and applications
Reference datasets and test scenarios to support PHM research and the evaluation, verification, and validation of PHM capabilities
Physical infrastructure to evaluate enhancements to standards and technologies for PHM and educate the manufacturing community of the promise of PHM through demonstration

Each of these products would promote PHM solutions in the manufacturing domain by providing manufacturers with the support needed to implement, test, and use a PHM system. These products can also identify opportunities and define requirements for standards and technologies not yet envisioned that can enable the use of PHM across larger segments of the manufacturing domain. Towards this end, Fig. 6 shows a schematic of how the various PHM-related test beds at NIST are interconnected to span multiple control levels in a manufacturing system.

Current standardization activities, especially in the area of data interoperability, are also important enablers for PHM in manufacturing. Potentially valuable data and information exist in various manufacturing systems from the devices that exist on the shop floor (e.g., production equipment) to the higher-level manufacturing execution (MES) and enterprise resource planning (ERP) systems. However, the data and information may not be presented and interpreted consistently by every system. It may also be difficult to transport the data and information between systems because of a lack of common interfaces.

One standard that seeks to address data interoperability issues in the manufacturing domain is MTConnect. MTConnect is an open-source, read-only, data-exchange standard for manufacturing equipment and applications developed by the Association for Manufacturing Technology (MTConnect Institute 2015). It provides a common vocabulary and structure for manufacturing data to enhance the data acquisition capabilities of manufacturing equipment and applications. The standard also includes communications protocols to provide access to manufacturing data. MTConnect addresses real-time and near-real-time data from manufacturing equipment (e.g., the current speed or position from a machine tool controller), and enhancements to the standard currently under development intend to extend the standard to enable interoperability between systems and applications across the equipment, facility, and enterprise levels of a manufacturing system. MTConnect is an important enabler of PHM by providing access to data and information on the as-executed status of a part, which is needed to identify and diagnose problems with manufacturing equipment and systems as well as predict problems before they may occur.
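To suggest how a PHM application might consume such data, the sketch below parses a small XML fragment with the Python standard library. The fragment is only loosely modelled on an MTConnect streams document: it is hand-written, omits the XML namespace and header, and uses hypothetical device names and values, so the actual MTConnect schema should be consulted before relying on any element or attribute name shown here.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written fragment (an assumption, not schema-valid MTConnect).
SAMPLE_XML = """
<MTConnectStreams>
  <Streams>
    <DeviceStream name="mill-1">
      <ComponentStream component="Rotary" name="C">
        <Samples>
          <SpindleSpeed dataItemId="c1" timestamp="2016-01-01T00:00:00Z">1200</SpindleSpeed>
        </Samples>
      </ComponentStream>
      <ComponentStream component="Linear" name="X">
        <Samples>
          <Position dataItemId="x1" timestamp="2016-01-01T00:00:00Z">12.5</Position>
          <Position dataItemId="x1" timestamp="2016-01-01T00:00:01Z">UNAVAILABLE</Position>
        </Samples>
      </ComponentStream>
    </DeviceStream>
  </Streams>
</MTConnectStreams>
"""


def read_samples(xml_text):
    """Return (component, data type, timestamp, value) tuples from the fragment."""
    root = ET.fromstring(xml_text.strip())
    readings = []
    for component in root.iter("ComponentStream"):
        for samples in component.findall("Samples"):
            for item in samples:
                try:
                    value = float(item.text)
                except (TypeError, ValueError):
                    value = None  # non-numeric readings are kept as missing values
                readings.append((component.get("name"), item.tag,
                                 item.get("timestamp"), value))
    return readings


for reading in read_samples(SAMPLE_XML):
    print(reading)
```

Keeping missing readings explicit, rather than dropping them, lets downstream validation steps decide how to handle gaps in the data before features are extracted.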

Conclusions

As seen in Fig. 1, production and maintenance paradigms have evolved with technological advances and societal needs. Production is becoming more customized yet efficient in a globally competitive environment. Production is no longer expected to halt so that broken systems can be fixed (reactive maintenance); rather, it is heading towards the ideal of self-maintenance, in which systems monitor themselves. In such a scenario, prognostics and health management would help manufacturers optimize production and prevent production stoppages.

As shown herein, prognostics and health management remains an emerging discipline but shows promise because of its successful application in various manufacturing processes and products. However, there are challenges and needs that must be addressed for the widespread realization of PHM within manufacturing. Based on current capabilities, the critical challenges are real-time diagnostic and prognostic methods, standards for PHM system evaluation, and the integration of data (from sensors, PLCs, experts, etc.) within user-friendly PHM systems.

Despite these challenges, successful implementation of PHM has led to certain best practices gained by practitioners of PHM for manufacturing and defense applications. Some of the most critical best practices to be adopted are related to networking and data management for open system architectures as well as flexible diagnostic and prognostic methods that can be verified and validated. Specifically, PHM data should be efficiently integrated into the company business process, and PHM methods should be both reliable and flexible for use with multiple data sources.

As challenges are overcome and best practices are implemented and updated, PHM will help manufacturing systems evolve into ‘smart’ manufacturing systems for the realization of the self-maintenance paradigm. Towards this end, collaborations among PHM experts are recommended for the generation of solutions that fill high-priority gaps for manufacturing systems. These solutions may be aided by and help influence the creation and extension of standards related to PHM (Vogl et al. 2014) and data interoperability in the manufacturing domain (MTConnect Institute 2015).

Notes

Official contribution of the National Institute of Standards and Technology (NIST); not subject to copyright in the United States. Certain commercial products, some of which are either registered or trademarked, are identified in this paper in order to adequately specify certain procedures. In no case does such identification imply recommendation or endorsement by NIST, nor does it imply that the materials, equipment, or software identified are necessarily the best available for the purpose.

Abbreviations

APC: Advanced process control
ANN: Artificial neural network
CBM: Condition-based maintenance
CBA: Cost-benefit analysis
DES: Discrete events system
FMEA: Failure mode and effects analysis
FMECA: Failure mode, effects, and criticality analysis
FTA: Fault tree analysis
IVHM: Integrated vehicle health management
KPI: Key performance indicator
PCA: Principal component analysis
PDF: Probability density function
PHM: Prognostics and health management
PLC: Programmable logic controller
RUL: Remaining useful life
ROI: Return on investment
SVM: Support vector machine
TBM: Time-based maintenance

References

Ahmad, R., & Kamaruddin, S. (2012). An overview of time-based and condition-based maintenance in industrial application. Computers & Industrial Engineering, 63(1), 135–149.
Banks, J., & Merenich, J. (2007). Cost benefit analysis for asset health management technology. In 53rd Annual Reliability and Maintainability Symposium, RAMS 2007 (pp. 95–100).
Barajas, L. G., & Srinivasa, N. (2008). Real-time diagnostics, prognostics health management for large-scale manufacturing maintenance systems. In ASME International Manufacturing Science and Engineering Conference, MSEC2008 (pp. 85–94).
Beshears, R., & Butler, L. (2005). Designing for health: A methodology for integrated diagnostics/prognostics. In 2005 IEEE Autotestcon (pp. 90–95).
Bird, J., Madge, N., & Reichard, K. (2014). Towards a capabilities taxonomy for prognostics and health management. International Journal of Prognostics and Health Management, 5(2).
Camci, F., Valentine, G. S., & Navarra, K. (2007). Methodologies for integration of PHM systems with maintenance data. In 2007 IEEE Aerospace Conference (p. 4161674).
Charette, R. N. (2014). Software testing problems continue to plague F-35 joint strike fighter program. http://spectrum.ieee.org/riskfactor/aerospace/aviation/software-testing-problems-continue-to-plague-f35-joint-strike-fighter-program.
Das, S., Hall, R., Herzog, S., Harrison, G., & Bodkin, M. (2011). Essential steps in prognostic health management. In 2011 IEEE International Conference on Prognostics and Health Management, PHM 2011.
Datta, K., Jize, N., Maclise, D., & Goggin, D. (2004). An IVHM systems analysis & optimization process. In 2004 IEEE Aerospace Conference (pp. 3706–3716).
Djurdjanovic, D., Lee, J., & Ni, J. (2003). Watchdog agent-an infotronics-based prognostics approach for product performance degradation assessment and prediction. Advanced Engineering Informatics, 17(3–4), 109–125.
Drummond, C., & Yang, C. (2008). Reverse-engineering costs: How much will a prognostic algorithm save? In International Conference on Prognostics and Health Management.
Engel, S. J., Gilmartin, B. J., Bongort, K., & Hess, A. (2000). Prognostics, the real issues involved with predicting life remaining. In 2000 IEEE Aerospace Conference (pp. 457–469).
Feldman, K., Sandborn, P., & Jazouli, T. (2008). The analysis of return on investment for PHM applied to electronic systems. In 2008 International Conference on Prognostics and Health Management, PHM 2008.
Ganek, A. G., & Corbi, T. A. (2003). The dawning of the autonomic computing era. IBM Systems Journal, 42(1), 5–18.
Greitzer, F. L., Hostick, C. J., Rhoads, R. E., & Keeney, M. (2001). Determining how to do prognostics, and then determining what to do with it. In AUTOTESTCON Proceedings, 2001. IEEE Systems Readiness Technology Conference (pp. 780–792).
Grubic, T., Jennions, I., & Baines, T. (2009). The interaction of PSS and PHM - a mutual benefit case. In 2009 Prognostics and System Health Management Conference, PHM 2009.
Helu, M., & Weiss, B. A. (2016). The current state of sensing, health management, and control for small-to-medium-sized manufacturers. In ASME 2016 Manufacturing Science and Engineering Conference, MSEC2016.
Hess, A. (2002). Prognostics, from the need to reality-from the fleet users and PHM system designer/developers perspectives. In 2002 IEEE Aerospace Conference (pp. 2791–2797).
Hess, A., Calvello, G., & Dabney, T. (2004). PHM a key enabler for the JSF autonomic logistics support concept. In 2004 IEEE Aerospace Conference (pp. 3543–3550).
Hess, A., Calvello, G., & Frith, P. (2005). Challenges, issues, and lessons learned chasing the “Big P”: Real predictive prognostics Part 1. In 2005 IEEE Aerospace Conference (pp. 3610–3619).
Holland, S. W., Barajas, L. G., Salman, M., & Zhang, Y. (2010). PHM for automotive manufacturing & vehicle applications. In Prognostics & Health Management Conference.
Horn, P. (2001). Autonomic computing: IBM’s perspective on the state of information technology. Armonk, New York: International Business Machines Corporation.
Janasak, K. M., & Beshears, R. R. (2007). Diagnostics to prognostics—a product availability technology evolution. In Reliability and Maintainability Symposium, 2007 (RAMS ’07) (pp. 113–118).
Jennions, I. K. (2011). Integrated vehicle health management: Perspectives on an emerging field. Warrendale, PA: SAE International.
Jovane, F., Koren, Y., & Boer, C. R. (2003). Present and future of flexible automation: Towards new paradigms. CIRP Annals - Manufacturing Technology, 52(2), 543–560.
Kalgren, P. W., Byington, C. S., Roemer, M. J., & Watson, M. J. (2007). Defining PHM, a lexical evolution of maintenance and logistics. In 2006 IEEE AUTOTESTCON - IEEE Systems Readiness Technology Conference (pp. 353–358).
Koochaki, J., Bokhorst, J., Wortmann, H., & Klingenberg, W. (2011). Evaluating condition based maintenance effectiveness for two processes in series. Journal of Quality in Maintenance Engineering, 17(4), 398–414.
Kothamasu, R., Huang, S. H., & VerDuin, W. H. (2006). System health monitoring and prognostics - a review of current paradigms and practices. The International Journal of Advanced Manufacturing Technology, 28, 1012–1024.
Koutsoukos, X., Zhao, F., Haussecker, H., Reich, J., & Cheung, P. (2001). Fault modeling for monitoring and diagnosis of sensor-rich hybrid systems. In 40th IEEE Conference on Decision and Control (CDC) (pp. 793–801).
Lee, J., Ghaffari, M., & Elmeligy, S. (2011). Self-maintenance and engineering immune systems: Towards smarter machines and manufacturing systems. Annual Reviews in Control, 35(1), 111–122.
Lee, J., & Kramer, B. M. (1993). Analysis of machine degradation using a neural network based pattern discrimination model. Journal of Manufacturing Systems, 12(5), 379–387.
Lee, J., Lapira, E., Bagheri, B., & Kao, H.-A. (2013). Recent advances and trends in predictive manufacturing systems in big data environment. Manufacturing Letters, 1(1), 38–41.
Lockheed Martin Corporation (2014). ALIS: The F-35 Information Infrastructure. http://www.lockheedmartin.com/us/products/f35/f35-sustainment/alis.html.
Ly, C., Tom, K., Byington, C. S., Patrick, R., & Vachtsevanos, G. J. (2009). Fault diagnosis and failure prognosis for engineering systems: A global perspective. In 2009 IEEE International Conference on Automation Science and Engineering, CASE 2009 (pp. 108–115).
MIMOSA (2013a). MIMOSA’s open system architecture for enterprise application integration (OSA-EAI). http://www.mimosa.org/?q=node/300.
MIMOSA (2013b). OSA-CBM 3.3.0. http://www.mimosa.org/?q=resources/specs/osa-cbm-330.
Montgomery, N., Banjevic, D., & Jardine, A. K. S. (2012). Minor maintenance actions and their impact on diagnostic and prognostic CBM models. Journal of Intelligent Manufacturing, 23(2), 303–311.
Moyne, J., Iskandar, J., Hawkins, P., Furest, A., Pollard, B., Walker, T., et al. (2013). Deploying an equipment health monitoring dashboard and assessing predictive maintenance. In Advanced Semiconductor Manufacturing Conference (ASMC), 2013 24th Annual SEMI (pp. 105–110).
Moyne, J., & Tilbury, D. (2007). The emergence of industrial control networks for manufacturing control, diagnostics, and safety data. Proceedings of the IEEE, 95(1), 29–47.
MTConnect Institute (2015). MTConnect v. 1.3.1. http://www.mtconnect.org/standard?terms=on.
National Institute of Standards and Technology (2015). Measurement science roadmap for prognostics and health management for smart manufacturing systems. http://www.nist.gov/el/isd/upload/Measurement-Science-Roadmapping-Workshop-Final-Report.pdf.
National Institute of Standards and Technology (2016). Prognostics, health management, and control (PHMC). http://www.nist.gov/el/isd/ks/phmc.cfm.
Oliver Interactive Inc. (2014). EXAKT: Condition based maintenance software. http://www.oliver-group.com/exakt.asp.
Orchard, M. E., & Vachtsevanos, G. J. (2007). A particle filtering-based framework for real-time fault diagnosis and failure prognosis in a turbine engine. In 2007 Mediterranean Conference on Control and Automation, MED.
Patrick, R., Smith, M. J., Zhang, B., Byington, C. S., Vachtsevanos, G. J., & Del Rosario, R. (2009). Diagnostic enhancements for air vehicle HUMS to increase prognostic system effectiveness. In 2009 IEEE Aerospace Conference, March 7–14.
Peng, Y., Dong, M., & Zuo, M. J. (2010). Current status of machine prognostics in condition-based maintenance: A review. The International Journal of Advanced Manufacturing Technology, 50(1–4), 297–313.
Pereira, C. E., & Carro, L. (2007). Distributed real-time embedded systems: Recent advances, future trends and their impact on manufacturing plant control. Annual Reviews in Control, 31(1), 81–92.
Philippot, A., Marangé, P., Carré-Ménétrier, V., & Riera, B. (2012). Implementation of diagnosis approach for discrete event systems. In International Symposium on Security and Safety of Complex Systems, 2SCS’12.
Roemer, M. J., Nwadiogbu, E., & Bloor, G. (2001). Development of diagnostic and prognostic technologies for aerospace health management applications. In 2001 IEEE Aerospace Conference (pp. 3139–3147).
Sandborn, P. A., & Wilkinson, C. (2007). A maintenance planning and business case development model for the application of prognostics and health management (PHM) to electronic systems. Microelectronics Reliability, 47(12), 1889–1901.
Saxena, A., Celaya, J., Saha, B., Saha, S., & Goebel, K. (2010). Metrics for offline evaluation of prognostic performance. International Journal of Prognostics and Health Management, 1, 1.
Schwabacher, M. A. (2005). A survey of data-driven prognostics. In InfoTech at Aerospace: Advancing Contemporary Aerospace Technologies and Their Integration, September 26–29 (pp. 887–891).
Shannon, R., & Knecht, J. (2010). Optimizing diagnostic verification processes. In 45 Years of Support Innovation - Moving Forward at the Speed of Light, AUTOTESTCON 2010 (pp. 222–226).
Shannon, R., Modi, M., & Stanco, J. (2011). Achieving optimum test by applying standardization. In Systems Readiness Technology Conference: “Transforming Maintenance through Advanced Test, Diagnosis and Prognosis”, AUTOTESTCON 2011 (pp. 130–134).
Sun, B., Zeng, S., Kang, R., & Pecht, M. (2010). Benefits analysis of prognostics in systems. In 2010 Prognostics and System Health Management Conference, PHM 2010.
Takata, S., Kimura, F., Van Houten, F., Westkamper, E., Shpitalni, M., Ceglarek, D., et al. (2004). Maintenance: Changing role in life cycle management. CIRP Annals - Manufacturing Technology, 53(2), 643–655.
Tian, Z., Lin, D., & Wu, B. (2012). Condition based maintenance optimization considering multiple objectives. Journal of Intelligent Manufacturing, 23(2), 333–340.
United States Army (2014). CBM+ is the maintenance component of CLOE. https://lia.army.mil/CLOE/CBM.htm.
United States Department of Defense (2008). Condition Based Maintenance Plus DoD Guidebook. https://acc.dau.mil/cbm-guidebook.
United States Department of Defense (2013). Condition Based Maintenance Plus (CBM+): 2013 CBM+ Plan. http://www.acq.osd.mil/log/mpp/cbm+/docs/CBM+_Plan2013v2.pdf.
Veldman, J., Wortmann, H., & Klingenberg, W. (2011). Typology of condition based maintenance. Journal of Quality in Maintenance Engineering, 17(2), 183–202.
Venkatasubramanian, V. (2005). Prognostic and diagnostic monitoring of complex systems for product lifecycle management: Challenges and opportunities. Computers & Chemical Engineering, 29(6), 1253–1263.
Vichare, N. M., & Pecht, M. G. (2006). Prognostics and health management of electronics. IEEE Transactions on Components and Packaging Technologies, 29(1), 222–229.
Vogl, G. W., Weiss, B. A., & Donmez, M. A. (2014). Standards related to prognostics and health management (PHM) for manufacturing. National Institute of Standards and Technology (NIST), Gaithersburg, Maryland, USA, NISTIR 8012. doi:10.6028/NIST.IR.8012
Vogl, G. W., Weiss, B. A., & Donmez, M. A. (2015). A sensor-based method for diagnostics of machine tool linear axes. Paper presented at the Annual Conference of the Prognostics and Health Management Society 2015, Coronado, CA, October 18–24, 2015
Waeyenbergh, G., & Pintelon, L. (2009). CIBOCOF: A framework for industrial maintenance concept development. International Journal of Production Economics, 121(2), 633–640.
Wu, Z., & Hsieh, S.-J. (2012). A realtime fuzzy Petri net diagnoser for detecting progressive faults in PLC based discrete manufacturing system. The International Journal of Advanced Manufacturing Technology, 61(1–4), 405–421.
Yang, L., & Lee, J. (2012). Bayesian belief network-based approach for diagnostics and prognostics of semiconductor manufacturing systems. Robotics and Computer-Integrated Manufacturing, 28(1), 66–74.

Author information

Authors and Affiliations

Engineering Laboratory, National Institute of Standards and Technology (NIST), 100 Bureau Drive, Gaithersburg, MD, 20899-8220, USA
Gregory W. Vogl, Brian A. Weiss & Moneer Helu

Corresponding author

Correspondence to Gregory W. Vogl.

About this article

Cite this article

Vogl, G.W., Weiss, B.A. & Helu, M. A review of diagnostic and prognostic capabilities and best practices for manufacturing. J Intell Manuf 30, 79–95 (2019). https://doi.org/10.1007/s10845-016-1228-8

Received: 29 October 2015
Accepted: 21 May 2016
Published: 09 June 2016
Issue Date: 31 January 2019
DOI: https://doi.org/10.1007/s10845-016-1228-8
