1 Introduction

1.1 The Ultimate System

In a perfect scenario, one would like to have a condition monitoring system (CMS) that requires only mounting the sensors and then pressing the “START” button or plugging in the embedded system, and that provides completely reliable information about each machine part in a form like “Bearing degradation level: 77% (8 weeks to critical failure)”. If so, why not connect this reliable system to the maintenance planning system and let it order parts and schedule repairs automatically? As good as this sounds, today it would be difficult to find any CEO who would agree to a system that takes over financial strategy (potentially turning it into a deadly scheme). On the one hand, industry increasingly calls for “intelligent”, “smart”, “autonomous” systems capable of automated data collection and analysis. On the other hand, manufacturers try to achieve this “smart” status with minimal modifications to the systems they currently offer, because these systems are reliable, effective, and, most importantly, verified. This paper therefore attempts to explain what a “smart” system actually means, how this “smartness” is achieved, and what consequences it has for overall CMS performance. The paper is conceptual in character.

A smart CMS, by definition, aims to automate all actions within condition monitoring. Of these, the graphical machine-operator interface draws the most attention, simply because it is the most eye-catching, as demonstrated in Fig. 1.

Fig. 1 Exemplary visualization of a smart CMS [available @ Allied Reliability_eBook_Industrial Evolution.pdf]

The remaining parts, including selection and configuration, stay in the shadows for the reasons given in the paper. Imagine a beginner using equipment that gives information like “Large imbalance detected. Stop and fix.” Such a user would probably be very satisfied and would order an immediate repair. An experienced machine operator in the same situation, on the other hand, would ask the system for the velocity order spectrum. Consequently, a system’s advanced diagnostic options (like data selection and spectrum display) are sometimes desired and sometimes detrimental. For many years, this observation has led CMS manufacturers to maintain large CMS portfolios, typically ranging from basic 1- or 2-channel devices with basic scalar diagnostic estimators, through portable data analyzers and wireless systems, to multi-channel distributed systems with separate modules for data collection and data analysis. Naturally, over the years, many companies have developed platforms that integrate data from any of the listed types of equipment, like Emerson® Plantweb Optics™ [1]. Other providers, like Allied Reliability®, recommend the external PTC ThingWorx platform [2].

1.2 What Is Smart Monitoring?

It is hard to tell, because nearly all commercially available condition monitoring systems claim to be smart. For instance, Smart Condition Monitoring from Mitsubishi Electric™ claims to create a “memory map” of the normal operating condition, to use “sophisticated algorithms” to detect an abnormal state, and to offer a “better understanding” of machine defects thanks to a “higher level network”. Meanwhile, GE states that it uses the same algorithms as “big data companies”, analyzing the current and past behavior of the plant. Allied Reliability promotes SMARTCMB as a system that is IIoT (Industrial Internet of Things) “ready” and that increases uptime and decreases maintenance cost. Others [3] emphasize the role of smartphones in enhancing the effectiveness of condition monitoring systems for reliable machinery protection. Finally, some of the latest solutions, like [4], refer to smart “on-site machine diagnostics” as an alternative to “traditional cloud-based technology”. Obviously, such contradictory scope can be confusing.

1.3 Smart Systems Versus Smart Staff

Smart CMS offer “easier” installation and “easier” data analysis. During system commissioning, easier installation typically means more default settings in the system configuration. Easier data analysis can be realized in two general ways. In the first, automated machine diagnostics is realized as a simple transformation of predefined data containers into descriptive information. For instance, the amplitude of a shaft order could be tracked and converted to an “Imbalance” level. The second general set of methods refers to data science analysis, like pattern recognition or artificial neural network (ANN) algorithms. In this case, the operator is somewhat compelled to “believe” the system outcome. In both cases, smart systems inevitably, subtly yet craftily, remove skilled workers from individual steps of the condition monitoring process. The key point is to analyze which steps of human work could be efficiently replaced by a program and which could not. Of course, the answer to this question is not simple; nevertheless, the answer that all actions could be successfully replaced seems incorrect today. For practical purposes, the paper shows a few examples of successful implementation of smart methods in CMS. It is worth mentioning that many diagnostic engineers from large companies complain that they are regularly moved from one department to another, which prevents them from mastering a specific branch of technical science. Consequently, many machine diagnostic engineers do not have a solid background in classical condition monitoring methods; therefore, they tend to overestimate the capabilities of “smart” systems, believing them to be a perfect remedy for all their concerns.
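The first approach, in which a tracked shaft-order amplitude is translated into a descriptive “Imbalance” level, can be illustrated with a minimal sketch. It assumes a constant shaft speed and a record covering a whole number of revolutions; the function names, the level limits, and the synthetic signal are hypothetical and are not taken from any particular CMS.

```python
import math

# Hypothetical upper limits (velocity peak, mm/s) for each descriptive level.
IMBALANCE_LEVELS = [(2.0, "small"), (5.0, "moderate")]


def order_amplitude(signal, sample_rate_hz, shaft_speed_hz, order=1):
    """Peak amplitude of one shaft order, from a single-frequency DFT.

    Assumes a constant shaft speed and a record that spans a whole
    number of shaft revolutions, so no leakage correction is applied.
    """
    omega = 2.0 * math.pi * order * shaft_speed_hz / sample_rate_hz
    re = sum(x * math.cos(omega * k) for k, x in enumerate(signal))
    im = sum(x * math.sin(omega * k) for k, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / len(signal)


def imbalance_message(amplitude):
    """Translate the tracked first-order amplitude into descriptive text."""
    for limit, label in IMBALANCE_LEVELS:
        if amplitude < limit:
            return f"Imbalance: {label}"
    return "Imbalance: large"


# Synthetic velocity signal: 25 Hz shaft, 1 s at 1 kHz, dominant first order.
fs, shaft_hz = 1000.0, 25.0
velocity = [
    6.0 * math.sin(2 * math.pi * shaft_hz * k / fs)
    + 1.0 * math.sin(2 * math.pi * 3 * shaft_hz * k / fs)
    for k in range(1000)
]
first_order = order_amplitude(velocity, fs, shaft_hz, order=1)
print(imbalance_message(first_order))
```

The sketch makes the trade-off visible: the operator sees only the final text, while the spectrum that an experienced engineer would ask for is discarded.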

2 Evolution of Condition Monitoring Systems

2.1 Early Days

The first condition monitoring systems were developed for the protection of high-value assets, typically in the power generation or chemical industries. The value of the machinery and the enormous costs of lost production (not to mention the need to rebuild the plant itself) were so immense that they justified very high development costs. As a result, the first condition monitoring systems were very expensive as well. The very first systems used analogue electronics, which was soon replaced by digital circuits.

Since the early days, there has been a distinction between monitoring and diagnostics. Monitoring (also referred to as protection) is a must for industrial machinery and is the primary function of such systems. The system must react very fast and in a fully automated mode. It has to respond within milliseconds to an unexpected sudden event, for instance a broken turbine blade. In such a case, the protected machine must be brought to a stop before consequential damage occurs. The second level is diagnostics, focused on early detection of faults. While protection systems calculate only a few signal features, the diagnostics level involves calculating numerous advanced signal features, e.g. narrowband rolling bearing features. The system tracks trends of these features and is able to detect early signs of technical state deterioration, even when the machine is still perfectly functional.

2.2 Expansion of Stationary Distributed Systems

Two major trends shaped the development of CMS: the rapid development of digital technologies and, at the same time, an equally rapid decrease in IT prices. As many signal analysis methods were developed, standards (primarily ISO 10816 and ISO 7919) were needed to maintain the compatibility necessary to compare vibration levels between machines and systems. Protection systems began to spread to more and more assets. The distinction between the two layers became standard for critical machinery, e.g. in power generation and in oil and gas. It was adopted by a standard (API 670), which explicitly requires that the two layers be separated into different computer systems. Moreover, failure of the diagnostics layer must not compromise the operation of the protection layer. Such safety was achieved at the cost of a more expensive CMS. In numerous other, less critical applications, where potential losses are smaller and fault development is slower, the approach to CMS reliability is not as demanding, and it is common to combine the two functions in a single CMS. Dozens of manufacturers started to develop and offer much simpler (and less expensive) systems. These were installed in many other industries, from auxiliary machinery in critical plants to transportation, food, and marine, to name only a few.

2.3 Industrial Internet-of-Things

The next big change was driven by a further explosion of IT capabilities at continuously falling costs, combined with the advent of enhanced communication (including wireless). More and more machinery could be equipped with a CMS. Decreasing prices could justify smaller and smaller (though still substantial) benefits. Other trends included cloud-based systems, where the data from hundreds of CMS were sent to, stored on, and analyzed by remote servers. The web browser became the default tool for accessing the data. Another consequence was a decreasing level of user skills, as vibration-based features were presented to ordinary machine operators without any background in vibration analysis.

3 CMS Interaction with Human

3.1 Selection

The importance of selecting a suitable CMS is typically underestimated for a few reasons. Firstly, not many people are familiar with the various types of such systems. Someone who works exclusively with portable equipment will look for better portable equipment and disregard stationary systems, and vice versa. Secondly, CMS are selected by management staff on the basis of business plans, which generally boils down to choosing the cheapest system. The underlying assumption is that any CMS is equally good for the job. Thirdly, in many plants, the equipment is partially or totally inherited, which limits potential changes, because in nearly all cases systems from different manufacturers are not compatible. As a result, in many applications, systems are permanently unsuitable for any significant improvement from the very start. A very common example of unsuitable selection of CMS elements is a set of acceleration sensors with 100 mV/g sensitivity on a high-volume machine with a relatively large transmission ratio, for which the vibration level between the front and back ends easily differs by more than an order of magnitude. Installations where suitable sensors with higher sensitivity are mounted at locations with smaller vibrations are rarely encountered in practice. Therefore, it might be concluded that the fundamental rules for selecting a suitable CMS for an individual scenario should be followed before considering the system’s “smart” features. In other words, it is NOT recommended to select a system that suits one’s needs from among the available smart systems, but rather to look for a suitable system without assigning any initial value to the “smart” class of system.

3.2 Configuration

Among the various actions involved in machine condition monitoring, configuration of the system is a major taboo: it is skipped, undervalued, and disliked. This popular attitude, which underestimates the importance of CMS configuration, is like a minefield, because configuration decides what data is processed, when it is processed, and how it is processed. Moreover, the configuration process itself is long and costly, yet it does not bring any direct benefit to the operator (or the plant), so it is treated as a necessary evil. As a result, configuration is omitted from commercial system presentations and is shifted to support activities. During the first training, it is frequently found to be much more troublesome than system operation. For beginners, the fewer configuration options the better. For more advanced users, it is just the opposite. As a result, it is very difficult to provide a configuration interface that would satisfy a large number of users.

Configuration is typically divided into a few phases. The first phase is system pre-configuration, which is done by the manufacturer and is deliberately broad to leave room for further adjustment. For stationary systems and advanced portable systems, the initial configuration also includes the definition of the drive train kinetostatics (frequently called “kinematics”) and of narrowband analyses. Each narrowband analysis includes a configuration subset, which covers spectrum type, spectral range, optional filters, amplitude type (peak, root mean square, power, sum), etc. In the third phase, data is additionally classified into operational states, so that vibrations are compared only within similar machine dynamic states.
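The configuration subset of a single narrowband analysis can be pictured as a small record. The sketch below is only an illustration: the class name, the field names, and the helper `gear_mesh_analysis` are hypothetical, and the band half-width is an arbitrary choice. The helper relies only on the well-known fact that the gear mesh order equals the tooth count when orders are referenced to the shaft carrying the gear.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

AMPLITUDE_TYPES = ("peak", "rms", "power", "sum")


@dataclass(frozen=True)
class NarrowbandAnalysis:
    """One narrowband analysis and its configuration subset."""
    name: str
    spectrum_type: str                  # e.g. "order" or "envelope" (labels assumed)
    band: Tuple[float, float]           # spectral range (low, high)
    amplitude_type: str = "rms"
    band_pass_hz: Optional[Tuple[float, float]] = None  # optional pre-filter

    def __post_init__(self):
        # Reject entries that a configuration editor should never accept.
        if self.amplitude_type not in AMPLITUDE_TYPES:
            raise ValueError(f"unknown amplitude type: {self.amplitude_type}")
        if not 0 <= self.band[0] < self.band[1]:
            raise ValueError("band must satisfy 0 <= low < high")


def gear_mesh_analysis(name, teeth, half_width=0.5):
    """Band around the gear mesh order of a gear with `teeth` teeth.

    Orders are referenced to the shaft that carries the gear, so the
    mesh order equals the tooth count.
    """
    return NarrowbandAnalysis(
        name=name,
        spectrum_type="order",
        band=(teeth - half_width, teeth + half_width),
        amplitude_type="rms",
    )


gmf = gear_mesh_analysis("GMF stage 1", teeth=23)
print(gmf.band)
```

Even this tiny record has five fields per analysis, which hints at why a drive train with several shafts, gears, and bearings quickly leads to a long and error-prone configuration.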

Successful replacement of human actions within the CMS configuration process is definitely attractive. But what exactly would it mean when each configuration element is considered individually? The sampling frequency is generally fixed, and so is the signal length. The location of each sensor is taken either from standards or from human experience. Next, almost all commercially available systems automatically calculate narrowband analyses on the basis of a MANUALLY prepared kinetostatic configuration. Is it possible to automate any of these parts further? So far, “smart” configuration features appear to be limited to simple actions, like automatic determination of shaft-related analyses on the basis of the phase marker (PM) signal, or automatic triggering of data storage. An interesting solution for automatic threshold configuration for scalar trend analyses can be found in the modern AVM4000 system [5], which is based on percentile limits of cumulative distribution functions. It might therefore be concluded that smart systems should prepare large configurations automatically, but this approach might not be correct at all. Alternatively, a large static configuration could be skipped altogether, as long as the system is not expected to provide fault identification, i.e. it provides just fault detection and possibly fault severity assessment.
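The general idea of deriving thresholds from percentile limits of an empirical cumulative distribution can be sketched as follows. This is not the AVM4000 implementation, whose details are not given here; the 95th and 99th percentiles, the function names, and the synthetic baseline are assumptions made purely for illustration.

```python
import math


def percentile(values, q):
    """Percentile (q in 0..100) with linear interpolation between samples."""
    data = sorted(values)
    if not data:
        raise ValueError("no baseline data")
    position = (len(data) - 1) * q / 100.0
    low = int(math.floor(position))
    high = min(low + 1, len(data) - 1)
    return data[low] + (data[high] - data[low]) * (position - low)


def percentile_thresholds(baseline, warning_q=95.0, alarm_q=99.0):
    """Warning and Alarm levels taken from the baseline trend distribution.

    The baseline is assumed to come from a machine in acceptable condition;
    the two percentiles are illustrative defaults, not recommended values.
    """
    return {
        "warning": percentile(baseline, warning_q),
        "alarm": percentile(baseline, alarm_q),
    }


# Synthetic baseline trend of one scalar indicator: 0.00 to 1.00 in 0.01 steps.
baseline = [i / 100 for i in range(101)]
limits = percentile_thresholds(baseline)
print(limits)
```

Such a scheme removes manual threshold entry, but it silently assumes that the baseline period contains no developing fault, which is itself a judgment that someone has to make.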

3.3 Operation

Operation of a CMS refers to the direct interaction of a machine operator with the system, and it is composed of different elements depending on the system architecture. For unsupervised protection systems, the desired interaction is none. For simple portable systems, data acquisition is triggered, followed by internal signal processing, and the displayed data is analyzed by the operator on site. In stationary distributed systems, data is transferred to a central unit, to which a diagnostic engineer is connected. Ideally, such systems operate on events, which signal to the engineers, on the basis of the current data, that a machine needs attention. From the operation point of view, a smart system could refer to two aspects, namely data transfer and data analysis. Firstly, any of the mentioned systems could be connected to a network that enables automatic data transfer. This is especially attractive for portable systems, where such a feature saves significant time.

Secondly, in case of portable and stationary monitoring systems, smart operation refers to automatic data analysis. This data analysis answers three fundamental questions:

  1. Is there (a new) machine fault?

  2. What is the fault element?

  3. How serious is the fault?

The first question refers to fault detection, the second to fault identification, and the third to severity assessment. In a smart vibration-based condition monitoring system, the first is addressed by unsupervised anomaly detection. In this scenario, a classically “permissible” machine technical state is classified as a “normal” state, while any significant deviation from this state is called an “anomaly” or “abnormal” state. Although no commonly accepted classification of vibration-based data science methods exists so far, this paper assumes that “machine learning” covers all unsupervised methods that operate on predefined scalar diagnostic estimators (also called “health indicators”, HI, or signal “features”), while “deep learning” refers to all unsupervised methods operating on raw vibration data.
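To make the notion of unsupervised anomaly detection on health indicators concrete, the sketch below learns a “normal” state from baseline indicator vectors and flags any significant deviation from it. A per-indicator z-score is used here only as a deliberately simple stand-in for the more elaborate methods found in commercial systems; the class name, the threshold of 4, and the sample data are hypothetical.

```python
import statistics


class ZScoreDetector:
    """Flags indicator vectors that deviate strongly from a learned baseline."""

    def __init__(self, threshold=4.0):
        self.threshold = threshold  # assumed limit on the largest z-score
        self.means = []
        self.stds = []

    def fit(self, baseline_rows):
        """Learn the normal state from rows of health indicators."""
        columns = list(zip(*baseline_rows))
        self.means = [statistics.mean(c) for c in columns]
        self.stds = [statistics.stdev(c) for c in columns]
        if any(s == 0 for s in self.stds):
            raise ValueError("an indicator is constant in the baseline")
        return self

    def score(self, row):
        """Largest absolute z-score over all indicators."""
        return max(
            abs(x - m) / s for x, m, s in zip(row, self.means, self.stds)
        )

    def is_anomaly(self, row):
        return self.score(row) > self.threshold


# Each row: (velocity RMS, crest factor) from a period assumed to be healthy.
baseline = [[1.0, 3.0], [1.2, 3.1], [0.8, 2.9], [1.1, 3.0], [0.9, 3.0]]
detector = ZScoreDetector().fit(baseline)
print(detector.is_anomaly([1.05, 3.02]), detector.is_anomaly([2.5, 3.0]))
```

Note that the output is a score tied to the data rather than to a machine element: the detector answers the first question (is there a fault?) but says nothing about the second and third.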

Unsupervised analysis based on scalar diagnostic indicators is a bit tricky. Before any such analysis is done, it needs to be stated that three types of indicators exist. The first group consists of wideband indicators, which cover the “entire” signal in some domain. These indicators include peak-to-peak (PP), root mean square (RMS), crest factor, and kurtosis, from both acceleration and velocity signals. The second group consists of narrowband indicators, typically calculated in the frequency (or order) domain. The third group consists of indicators whose permissible values are given in standards (like velocity RMS in ISO 20816). Starting from the last group, verification of permissible vibrations is straightforward; therefore, smart analysis seems pointless. For narrowband indicators, the set is limited, and so is the anomaly detection capability. For wideband indicators, the number of analyses is relatively small, so it is easy to handle them in a classical way.
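The four wideband indicators named above follow directly from their textbook definitions, as the short sketch below shows. The function name is hypothetical, kurtosis is computed in its non-excess form (3 for a Gaussian signal), and the test signal is a synthetic sine wave.

```python
import math


def wideband_indicators(signal):
    """Peak-to-peak, RMS, crest factor, and kurtosis of a time signal."""
    n = len(signal)
    if n == 0:
        raise ValueError("empty signal")
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    variance = sum((x - mean) ** 2 for x in signal) / n
    if rms == 0 or variance == 0:
        raise ValueError("signal is constant")
    fourth_moment = sum((x - mean) ** 4 for x in signal) / n
    return {
        "pp": max(signal) - min(signal),
        "rms": rms,
        "crest_factor": max(abs(x) for x in signal) / rms,
        "kurtosis": fourth_moment / variance ** 2,  # non-excess definition
    }


# Sine wave of amplitude 2 covering exactly 10 periods in 1000 samples.
sine = [2.0 * math.sin(2 * math.pi * 10 * k / 1000) for k in range(1000)]
indicators = wideband_indicators(sine)
print(indicators)
```

For a pure sine the crest factor is the square root of 2 and the kurtosis is 1.5; impulsive faults raise both values, which is why these indicators are tracked despite their simplicity.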

Recalling the configuration process described in the previous section, note that in a classical CMS the system stores permissible Warning and Alarm levels for every diagnostic indicator, and generates an event when a level is exceeded. These considerations lead to the following deduction: if one is able to define diagnostic indicators and the corresponding threshold levels correctly (the classical way), the system should react properly to a change in the technical condition of the machine; if one is not able to do so, then why believe that more advanced, smart, unsupervised machine learning methods would work at all?
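The classical mechanism referred to here, stored Warning and Alarm levels that generate an event when exceeded, fits in a few lines. The sketch assumes that an event is raised only when the state becomes more severe than at the previous sample; the names, the threshold values, and the trend data are hypothetical.

```python
SEVERITY = {"OK": 0, "WARNING": 1, "ALARM": 2}


def classify(value, warning, alarm):
    """State of one indicator sample against its two stored levels."""
    if value >= alarm:
        return "ALARM"
    if value >= warning:
        return "WARNING"
    return "OK"


def detect_events(trend, warning, alarm):
    """Return (index, state, value) whenever the state gets more severe."""
    events = []
    previous = "OK"
    for index, value in enumerate(trend):
        state = classify(value, warning, alarm)
        if SEVERITY[state] > SEVERITY[previous]:
            events.append((index, state, value))
        previous = state
    return events


# Hypothetical trend of one indicator with illustrative threshold levels.
trend = [1.0, 1.2, 2.6, 2.8, 4.7, 2.0, 2.7]
events = detect_events(trend, warning=2.5, alarm=4.5)
print(events)
```

Everything the system does here is fully determined by the two stored levels, which is exactly why a wrong configuration, rather than the algorithm, is the usual cause of missed or false events in classical systems.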

3.4 Maintenance Planning

Every CMS has the very same ultimate purposes, i.e. to protect life and to reduce production costs by providing information about (the degradation of) the technical condition of the machine. For machine protection systems, this information is sent directly to a SCADA system in the form of an electrical control signal. For the remaining vibration-based systems, this information can be described by its form (high-resolution graph, embedded bar graph, displayed value, SMS, e-mail, sound, light, etc.), its reliability (formalized as the “false alert rate”), and its content (numerical value, shape of the graph, text description, pictogram, color change, etc.). For classical CMS, these parameters are well established and well understood, and it might hence be concluded that any improper performance of such a system is caused by improper (faulty or incomplete) system configuration or by data corruption. For instance, threshold levels that are set too high would fail to detect a fault. For smart systems, each of the described parameters is problematic in some way. The results of many smart methods take the form of a numerical “rate”, which relates to the data but not to machine elements. The reliability of such methods is hard to determine, because they typically do not operate on predefined scalar threshold levels and therefore require a subsequent interpreter that generates clear information. Without such an interpreter, it could easily happen that a simple set of information generated by a classical architecture would be transformed by a smart system into elaborate, equivocal data.

4 Recommendations for Selection of Suitable System

If the reader has arrived this far in the chapter, the natural reaction would be to ask: WHAT, then, is the optimal CMS? It is a proper question, but the answer is quite complex. The selection process follows from two prior questions. The first concerns the monitored machine and its failure modes: do we need a protection layer or only diagnostics? Are the simplest signal features, like RMS, sufficient, or do we need a complex set of dozens of features? The second concerns the level of expertise of the system users. The “smartness” of a CMS should first focus on the efficiency of commissioning, i.e. installation and configuration. Then, the system should provide timely and sufficient information to its users. As the popular saying goes, it should be as simple as possible, but not simpler.

5 Summary

The paper starts with the concept of a perfect “smart” vibration-based condition monitoring system. Up to now (to the authors’ knowledge), a system that fulfills all customer needs does not exist. Moreover, no known theory would justify the claim that a fully automated CMS can be designed. Yet, as shown in the paper, CMS providers are racing towards “game changing” systems, claiming “smartness” wherever possible. At the same time, according to [6], regardless of the CMS type, only 5% of collected data is actually analyzed in an industrial environment, because the rest of the data is insignificant or corrupted. More details on the handling of corrupted data can be found in [7]. Therefore, the final conclusion of the paper is that although “smart” condition monitoring offers many attractive benefits, it is much more vulnerable than classical systems when operated by new, inexperienced equipment specialists.