False Alarms Management by Data Science

Peco Chacón, Ana María; García Márquez, Fausto Pedro

doi:10.1007/978-3-319-95651-0_15

Abstract

Owing to the development of control system technology in recent years, the number of sensors has increased dramatically and alarms have become easier to configure in control systems. This leads to a large number of alarms and an increased operator workload. Industrial plants currently underperform because of alarm floods, which can cause incidents ranging from minor to catastrophic. Businesses are turning to data science to avoid this, which requires the use of both process data and alarm data. Industrial plants must understand the entire process, and they also rely on the experience of the operator. Collaborative research between academia and industry should be undertaken to prevent alarm floods in both normal and transient conditions, and new guidelines, standards and scientific research should be developed. New statistical, analytical and mathematical tools are now being implemented for alarm detection, and the role of the operator must also be taken into account for correct alarm flood resolution. This will lead to safer and more cost-effective industrial systems.

1 Introduction

The industrial standard ISA-18.2 (2009) [1] states that “an alarm system is the collection of hardware and software that detects an alarm state, this communicates the indication of that state to operators, and it records changes in the alarm state”. Most businesses have a modern computerized monitoring system that uses alarms to control safety and efficiency.

Alarms can also be classified as follows:

A false alarm is an alarm that is reported when there is no fault.
A nuisance alarm is a true but redundant alarm, i.e. the operator receives more than one alert about the same condition.
A missed alarm is the opposite of a false alarm: there is a fault in the system but no alarm has been activated.
A chattering alarm makes many transitions between the normal and abnormal states because the signal repeatedly crosses the alarm threshold.

Chattering alarms and alarm floods can be confused, but they are not the same. An alarm flood occurs when the operator receives many alarms in a short period of time. It can be caused by correlation between variables, so that several alerts are triggered at the same time. A chattering alarm is a single alarm that repeatedly crosses the alarm threshold and so generates many alerts. According to the ISA-18.2 (2009) standard [1], chattering alarms are those that repeat more than 3 times in a minute.
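The repetition criterion above can be checked directly on an alarm log. The following is a minimal sketch, assuming the log is a list of (timestamp in seconds, tag) activations and that "more than 3 times in a minute" is evaluated on a sliding 60 s window. The function name, the tag names and the sliding-window interpretation are assumptions made for illustration, not part of the standard.

```python
from collections import defaultdict

def chattering_tags(events, max_repeats=3, window_s=60.0):
    """Return the tags that raise more than `max_repeats` alarms within any
    sliding window shorter than `window_s` seconds.

    `events` is an iterable of (timestamp_in_seconds, tag) activations.
    """
    by_tag = defaultdict(list)
    for t, tag in events:
        by_tag[tag].append(t)

    flagged = set()
    for tag, times in by_tag.items():
        times.sort()
        start = 0
        for end, t in enumerate(times):
            # Drop activations that are a full window (or more) older than t.
            while t - times[start] >= window_s:
                start += 1
            if end - start + 1 > max_repeats:
                flagged.add(tag)
                break
    return flagged

# Hypothetical log: TI-101 repeats 4 times in 40 s, PI-200 only sporadically.
log = [(0, "TI-101"), (10, "TI-101"), (25, "TI-101"), (40, "TI-101"),
       (5, "PI-200"), (300, "PI-200"), (900, "PI-200")]
print(chattering_tags(log))
```

Only the tag that repeats within the window is reported; a flood, by contrast, would show up as many different tags in the same window.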

Detection delay is also a key concept. It occurs when an alarm is not activated at the instant the failure occurs. Part of this delay can be introduced by the alarm system itself (deadband, delay-timer, etc.).

All industrial systems contain sensors and actuators for detecting and controlling possible faults. These components can create false alarms, which make the control system inefficient and reduce performance [2, 3].

Fault detection is an important research area from both the academic and the industrial point of view, and numerous detection and control methods have been designed and developed. Some systems can prioritise alarms according to their severity. When an alarm is triggered, the operator must acknowledge it, understand it and identify its cause in order to assess its significance and act to return the operation to its normal state. According to the Engineering Equipment and Materials Users Association (EEMUA, 2007) [4], an operator needs about 10 min to respond adequately to an alarm, i.e. the operator should not receive more than 6 alarms per hour for correct operation of the system.

Operators currently receive a large number of alarms, sometimes more than they need or can handle (alarm flooding). This can distract the operator to the point that critical alarms are ignored, and as a result some operators are reluctant to rely on the monitoring and control system. If the system causes many false positive alarms, it must be redesigned to reduce the number of false and nuisance alarms. According to references [2, 3], most of the alarms received by industrial plant operators are false. There are several methods to improve the alarm system, such as multivariate data analysis or the use of filters; the most important are discussed in this chapter.

A few decades ago, only a few selected variables could be monitored. They had to be important for the proper performance of the system and for the quality of the process, because alarms were difficult to implement. Each alarm had to be wired from the sensor to the control room, at a high cost. In addition, the control room had limited space and had to house numerous control devices. For these reasons, alarms had to be well designed and reliable, so that they guided the operators and were a good indicator of the correct functioning of the production system.

Nowadays, owing to the development of hardware and software, a large number of alarms can be implemented at low cost. Many process variables can be measured and stored in databases. The alarm system communicates with the operator through the Human Machine Interface (HMI) or an annunciator panel. Many variables are continuously available on the operator panel for monitoring and control. This leads to many alarms, some of them false, chattering or nuisance alarms. Hollifield et al. claim that chattering alarms are the most common type of alarm, accounting for 70% of all the alarms they found [5].

Table 1 summarizes the number of alarms produced in 39 industrial plants. The alarms are classified according to industrial sector and time of occurrence.

Table 1 Performance metrics of industrial alarm system, study carried out in 39 industrial plants [6]

According to Walker et al. [7], U.S. businesses lose $13 billion a year due to improper use of alarms. The worldwide costs generated by false alarms are difficult to quantify, but they are estimated at billions of dollars every year. Unnecessary stoppages cause a significant loss of production. For this reason, alarm systems should prevent damage to equipment, downtime and reduced production. Furthermore, process systems should be controlled to improve the efficiency, availability, quality and reliability of the production process [8,9,10].

There is a great deal of research on alarm management, but the behaviour of the operator in the event of an anomaly is rarely studied. Hu et al. analysed the actions of operators in response to univariate alarms [11]. The alarm system must clearly and precisely indicate to operators which processes require further attention. They conclude that operators must adjust the control devices to resolve process anomalies: they can suppress an alarm to ignore it temporarily, in the case of nuisance alarms, or they can change the state of the device if an alarm has occurred.

Alarm systems should be designed to help operators regulate processes and manage anomalies. Several guides, such as those of ISA and EEMUA, have been written for the design, implementation and maintenance of alarm systems [1, 4]. Alarms must be used to ensure the safety of systems and processes. The actions of the operators depend on the severity of the faults or anomalies, which must be announced by alarms. The alerts can be visual or audible.

Monitoring systems are essential to ensure the reliability of the operation of industrial systems [12, 13]. Hybrid methods are expected to provide more robust models for modern and complex installations. Research aims to reduce the large production losses and high repair costs caused by inadequate alarm systems [14,15,16].

Many problems are involved in alarm systems. Izadi et al. list the most common causes [17]: improperly designed alarms, miscalibrated equipment, oscillations in general, changes of state during start-up or shutdown that are not taken into account, and noise and/or outliers that are not considered.

The objective of this chapter is to illustrate the methods and techniques used in several sectors to implement an optimal alarm system. The aim is to obtain higher quality, higher performance and lower production costs, to reduce breakdowns and to make processes safer.

2 Confusion Matrix

An optimal alarm system provides the necessary tools for operators to detect faults and take corrective action to return the process to its normal condition. In practice, the alarm control system may be faulty or poorly calibrated, and then it will not give correct results. A missed alarm occurs when the value of the variable deviates, e.g. surpasses the threshold, but the system does not detect it. The opposite case is a false alarm, also called a false positive, in which the system generates an alarm although no fault has actually occurred.

Signals can give these two types of errors because of threshold selection. If the threshold is set very tight to reduce the probability of a missed alarm, the system becomes more sensitive to random noise and transient deviations, which leads to more false alarms. On the other hand, if the threshold is increased, the number of false alarms decreases at the cost of more missed alarms. The selection of the threshold is therefore decisive for system reliability. In the majority of cases, missed alarms are considered more important than false alarms, because their consequences may be greater. A basic tool to visualize false alarms in contrast to missed alarms is the confusion matrix.
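The trade-off described above can be illustrated numerically. The sketch below assumes, purely for illustration, that the process variable is Gaussian in both the fault-free and the faulty state and that a high alarm trips when the value exceeds the threshold. The means, standard deviations and threshold values are invented, not taken from any cited study.

```python
from statistics import NormalDist

# Assumed distributions of the process variable (illustrative values only).
normal_state = NormalDist(mu=50.0, sigma=2.0)    # fault-free operation
abnormal_state = NormalDist(mu=56.0, sigma=2.0)  # faulty operation

def far_mar(threshold):
    """False and missed alarm rates of a high alarm that trips when x > threshold."""
    far = 1.0 - normal_state.cdf(threshold)  # alarm raised without a fault
    mar = abnormal_state.cdf(threshold)      # fault present but no alarm
    return far, mar

for threshold in (51.0, 53.0, 55.0):
    far, mar = far_mar(threshold)
    print(f"threshold={threshold:.1f}  FAR={far:.3f}  MAR={mar:.3f}")
```

Raising the threshold lowers the false alarm rate and raises the missed alarm rate, so the choice depends on which error is more costly for the process at hand.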

A confusion matrix, also called a contingency table, is an evaluation tool for categorical statistical data [18]. The table shows whether the value supplied by the alarm system matches the actual value. The rows of the matrix are the responses of the alarm system and the columns are the actual values. There are 4 possible cases. If a fault has occurred and the system raises an alarm, it is a true positive (TP). If no fault has occurred but the system raises an alarm, it is a false positive (FP), or false alarm. If a fault has occurred and the system does not identify it, it is a false negative (FN), or missed alarm. Finally, if no fault occurs and the system raises no alarm, it is a true negative (TN). The values on the main diagonal count the cases in which the system has acted correctly, and the values on the other diagonal count the errors (Table 2).

Table 2 Confusion matrix

The following rates are obtained from the confusion matrix.

$$ FP\,rate = \frac{FP}{N} = \frac{negatives\,incorrectly\,classified}{Total\,negatives} $$

$$ Precision = \frac{TP}{TP + FP} $$

$$ Accuracy = \frac{TP + TN}{all\,cases} $$

$$ Sensitivity = \frac{TP}{TP + FN} $$

$$ Specificity = \frac{TN}{FP + TN} $$
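As a worked illustration of the rates above, the following sketch counts TP, FP, FN and TN from two aligned sequences and evaluates each formula. The function name and the ten-sample data set are hypothetical, and the sketch assumes that both classes occur at least once so that no denominator is zero.

```python
def alarm_metrics(actual, alarmed):
    """Confusion-matrix counts and rates from two equal-length 0/1 sequences.

    actual[i]  -- 1 if a fault was really present in sample i
    alarmed[i] -- 1 if the alarm system raised an alarm in sample i
    """
    pairs = list(zip(actual, alarmed))
    tp = sum(1 for a, s in pairs if a and s)
    fp = sum(1 for a, s in pairs if not a and s)
    fn = sum(1 for a, s in pairs if a and not s)
    tn = sum(1 for a, s in pairs if not a and not s)
    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "fp_rate": fp / (fp + tn),          # FP / total negatives
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
    }

# Hypothetical data: 3 faulty samples, 7 fault-free samples.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
alarmed = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
for name, value in alarm_metrics(actual, alarmed).items():
    print(name, value)
```

In this example one fault is missed (FN = 1) and one alarm is false (FP = 1), which gives a sensitivity of 2/3 and a specificity of 6/7.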

3 Process and Alarm Data

The methods for alarm detection are used to improve the efficiency of the global process. However, a large amount of data must be provided to use these techniques.

There are two types of data that are fundamental to the management of the alarm system:

Process data: measurements of process variables taken at regular intervals. They are stored in a database and provide information for identifying the optimal alarm system.
Alarm data: messages generated by the distributed control system (DCS) and stored in an alarm log.

It is important to analyse these data, because they help to identify the causes of the current alarm system overload. It is also important to test academically developed methods and techniques against industrial data from a real environment. For instance, Wang et al. explored the main factors behind this problem and concluded that [19]: chattering alarms frequently occur due to noise or disturbances, alarm variables are incorrectly configured, alarm design is isolated from related variables, and abnormalities propagate owing to physical connections.

System performance and the alarm management lifecycle should be evaluated such as the runtime concept. Kondaveeti et al. [20] offer a tutorial on alarm chatter. Chattering alarms are difficult to identify because of poor design or incorrect configuration of the alarm method. A chatter index is proposed to reduce the effort needed to identify and quantify chattering alarms. In reference [21], a quantitative measure is proposed to estimate the degree of chattering. The method for evaluating the chatter index is based on alarm parameters and statistical properties of the process variable. Process data are divided into segments according to their approximate distribution characteristics, and each distribution is estimated separately. The distribution for the process data is obtained by adding all the run length distributions together. A mathematical function developed by analytical methods is intended to reduce chattering alarms.
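A run-length-based index of this kind can be sketched as follows. The sketch assumes that a run length is the time, in whole seconds, between two consecutive activations of the same alarm, and that the index weights the normalized run length distribution by 1/r so that short runs dominate. This weighting and the function name are assumptions of the sketch and should be checked against [20, 21] before use.

```python
from collections import Counter

def chatter_index(activation_times):
    """Run-length-based chatter index for one alarm tag.

    `activation_times` are activation instants in seconds. The index is
    sum_r P(r) / r, where P(r) is the fraction of runs of length r seconds.
    It approaches 1 when the alarm re-activates every second and 0 when
    activations are far apart.
    """
    times = sorted(activation_times)
    # Run lengths are rounded to whole seconds, with a minimum of 1 s.
    run_lengths = [max(1, round(b - a)) for a, b in zip(times, times[1:])]
    if not run_lengths:
        return 0.0
    counts = Counter(run_lengths)
    total = len(run_lengths)
    return sum((n / total) / r for r, n in counts.items())

print(chatter_index([0, 1, 2, 3, 4]))   # re-activates every second
print(chatter_index([0, 600, 1200]))    # re-activates every 10 min
```

A tag can then be ranked by its index, which avoids inspecting every alarm trend by hand.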

Hu et al. proposed a framework that combines causality inference using process data with alarm data, and thus helps the operator to reduce alarm floods [22]. Alarm data can be used to identify root alarm tags, which reduces the number of alarms that require attention. Root cause and effect analysis can be used to detect root cause alarms. The causal relationships can be detected by extracting the process variables associated with the root alarms. Finally, the root cause can be confirmed using the causal map of the process variables and some knowledge of the process. The method reduces the number of alarms, since only root alarm tags are annunciated, and operators can find the root cause quickly because the causal relationship is detected. The method was applied to industrial process and alarm data, and the results showed good performance.

4 Alarm Flood

According to the International Electrotechnical Commission (IEC, 2014) [23], alarm flooding occurs when alarms appear on the control panels at a faster rate than the operator can manage them. This makes it necessary to determine the root cause of the alarms in order to control the system optimally.

An alarm flood is usually triggered by a primary event and its consequential events [24]. Root cause alarms should be distinguished from consequential alarms to reduce the number of alarms. Alarm data make it possible to list the primary alarms and the process variables related to them, and to obtain the causal relationships between alarms. Process data then help to support or refine the root cause analysis.

Historical alarm data allow a new analysis method to be used to eliminate alarm flooding [25]. These data are grouped according to a base of alarm occurrences. Alarm floods have similar patterns; if these patterns are analysed and classified, the method can lead to the root cause of an anomaly. The operator will therefore have fewer false alarms and will be able to react better to alarm floods. Hu et al. applied a fast sequence alignment method to speed up the calculation and improve the computational efficiency of the algorithms [26]. The method is intended to be more sensitive to higher priority alarms, and it tends to ignore alarms that occur simultaneously, to avoid alarm flooding. Set-based comparison reduces the unnecessary calculations caused by irrelevant alarm tags. The results obtained in industrial cases show that the method is faster than existing algorithms, so operators have more time to take the correct action and correct the failure.

An alarm that makes repeated transitions between the normal and the abnormal state is called a chattering alarm. This is mainly due to signal noise and to the variable operating near the alarm limit. Chattering alarms cause many false alarms. It is proposed to redesign the control system so that these alarms are eliminated by grouping. Consecutive alarms that fall within a narrow time window form a cluster and become a single alarm, and only one alarm message is sent to the operator for each cluster when the alarm appears. This is a simple method to reduce alarm flooding.
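The grouping idea can be sketched in a few lines. The example assumes that the activations of one alarm tag are given as timestamps in seconds and that an activation belongs to the current cluster when it follows the previous one by at most a chosen window. The 60 s window and the function name are illustrative assumptions.

```python
def group_alarms(times, window_s=60.0):
    """Collapse the activations of one alarm tag into clusters.

    An activation joins the current cluster when it occurs within `window_s`
    seconds of the previous activation; otherwise it starts a new cluster.
    Returns a list of (first_time, last_time, count) tuples, one per cluster.
    """
    clusters = []
    for t in sorted(times):
        if clusters and t - clusters[-1][1] <= window_s:
            first, _, count = clusters[-1]
            clusters[-1] = (first, t, count + 1)
        else:
            clusters.append((t, t, 1))
    return clusters

activations = [0, 12, 30, 55, 400, 410, 2000]
for first, last, count in group_alarms(activations):
    print(f"one message at t={first}s covering {count} activation(s)")
```

Seven activations are reduced to three operator messages in this example; the window length controls how aggressive the grouping is.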

Rodrigo et al. [27] build on the previous line of work. They claim that by combining the alarm log, process data analysis and connectivity analysis, alarms can be grouped together and their root alarm identified. Figure 1 shows the workflow to reduce alarm floods.

The first step is to remove chattering alarms. According to reference [25], the minimum permissible interval should be 10 min; if the elapsed time is shorter, the second alarm is eliminated.

In the next step, the alarm log is divided into intervals of 10 min. An alarm threshold is set: there must be more than 10 alarms per time interval and per operator. Consecutive intervals with more alarm occurrences than the defined threshold are merged.

Using sequence pattern matching, the alarm flood sequences are grouped together. In this case, the method described in reference [28] is based on a modified Smith-Waterman (MSW) algorithm, although other algorithms can be applied, such as agglomerative hierarchical clustering (AHC).

The fourth step consists of grouping the alarm flood sequences, and a set of templates is created to cancel out the anomalies of all the clusters in the process.

The last step is perhaps the most complicated. It should be noted that the causal alarm is not necessarily the first alarm, because the moment at which an alarm is triggered depends on its alarm limits, and the time elapsed between the anomaly occurring and the alarm being triggered is probabilistic. Algorithms are then applied to determine the root cause of the alarm. Many papers apply different algorithms [29, 30]; the best algorithm depends on the case study.

In summary, an alarm log, historical process data and connectivity analysis are used to reduce alarm flooding by grouping the different alarms and determining the causal alarm.
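The second step of the workflow, identifying flood periods in the alarm log, can be sketched as follows. The sketch assumes that the log is a list of activation timestamps in seconds for one operator position, uses the 10 min interval and the threshold of more than 10 alarms quoted above, and merges adjacent intervals that exceed the threshold. The function name and the synthetic log are illustrative.

```python
from collections import Counter

def flood_periods(alarm_times, bin_s=600, threshold=10):
    """Return (start, end) periods, in seconds, in which the alarm count
    exceeds `threshold` alarms per `bin_s`-second interval.

    Consecutive intervals above the threshold are merged into one period.
    """
    counts = Counter(int(t // bin_s) for t in alarm_times)
    flooded = sorted(b for b, n in counts.items() if n > threshold)

    periods = []
    for b in flooded:
        if periods and b * bin_s == periods[-1][1]:
            # This interval starts where the previous period ends: extend it.
            periods[-1] = (periods[-1][0], (b + 1) * bin_s)
        else:
            periods.append((b * bin_s, (b + 1) * bin_s))
    return periods

# Synthetic log: 12 and 11 alarms in the first two intervals, 3 in the third,
# none in the next two and 15 in the sixth.
alarm_log = ([50 * i for i in range(12)] +
             [600 + 50 * i for i in range(11)] +
             [1200, 1400, 1700] +
             [3000 + 30 * i for i in range(15)])
print(flood_periods(alarm_log))
```

Each returned period delimits one alarm flood sequence, which can then be passed to the pattern-matching step.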

There is no single solution to improve the alarm system; there are different workflows with various processes. For instance, in signal-based methods the process variables are monitored and compared with thresholds (called alarm limits). These are currently the most widely used techniques in industry and are implemented in many modern distributed control systems (DCS).

There are also many classifications of alarm systems. Some of the techniques applied are threshold design, data processing, multivariate process monitoring, model-based process monitoring and state-based priority setting [31,32,33].

Another classification of alarm systems depends on their design, which can be univariate or multivariate. Univariate designs include the alarm threshold, the deadband, the delay-timer and filtering (see Fig. 2). They are designed individually for each variable. In multivariate designs, alarms are obtained from linear combinations of several process variables.
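Of the univariate techniques listed, the deadband is perhaps the simplest to illustrate. The sketch below assumes a high alarm that is raised when the variable exceeds the limit and cleared only when it falls below the limit minus the deadband. The signal values, limit and deadband width are invented for the example.

```python
def alarm_with_deadband(samples, limit, deadband):
    """High alarm with a deadband: raised when x > limit, cleared only when
    x < limit - deadband. Returns the alarm state for every sample."""
    state = False
    states = []
    for x in samples:
        if not state and x > limit:
            state = True
        elif state and x < limit - deadband:
            state = False
        states.append(state)
    return states

def count_activations(states):
    """Number of transitions from the normal state to the alarm state."""
    previous = [False] + states[:-1]
    return sum(1 for prev, cur in zip(previous, states) if cur and not prev)

# A signal hovering around the limit of 100 before dropping away.
signal = [98, 101, 99.5, 100.5, 99, 101, 97, 96]
print(count_activations(alarm_with_deadband(signal, limit=100, deadband=0)))
print(count_activations(alarm_with_deadband(signal, limit=100, deadband=2)))
```

Without a deadband the signal activates the alarm three times; with a deadband of 2 units the same excursion is annunciated once.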

Alarm flooding is difficult to suppress with delay timers or deadbands because of consequential alarms. Lai and Chen present an algorithm (extension of) for optimal alignment of multiple alarm flood sequences to obtain their common pattern [28, 34]. The technique requires similarity scoring functions, a dynamic programming equation, and backtracking and alignment generation. They propose to develop new algorithms that match online alarm messages against a database of patterns to alert operators in case of alarm flooding.

A data-driven method [35], specifically one based on historical alarm data, is also employed to detect frequent patterns in alarm floods. The results showed that the method is effective in finding patterns and reducing pattern redundancies. A holistic view of alarms is also employed for an intuitive understanding of alarm patterns.
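The three ingredients named above (a scoring function, a dynamic programming recursion and a traceback) can be seen in the basic Smith-Waterman local alignment, shown here for two alarm tag sequences. This is the textbook pairwise algorithm, not the modified or multiple-sequence versions of [28, 34]; the scores, the tag names and the absence of time-stamp weighting are simplifying assumptions.

```python
def smith_waterman(seq_a, seq_b, match=2, mismatch=-1, gap=-1):
    """Basic Smith-Waterman local alignment of two alarm tag sequences.

    Returns (best_score, aligned_pairs); a gap is represented by None.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Dynamic programming: each cell holds the best local alignment score
    # ending at seq_a[i-1] and seq_b[j-1].
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Traceback from the best cell until a zero score is reached.
    aligned = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        s = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + s:
            aligned.append((seq_a[i - 1], seq_b[j - 1]))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned.append((seq_a[i - 1], None))
            i -= 1
        else:
            aligned.append((None, seq_b[j - 1]))
            j -= 1
    aligned.reverse()
    return best, aligned

# Two hypothetical flood sequences that share the pattern B, C, D.
flood_1 = ["A", "B", "C", "D", "E"]
flood_2 = ["X", "B", "C", "D", "Y"]
print(smith_waterman(flood_1, flood_2))
```

The shared sub-sequence is recovered as the local alignment, and the score can serve as a similarity measure when flood sequences are clustered.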

The alarm flood sequence alignment (AFSA) methods provide fault inference from the assessment of the similarity of alarm sequences. Guo et al. proposed a new AFSA method, the match-based accelerated alignment (MAA), which analyses the alarm coincidences [36]. It is important because its alignment results reveal to a large extent the real similarity of the alarm floods.

The alarm flood is a problem for the alarm system. There are several methods and techniques to avoid it; the main ones are discussed in the following sections.

5 Long Standing Alarm

Long-standing alarms have several different definitions. For example, ISA-18.2 defines a long-standing alarm as “an alarm that remains in the alarm state for an extended period of time (e.g. 24 h)” [37]. According to EEMUA (2013), an alarm that remains active for a complete operating shift is considered a long-standing alarm [38]. In general, a long-standing alarm, as its name suggests, has a long duration, but authors do not agree on the threshold for this time. In this chapter, three main causes of these alarms are indicated:

With modern computerized monitoring systems, alarms are easily created by entering trigger point values. They are often implemented without special care, which generates many misconfigured alarms.
Operating states such as start-up, average rate, etc., are often not taken into account. They have different demands, and conditions are flagged as alarms when in fact they are not, e.g. when the equipment is switched off.
Process variables vary between operating states, but the alarm trigger points are constant. It would be useful to compare the alarm variables with the measurements of the process variables and thus generate new alarm thresholds.

6 Graphical Methods

Alarm data display tools are employed to detect nuisance alarms [39, 40], e.g. the High Density Alarm Plot (HDAP) and the Alarm Similarity Color Map (ASCM). These graphical tools have proven their usefulness in identifying chattering alarms.

HDAP presents the top alarms over a given time period. A bin size of 10 min is recommended, in line with the acceptable annunciation rate recommended by EEMUA. The tool uses colour for emphasis; for example, red shows unacceptable chatter behaviour [41].

ASCM highlights related and redundant alarms. The tool shows the alarms reordered according to their similarity and time of occurrence. The result depends on the analysis period, the number of top alarms, the linkage type used to build the clusters and the leaf-ordering method. The tool displays the data in a color-coded matrix, which allows groups of related alarms to be identified and provides information on the interactions of the process.

Graphical representations provide valuable feedback to improve the alarm system and thus reduce false alarms. For example, Yang et al. used a pseudo data map for the following reasons [42]: (1) it is robust to false, missed and chattering alarms; (2) it indicates whether the correlation is positive or negative, as well as the similarity; (3) the pseudo data can be used in other statistical analyses to cross-check the results obtained. The method consists of the following phases:

(a) A Gaussian kernel method is applied to the binary alarm data to generate continuous pseudo time series.
(b) A correlation colour map of the pseudo data, or transformed data, is used to show the set of correlated variables.
(c) Statistical methods are applied to find redundant alarm tags, or to group correlated alarms.
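Phases (a) and (b) can be sketched with a small example. The code below smooths binary alarm sequences with a truncated Gaussian kernel and compares the Pearson correlation of the raw and the smoothed data. The kernel width, the truncation at three standard deviations and the synthetic alarm sequences are assumptions of this sketch, not parameters reported in [42].

```python
import math

def pseudo_series(alarm_bits, sigma=2.0):
    """Smooth a binary alarm sequence with a Gaussian kernel (truncated at
    3 sigma) to obtain a continuous pseudo time series."""
    half = int(3 * sigma)
    offsets = range(-half, half + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    n = len(alarm_bits)
    out = []
    for i in range(n):
        total = 0.0
        for k, w in zip(offsets, kernel):
            if 0 <= i + k < n:
                total += w * alarm_bits[i + k]
        out.append(total)
    return out

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Two hypothetical alarms that always occur one sample apart.
n = 40
alarm_a = [1 if i in (10, 30) else 0 for i in range(n)]
alarm_b = [1 if i in (11, 31) else 0 for i in range(n)]

print("raw correlation:   ", round(pearson(alarm_a, alarm_b), 3))
print("pseudo correlation:", round(pearson(pseudo_series(alarm_a),
                                           pseudo_series(alarm_b)), 3))
```

The raw binary sequences never coincide, so their correlation is close to zero, whereas the pseudo time series reveal that the two alarms are strongly related. A colour map would display this coefficient for every pair of tags.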

There are several difficulties in applying this method, such as parameter adjustment: the graph is sensitive to the parameters, i.e. some tuning is required to optimize the display. However, it has been shown that this method is better than the alarm similarity colour map as long as the parameters are set properly.

7 Univariate Alarming Methods

Univariate alarming methods are the methods most commonly used in alarm systems [43]. They are used because the information they show about a single signal is simple and clear, and operators can make decisions easily. However, more complex alarms need other techniques such as multi-setpoint settings, moving windows, neural network methods, etc. [44].

The most important univariate alarming methods are shown in Fig. 2.

7.1 Alarm Filtering

The use of filters is widespread in practice because they can be used for different purposes, for example: eliminating erroneous or undesirable data, reducing noise, extracting data characteristics, modifying the statistical distribution of data, and grouping data according to their frequency. The most popular filters are the moving average, the exponentially weighted moving average (EWMA) and the cumulative sum. Izadi et al. presented filters used to improve the receiver operating characteristic (ROC) curve [45].
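The effect of two of these filters on false alarms can be illustrated with a short simulation. The sketch assumes a fault-free signal made of independent Gaussian noise and a fixed high alarm limit; the window length, the EWMA weight, the limit and the random seed are arbitrary illustrative choices.

```python
import random

def moving_average(x, window):
    """Causal moving average over the last `window` samples."""
    out, acc = [], 0.0
    for i, v in enumerate(x):
        acc += v
        if i >= window:
            acc -= x[i - window]
        out.append(acc / min(i + 1, window))
    return out

def ewma(x, lam):
    """Exponentially weighted moving average: y[k] = lam*x[k] + (1-lam)*y[k-1]."""
    out, y = [], x[0]
    for v in x:
        y = lam * v + (1.0 - lam) * y
        out.append(y)
    return out

def alarm_rate(x, limit):
    """Fraction of samples above the alarm limit."""
    return sum(1 for v in x if v > limit) / len(x)

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(5000)]  # fault-free signal
limit = 1.5

print("raw:           ", alarm_rate(noise, limit))
print("moving average:", alarm_rate(moving_average(noise, 5), limit))
print("EWMA:          ", alarm_rate(ewma(noise, 0.3), limit))
```

Because every sample is fault-free, each printed value is a false alarm rate. Both filters reduce it substantially, at the price of a slower response when a real fault occurs.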

Filtering techniques for alarm systems present some disadvantages, which can be measured by the false alarm rate (FAR), missed alarm rate (MAR) and expected detection delay (EDD). Tan et al. [46] worked with rank order filters to avoid these disadvantages. They obtained two results when the probability density function (PDF) of the raw data is known: the performance curves of these filters can be calculated directly, and the EDD can be estimated, which is impossible for general filters. The experimental results showed that the order of the filter offers one degree of freedom for the system design, and the window size offers another. These results are limited to univariate alarms; therefore, it is recommended to work with multivariate systems.

The accuracy is given by the false alarm rate, and the efficiency is related to the detection delay and the complexity of the methodology used [47]. Cheng et al. used a method to create an optimal filter design with the aim of improving performance [48]. In this case, the optimum performance curve indicates that the moving average filter is better than the linear filters. As future work, the authors propose to study the performance of the generalized medium filter to obtain a robust optimal filter design method. Izadi et al. consider filtering, alarm delay and deadband to be simple techniques that can reduce nuisance alarms and FAR [45].

7.2 Alarm Delay-Timer

Filters apply a transformation to a continuous function, whereas alarm delay timers transform discrete functions. Timers are used for their simplicity and efficiency. They can reduce the FAR and MAR, but their disadvantage is a delayed response.
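A minimal sketch of an on-delay timer is given below, assuming the simplest variant: the alarm is raised only after n consecutive samples exceed the limit and is cleared immediately when one sample returns below it. The signal and the parameter values are invented for the example.

```python
def on_delay_alarm(samples, limit, n):
    """Raise the alarm only after `n` consecutive samples exceed `limit`;
    clear it as soon as one sample returns below the limit (no off-delay)."""
    run = 0
    states = []
    for x in samples:
        run = run + 1 if x > limit else 0
        states.append(run >= n)
    return states

# Two short noise spikes followed by a sustained excursion from index 6.
signal = [0, 2, 0, 2, 2, 0, 2, 2, 2, 2, 0]
print(on_delay_alarm(signal, limit=1, n=1))  # plain threshold alarm
print(on_delay_alarm(signal, limit=1, n=3))  # 3-sample on-delay timer
```

With n = 3 the two spikes are ignored and only the sustained excursion is annunciated, but the alarm appears at index 8 instead of index 6, which is the delayed response mentioned above.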

The main elements of univariate alarm design are the set point, the dynamic order and the alarm algorithm. Su et al. proposed an alarm method with multiple setpoint delay timers [43]. It achieves a balance between accuracy and sensitivity of the alarm system by providing direct transitions from each delay timer sub-state to the alarm state. FAR, MAR and the averaged alarm delay (AAD) are reduced by this methodology. Xu et al. studied the efficiency of a univariate system using FAR, MAR and AAD, with emphasis on the calculation of these rates [49]. The proposed method was applied to an industrial case, and it was concluded that it can be used for power and petrochemical plants. Zang et al. employed an improved delay timer method, in which the univariate alarm was configured with multiple commands and set points [50]. Compared with conventional alarm timers, these timers had an additional alarm annunciation set point and alarm clearing set point. Enhanced alarm timers have more design parameters, but show better performance according to the Markov chain analysis. Markov chains are simple mathematical models generally employed for random phenomena. They apply to systems with a particular form of dependence: the state of the system at observation n + 1 depends only on its state at observation n, i.e. changes in the system depend on the current state and not on the way it was reached. Adnan et al. showed that delay timers provide flexibility in the design of alarms [51]. The use of delay timers is common practice in industry as it is a simple technique to reduce FAR, MAR and EDD.
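The Markov chain view of a delay timer can be made concrete with a simplified case. The sketch below assumes an n-sample on-delay timer without off-delay and independent fault-free samples, each exceeding the alarm limit with probability p1. The chain has states 0 to n-1 (number of consecutive exceedances counted so far) and an alarm state n. This simplified timer and the function name are assumptions of the sketch; the timers in the cited works are more general.

```python
def on_delay_far(p1, n, iterations=200):
    """False alarm rate of an n-sample on-delay timer (no off-delay) under
    independent fault-free samples, with p1 = P(sample exceeds the limit).

    The state distribution of the Markov chain is propagated step by step;
    the probability of the alarm state after convergence is the FAR.
    """
    p2 = 1.0 - p1
    dist = [1.0] + [0.0] * n   # start with no exceedances counted
    for _ in range(iterations):
        nxt = [0.0] * (n + 1)
        for state, prob in enumerate(dist):
            nxt[0] += prob * p2                   # sample below the limit: reset
            nxt[min(state + 1, n)] += prob * p1   # sample above: count up or stay in alarm
        dist = nxt
    return dist[n]

for n in (1, 2, 3):
    print(n, on_delay_far(0.1, n))
```

For this simplified timer the alarm state is occupied only when the last n samples all exceed the limit, so the result equals p1 to the power n: each additional sample of delay multiplies the false alarm rate by p1.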

Noise is one of the causes of chattering alarms. If a signal is well defined by its period and amplitude but contains noise large enough to cross the trigger point many times, a chattering alarm occurs. Wang and Chen proposed an online method to detect and reduce chattering alarms due to oscillation [52]. The presence of oscillation can be determined through a revised chattering index and a method based on the discrete cosine transform. An alarm setting or a delay timer is then used to reduce alarms. Wang and Chen [53] proposed one rule for detecting chattering alarms caused by random noise, and another for repetitive alarms, based on the duration and interval of alarms and on regular patterns. The online method and an m-sample delay timer are used to eliminate chattering and repeating alarms. The effectiveness of the method was tested on 3 industrial examples in terms of FAR, MAR and AAD (Fig. 3).

8 Multivariate Alarming Methods

Some methods set the alarm limits by studying the correlations between the process data and the alarm data [54]. The multivariate statistical process control (MSPC) is a methodology that is applied for monitoring in many manufacturing processes [55]. It basically consists of three steps:

(1) While the process is under normal operating conditions, historical data are collected and stored in the database, and a statistical model is developed.
(2) The control limits are fixed for the statistical model.
(3) If the online data exceed the control limits, a process failure is flagged.

Historical process data are subjected to multivariate statistical techniques to determine the control limits of the statistics of the study variables. If the actual values exceed the control limit, the point is qualified as “out of control”. This constitutes fault detection; the next step is to identify the root cause of the process fault [56].
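The three MSPC steps can be sketched for two correlated variables using Hotelling's T² statistic, one common choice for this kind of monitoring. The synthetic data, the empirical 99% control limit and the function names are assumptions made for illustration; an industrial application would normally involve many more variables and a method such as PCA.

```python
import random

def fit_model(data):
    """Step 1: mean vector and inverse covariance of 2-variable normal-operation data."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv

def t2(point, mean, inv):
    """Hotelling's T-squared distance of one observation from the model."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (dx * (inv[0][0] * dx + inv[0][1] * dy) +
            dy * (inv[1][0] * dx + inv[1][1] * dy))

def control_limit(data, mean, inv, quantile=0.99):
    """Step 2: empirical control limit taken from the historical T-squared values."""
    scores = sorted(t2(p, mean, inv) for p in data)
    return scores[min(len(scores) - 1, int(quantile * len(scores)))]

# Synthetic historical data: the second variable is about twice the first.
random.seed(1)
history = []
for _ in range(500):
    x = random.gauss(10.0, 1.0)
    history.append((x, 2.0 * x + random.gauss(0.0, 0.5)))

mean, inv = fit_model(history)
limit = control_limit(history, mean, inv)

# Step 3: compare online observations with the control limit.
for point in [(10.5, 21.0), (10.0, 24.0)]:
    status = "out of control" if t2(point, mean, inv) > limit else "in control"
    print(point, status)
```

The second observation breaks the usual relation between the two variables and is flagged, even though each value taken alone is not extreme. This is the kind of fault that individual univariate limits tend to miss.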

False alarms can have different causes; deficiencies of the alarm system and random effects are two of the main ones. System deficiency may be due to the difference between the statistical model and the real process. Random effects may also cause false alarms: online fluctuations in the monitored process can cause actual variables to deviate from nominal values, so false alarms can occur even though the process is working correctly. Many authors have used statistical approaches to avoid randomly induced false alarms [57,58,59,60], e.g. Bernoulli and binomial distributions, the conventional method based on principal component analysis (PCA) [61], etc. However, real process variables tend to be autocorrelated, so time series modelling approaches are needed.

One of the main methods of multivariate analysis is the correlation method. In many processes, one variable can be affected by one or several other variables, i.e. different alarm thresholds generate different alarm data and hence different correlations. To optimize these multivariate alarm thresholds, numerous statistical methodologies and algorithms have been applied to reveal interactions between variables and determine the correlated key variables. They are grouped as:

Grouping Variables
Correlation Methods
Advanced Methods
Intrusion Detection System (IDS).
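The dependence of alarm correlation on the chosen thresholds can be illustrated with a small example. The sketch below assumes two hypothetical process signals (`flow` and `temp`), simple high-limit alarms and the Pearson coefficient computed on the binary alarm sequences. All names, data and thresholds are invented for illustration.

```python
from math import sqrt

def alarm_sequence(signal, threshold):
    """Binary alarm data: 1 while the signal is above the threshold, 0 otherwise."""
    return [1 if x > threshold else 0 for x in signal]

def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # undefined for a constant sequence; returning 0.0 is a convention
    return cov / sqrt(va * vb)

# Hypothetical, physically related signals sampled at the same instants
flow = [1, 2, 3, 6, 7, 8, 7, 3, 2, 1, 2, 6, 8, 9, 4, 2]
temp = [10, 11, 12, 18, 20, 22, 21, 14, 11, 10, 11, 17, 21, 23, 15, 11]

flow_alarm = alarm_sequence(flow, 5)
r_matched = pearson(flow_alarm, alarm_sequence(temp, 16))   # both alarms cover the same samples
r_high = pearson(flow_alarm, alarm_sequence(temp, 20.5))    # temp alarm covers fewer samples
```

With the lower temperature threshold the two alarm sequences coincide, whereas the higher threshold weakens the measured correlation even though the underlying signals are unchanged.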

9 Conclusions

An optimal alarm system should inform and guide. Each alarm should have a defined response and leave the operator enough time to carry it out. Alarms must be relevant, unique, prioritized and understandable. The alarm system must identify the alarm, classify it, set its priority and finally alert the operator, visually or audibly, if necessary.

From the study of false alarms, it is concluded that three of the most important reasons for their existence are: (1) the process undergoes state changes, such as switching on and off, which set off abnormalities that propagate owing to physical connections; (2) the alarms are poorly configured and rely on redundant measurements; and (3) causal relationships exist between the variables studied, while alarm design is isolated from related variables.

Alarm systems can be classified in many ways, depending on how they treat the information, the type of variable studied, the algorithms applied, etc.

Many types of alarm systems are used in industry; however, false, nuisance or chattering alarms have not yet been completely eliminated. Although many resources are devoted to this problem, an optimal solution has not yet been achieved. Owing to the development of new technologies and the increase in data processing capacity, it will be possible to improve these methods by means of dynamic systems in which historical data provide feedback for handling the process correctly.

References

ANSI. (2009). ISA-18.2-2009 management of alarm systems for the process industries. Durham, NC, USA: International Society of Automation.
Jiménez, A. A., Gómez Muñoz, C. Q., & García Márquez, F. P. (2018). Dirt and mud detection and diagnosis on a wind turbine blade employing guided waves and supervised learning classifiers. Reliability Engineering and System Safety.
Munoz, J. C., Márquez, F. G., & Papaelias, M. (2013). Railroad inspection based on ACFM employing a non-uniform b-spline approach. Mechanical Systems and Signal Processing, 40, 605–617.
EEMUA. (2007). Alarm systems: A guide to design, management and procurement. Engineering Equipment and Materials Users Association.
Hollifield, B. R., & Habibi, E. (2010). Alarm management: A comprehensive guide: Practical and proven methods to optimize the performance of alarm management systems. ISA.
Rothenberg, D. H. (2009). Alarm management for process control: A best-practice guide for design, implementation, and use of industrial alarm systems. Momentum Press.
Walker, B., Smith, K. D., & Kekich, M. D. (2003). Limiting shift-work fatigue in process control. Chemical Engineering Progress, 99, 54–57.
Gómez Muñoz, C. Q., Arcos Jimenez, A., García Marquez, F. P., Kogia, M., Cheng, L., Mohimi, A., & Papaelias, M. (2017). Cracks and welds detection approach in solar receiver tubes employing electromagnetic acoustic transducers. Structural Health Monitoring. https://doi.org/10.1177/1475921717734501.
Gómez Muñoz, C. Q., García Marquez, F. P., Lev, B., & Arcos, A. (2017). New pipe notch detection and location method for short distances employing ultrasonic guided waves. Acta Acustica United with Acustica, 103, 772–781.
de la Hermosa González, R. R., García Márquez, F. P., & Dimlaye, V. (2015). Maintenance management of wind turbines structures via MFCS and wavelet transforms. Renewable and Sustainable Energy Reviews, 48, 472–482.
Hu, W., Al-Dabbagh, A. W., Chen, T., & Shah, S. L. (2016). Process discovery of operator actions in response to univariate alarms. IFAC-PapersOnLine, 49, 1026–1031.
Severson, K., Chaiwatanodom, P., & Braatz, R. D. (2016). Perspectives on process monitoring of industrial systems. Annual Reviews in Control, 42, 190–200.
Marquez, F. G. (2006). An approach to remote condition monitoring systems management.
Arcos Jiménez, A., Gómez Muñoz, C. Q., & García Márquez, F. P. (2017). Machine learning for wind turbine blades maintenance management. Energies, 11, 13.
García Márquez, F. P., Muñoz, G., Quiterio, C., Papelias, M., & Arcos Jiménez, A. (2015). A heuristic method for detecting and locating faults employing electromagnetic acoustic transducers.
Roberts, C., Márquez, F., & Tobias, A. (2010). A pragmatic approach to the condition monitoring of hydraulic level crossing barriers. Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit, 224, 605–610.
Izadi, I., Shah, S. L., Shook, D. S., & Chen, T. (2009). An introduction to alarm analysis and design. IFAC Proceedings Volumes, 42, 645–650.
Landgrebe, T. C., & Duin, R. P. (2008). Efficient multiclass roc approximation by decomposition via confusion matrix perturbation analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 810–822.
Wang, J., Yang, F., Chen, T., & Shah, S. L. (2016). An overview of industrial alarm systems: Main causes for alarm overloading, research status, and open problems. IEEE Transactions on Automation Science and Engineering, 13, 1045–1061.
Kondaveeti, S. R., Izadi, I., Shah, S. L., Shook, D. S., Kadali, R., & Chen, T. (2013). Quantification of alarm chatter based on run length distributions. Chemical Engineering Research and Design, 91, 2550–2558.
Naghoosi, E., Izadi, I., & Chen, T. (2011). Estimation of alarm chattering. Journal of Process Control, 21, 1243–1249.
Hu, W., Chen, T., Shah, S. L., & Hollender, M. (2017). Cause and effect analysis for decision support in alarm floods. IFAC-PapersOnLine, 50, 13940–13945.
IEC. (2014). IEC 62682 management of alarm systems for the process industries. International Electrotechnical Commission (IEC).
Timms, C. (2009). Hazards equal trips or alarms or both. Process Safety and Environmental Protection, 87, 3–13.
Ahmed, K., Izadi, I., Chen, T., Joe, D., & Burton, T. (2013). Similarity analysis of industrial alarm flood data. IEEE Transactions on Automation Science and Engineering, 10, 452–457.
Hu, W., Wang, J., & Chen, T. (2015). Fast sequence alignment for comparing industrial alarm floods. IFAC-PapersOnLine, 48, 647–652.
Rodrigo, V., Chioua, M., Hagglund, T., & Hollender, M. (2016). Causal analysis for alarm flood reduction. IFAC-PapersOnLine, 49, 723–728.
Cheng, Y., Izadi, I., & Chen, T. (2013). Pattern matching of alarm flood sequences by a modified smith–waterman algorithm. Chemical Engineering Research and Design, 91, 1085–1094.
García Márquez, F. P., Chacón Muñoz, J. M., & Tobias, A. M. (2015). B-spline approach for failure detection and diagnosis on railway point mechanisms case study. Quality Engineering, 27, 177–185.
García Márquez, F. P., Pliego Marugán, A., Pinar Pérez, J. M., Hillmansen, S., & Papaelias, M. (2017). Optimal dynamic analysis of electrical/electronic components in wind turbines. Energies, 10, 1111.
García Márquez, F. P., Pedregal, D. J., & Roberts, C. (2015). New methods for the condition monitoring of level crossings. International Journal of Systems Science, 46, 878–884.
García Márquez, F. P., & Chacón Muñoz, J. M. (2012). A pattern recognition and data analysis method for maintenance management. International Journal of Systems Science, 43, 1014–1028.
de la Hermosa Gonzalez, R. R., García Márquez, F. P., Dimlaye, V., & Ruiz-Hernández, D. (2014). Pattern recognition by wavelet transforms using macro fibre composites transducers. Mechanical Systems and Signal Processing, 48, 339–350.
Lai, S., & Chen, T. (2017). A method for pattern mining in multiple alarm flood sequences. Chemical Engineering Research and Design, 117, 831–839.
Hu, W., Chen, T., & Shah, S. L. (2018). Detection of frequent alarm patterns in industrial alarm floods using itemset mining methods. IEEE Transactions on Industrial Electronics.
Guo, C., Hu, W., Lai, S., Yang, F., & Chen, T. (2017). An accelerated alignment method for analyzing time sequences of industrial alarm floods. Journal of Process Control, 57, 102–115.
Stauffer, T., Sands, N., & Dunn, D. (2010). Alarm management and ISA-18–a journey, not a destination. In: Texas A&M Instrumentation Symposium.
Ávila, S., & Pessoa, F. (2015). Proposition of review in EEMUA 201 and ISO standard 11064 based on cultural aspects in labor team, lng case. Procedia Manufacturing, 3, 6101–6108.
Kondaveeti, S. R., Izadi, I., Shah, S. L., Black, T., & Chen, T. (2012). Graphical tools for routine assessment of industrial alarm systems. Computers and Chemical Engineering, 46, 39–47.
Kondaveeti, S. R., Izadi, I., Shah, S. L., & Black, T. (2010). Graphical representation of industrial alarm data. IFAC Proceedings Volumes, 43, 181–186.
EEMUA. (1999). Alarm systems: A guide to design, management and procurement. Engineering Equipment and Materials Users Association London.
Yang, F., Shah, S. L., Xiao, D., & Chen, T. (2012). Improved correlation analysis and visualization of industrial alarm data. ISA Transactions, 51, 499–506.
Su, J., Guo, C., Zang, H., Yang, F., Huang, D., Gao, X., et al. (2018). A multi-setpoint delay-timer alarming strategy for industrial alarm monitoring. Journal of Loss Prevention in the Process Industries, 54, 1–9.
Jiménez, A. A., Gómez Muñoz, C. Q., García Marquez, F. P., & Zhang, L. (2017). Artificial intelligence for concentrated solar plant maintenance management. In: Proceedings of the Tenth International Conference on Management Science and Engineering Management, pp. 125–134. Springer.
Izadi, I., Shah, S. L., Shook, D. S., Kondaveeti, S. R., & Chen, T. (2009). A framework for optimal design of alarm systems. IFAC Proceedings Volumes, 42, 651–656.
Tan, W., Sun, Y., Azad, I. I., & Chen, T. (2017). Design of univariate alarm systems via rank order filters. Control Engineering Practice, 59, 55–63.
García Márquez, F. P. (2010). A new method for maintenance management employing principal component analysis. Structural Durability and Health Monitoring, 6, 89–99.
Cheng, Y., Izadi, I., & Chen, T. (2013). Optimal alarm signal processing: Filter design and performance analysis. IEEE Transactions on Automation Science and Engineering, 10, 446–451.
Xu, J., Wang, J., Izadi, I., & Chen, T. (2012). Performance assessment and design for univariate alarm systems based on FAR, MAR, and AAD. IEEE Transactions on Automation Science and Engineering, 9, 296–307.
Zang, H., Yang, F., & Huang, D. (2015). Design and analysis of improved alarm delay-timers. IFAC-PapersOnLine, 48, 669–674.
Adnan, N. A., Cheng, Y., Izadi, I., & Chen, T. (2013). Study of generalized delay-timers in alarm configuration. Journal of Process Control, 23, 382–395.
Wang, J., & Chen, T. (2013). An online method for detection and reduction of chattering alarms due to oscillation. Computers and Chemical Engineering, 54, 140–150.
Wang, J., & Chen, T. (2014). An online method to remove chattering and repeating alarms based on alarm durations and intervals. Computers and Chemical Engineering, 67, 43–52.
Pliego Marugán, A., García Márquez, F. P., & Lev, B. (2017). Optimal decision-making via binary decision diagrams for investments under a risky environment. International Journal of Production Research, 55, 5271–5286.
Peres, F. A. P., & Fogliatto, F. S. (2018). Variable selection methods in multivariate statistical process control: A systematic literature review. Computers and Industrial Engineering, 115, 603–619.
Gómez Muñoz, C. Q., García Márquez, F. P., & Sánchez Tomás, J. M. (2016). Ice detection using thermal infrared radiometry on wind turbine blades. Measurement, 93, 157–163.
Abraham, B., & Chuang, A. (1993). Expectation-maximization algorithms and the estimation of time series models in the presence of outliers. Journal of Time Series Analysis, 14, 221–234.
Chen, T., & Sun, Y. (2009). Probabilistic contribution analysis for statistical process monitoring: A missing variable approach. Control Engineering Practice, 17, 469–477.
Singhal, A., & Seborg, D. E. (2000). Dynamic data rectification using the expectation maximization algorithm. AIChE Journal, 46, 1556–1565.
Chen, T. (2010). On reducing false alarms in multivariate statistical process control. Chemical Engineering Research and Design, 88, 430–436.
García Márquez, F. P., & García-Pardo, I. P. (2010). Principal component analysis applied to filtered signals for maintenance management. Quality and Reliability Engineering International, 26, 523–527.

Acknowledgements

The work reported herewith has been financially supported by the Spanish Ministerio de Economía y Competitividad, under the Research Grants RTC-2016-5694-3 and DPI2015-67264-P.

Author information

Authors and Affiliations

Ingenium Research Group, University of Castilla-La Mancha, Ciudad Real, Spain
Ana María Peco Chacón & Fausto Pedro García Márquez

Corresponding author

Correspondence to Ana María Peco Chacón.

Editor information

Editors and Affiliations

ETSI Industriales de Ciudad Real, University of Castilla-La Mancha, Ciudad Real, Spain
Fausto Pedro García Márquez
LeBow College of Business, Drexel University, Philadelphia, PA, USA
Benjamin Lev

About this chapter

Cite this chapter

Peco Chacón, A.M., García Márquez, F.P. (2019). False Alarms Management by Data Science. In: García Márquez, F., Lev, B. (eds) Data Science and Digital Business. Springer, Cham. https://doi.org/10.1007/978-3-319-95651-0_15

DOI: https://doi.org/10.1007/978-3-319-95651-0_15
Published: 05 January 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95650-3
Online ISBN: 978-3-319-95651-0
eBook Packages: Business and Management, Business and Management (R0)
