Abstract
Inspired by the self/nonself discrimination theory of the natural immune system, the negative selection algorithm (NSA) is an emerging computational intelligence method. Generally, detectors in the original NSA are first generated in a random manner. However, those detectors matching the self samples are eliminated thereafter. The remaining detectors can therefore be employed to detect any anomaly. Unfortunately, conventional NSA detectors are not adaptive for dealing with time-varying circumstances. In the present paper, a novel neural networks-based NSA is proposed. The principle and structure of this NSA are discussed, and its training algorithm is derived. Taking advantage of efficient neural networks training, it has the distinguishing capability of adaptation, which is well suited for handling dynamical problems. A fault diagnosis scheme using the new NSA is also introduced. Two illustrative simulation examples of anomaly detection in chaotic time series and inner raceway fault diagnosis of motor bearings demonstrate the efficiency of the proposed neural networks-based NSA.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
1 Introduction
Artificial immune systems (AIS), inspired by the natural immune systems, are an emerging kind of soft computing methods [1]. With the distinguishing features of pattern recognition, data analysis, and machine learning, the AIS have recently gained considerable research interest from different communities [2–4]. As an important partner of the AIS, negative selection algorithm (NSA) is based on the principles of maturation of T cells and self/nonself discrimination in the biological immune systems. It was developed by Forrest et al. [5] in 1994 for real-time detection of computer virus. During the past decade, the NSA has been widely applied in such promising engineering areas as anomaly detection [6], networks security [7], milling tool breakage detection [8], and aircraft fault detection [9]. Detectors of the original NSA are usually first generated in a random manner, and undergo the so-called ‘negative censoring’ process thereafter. Only the qualified detectors that do not match the self are selected and used to detect changes/anomaly (nonself) in the fresh input patterns. Unfortunately, with no capability of adaptation, these detectors are not well suited for dealing with real-world problems under the time-varying circumstances. To overcome this drawback as well as improve the performance of conventional NSA detectors, an adaptive neural networks-based NSA is proposed here, the corresponding training algorithm is derived, and its applications in fault diagnosis are also explored.
This paper is organized as follows. The essential principles of the regular NSA are briefly discussed in Sect. 2. In Sect. 3, the neural networks-based NSA with a unique learning algorithm is introduced. A general fault diagnosis framework using the new NSA is constructed in the following section. Simulations of two numerical examples, anomaly detection in Mackey–Glass time series and bearings fault diagnosis, are made in Sect. 5 to verify the proposed scheme. Finally, in Sect. 6, the paper is concluded with some remarks and conclusions.
2 Principles of negative selection algorithm
The natural immune system is an efficient self-defense system that can protect the human body from being affected by foreign antigens or pathogens. One of its important functions is pattern recognition and classification. In other words, the immune system is capable of distinguishing the self, i.e., normal cells, from the nonself, such as bacteria, viruses, and cancer cells. This capability is mainly achieved by two types of lymphocytes: B cells and T cells. Both the B cells and T cells are produced in the bone marrow. However, for the T cells, they must pass a negative selection procedure in the thymus thereafter. Only those that do not match the self proteins of the body will be released out, while the remaining T cells are eventually destroyed there. Such censoring of T cells actually can prevent the immune system from attacking the body’s own proteins.
Forrest et al. proposed the NSA to mimic the aforementioned self/nonself discrimination mechanism of the biological immune system. Their approach can be conceptually described as follows. A self data set containing all the representative self samples is first collected. Next, the candidate detectors in binary strings are randomly generated, and compared with the self set. Like the above negative selection of T cells, only those detectors that do not match any element of this set are retained. Finally, the qualified detectors can be used to detect the nonself or anomaly. It is obvious that effective generation of the detectors is pivotal in the NSA, which depends on a few factors, e.g., the size of the self set, the matching rule between the detectors and self samples, and the detector generation strategy [10]. Particularly, the form of the detectors plays a crucial role here. González compared the performances of three existing detector representation schemes including hyper-rectangles, fuzzy rules, and hyper-spheres in [11]. Unfortunately, conventional NSA detectors represented by either binary strings or real values usually are not adaptive [12, 13]. Therefore, in the next section, a novel neural networks-based NSA is proposed to handle this drawback.
3 Neural networks-based negative selection algorithm
In the new neural networks-based NSA, detectors are built on the structures of three-layer feedforward neural networks, as shown in Fig. 1. \({{\left[ {x_{1}, x_{2}, \ldots, x_{N}} \right]}}\) is the input vector, N is the number of inputs, and \({{\left[ {w_{1}, w_{2}, \ldots, w_{N}} \right]}}\) and \({{\left[ {v_{1}, v_{2}, \ldots, v_{N}} \right]}}\) are the connection weights between Layer 1 & Layer 2 and Layer 2 & Layer 3, respectively. More precisely, in Layer 1, the matching degree d i between x i of \({{\left[ {x_{1}, x_{2}, \ldots, x_{N}} \right]}}\) and w i of \({{\left[ {w_{1}, w_{2}, \ldots, w_{N}} \right]}}\) is calculated in every input node i:
where i = 1,2, ..., N. The hidden node outputs are weighted in Layer 2 by \({{\left[ {v_{1}, v_{2}, \ldots, v_{N}} \right]}}.\) Thus, the final single output of Layer 3, y, can be given:
f(·) is the node function of Layer 2. Similarly with the original NSA, y is compared with a preset threshold λ, and the detector matching error E is obtained:
If for any \({{\left[ {x_{1}, x_{2}, \ldots, x_{N}} \right]}}\) in all the training samples, E > 0, this detector does not match the self. It can, therefore, be used for detecting the nonself.
It should be emphasized in the neural networks-based NSA, both \({{\left[ {w_{1}, w_{2}, \ldots, w_{N}} \right]}}\) and \({{\left[ {v_{1}, v_{2}, \ldots, v_{N}} \right]}}\) are trainable. However, different from the normal gradient descent oriented Back-Propagation (BP) training method [14] that always tries to minimize E, a ‘positive’ learning algorithm is derived here for these two sets of weights. The goal of such training is actually to raise the matching error between the NSA detectors and self samples so that the qualified detectors can be generated. Figure 2 illustrates the principle of this training approach. Let \({{\mathbf{w}}}\) and \({{\mathbf{v}}}\) denote \({{\left[ {w_{1}, w_{2}, \ldots w_{N}} \right]}}\) and \({{\left[ {v_{1}, v_{2}, \ldots, v_{N}} \right]}},\) respectively. Starting from a given point \({({\mathbf{w}}^{*}, {\mathbf{v}}^{*})}\) in the weight-matching error \({(({\mathbf{w}},{\mathbf{v}}) - E({\mathbf{w}},{\mathbf{v}}))}\) space, in order to decrease \({E({\mathbf{w}},{\mathbf{v}})}\) by means of the regular BP learning, weights \({{\mathbf{w}}^{*} }\) and \({{\mathbf{v}}^{*} }\) must be changed by \({\Delta {\mathbf{w}}^{*} }\) and \({\Delta {\mathbf{v}}^{*} },\) which are proportional to the negative gradient value of \({E({\mathbf{w}}^{*}, {\mathbf{v}}^{*})}:\)
where μ is the learning rate. On the other hand, in the new training algorithm of the proposed neural networks-based NSA, \({\Delta {\mathbf{w}}^{*} }\) and \({\Delta {\mathbf{v}}^{*} }\) are chosen to indeed increase \({E({\mathbf{w}},{\mathbf{v}})}\) for appropriate detectors generation and tuning. This is accomplished by employing the positive gradient information, as the solid line in Fig. 2 shows, in the \({\Delta {\mathbf{w}}^{*} }\) and \({\Delta {\mathbf{v}}^{*} }\) calculation. Therefore, there are:
w (k) i and v (k) i are denoted as the instant values of w i and v i at iteration k, respectively. They can be updated:
The above training algorithm has several remarkable advantages. Firstly, it provides a more effective detector generation scheme than the conventional solutions, in which the binary or real-valued detectors are initially generated in a stochastic manner. Secondly, if applied as a postprocessing phase, it can automatically fine-tune those rough detectors. Thirdly, embedded with the advantageous characteristics of adaptation, the neural networks-based NSA is capable of coping with the time-varying anomaly detection problems. The application of this new NSA in fault diagnosis will be demonstrated in Sect. 4.
4 Fault diagnosis using neural networks-based negative selection algorithm
Fault diagnosis methods are crucial in modern industry to guarantee the normal working conditions of plants, such as electrical machines and motors [15]. The anomalies in the feature signals acquired from operating plants are considered to be caused by only incipient faults. Hence, fault diagnosis is converted to a problem of anomaly detection, i.e., self/nonself discrimination, in the characteristic time series, which can be solved by utilizing the new NSA. The proposed fault diagnosis scheme consists of four stages. Firstly, the feature signals from healthy as well as faulty plants are sampled and preprocessed. Secondly, a certain number of eligible detectors, S, are generated based on the negative selection principle. Only the feature signals of healthy plants are used at this step. Thirdly, these detectors are trained with the feature signals of both healthy and faulty plants. In addition to the learning algorithm presented in Sect. 3, a ‘margin’ training strategy is also developed for the neural networks-based detectors. The detector weights are separately adapted in the following two cases (regions), refer to Fig. 3.
-
Case 1 (for the faulty plant feature signals only): if 0 < E < γ (Training Region I in Fig. 3), the detectors are trained using the normal BP learning algorithm to decrease E; otherwise, no training is employed.
-
Case 2 (for the healthy plant feature signals only): if −γ < E < 0 (Training Region II in Fig. 3), the detectors are trained using the above ‘positive’ BP learning algorithm to increase E; otherwise, no training is employed.
γ, the margin parameter, is a minor percentage of λ, e.g., 0.1 λ. Finally, after this weight refinement stage, the detectors can be deployed to detect possible anomalies, i.e., existing faults.
AC and DC motors have been intensively applied in various industrial applications. However, changing working environments and dynamical loading always strain and wear motors, and further cause some incipient faults, such as shorted turns, broken bearings, and damaged rotor bars. These faults can result in serious performance degradation and even eventual system failures, if they are not properly detected as well as handled. Therefore, motor drive monitoring and fault diagnosis are very important and challenging topics in the electrical engineering field [16].
The proposed fault diagnosis method is indeed independent of plants and faults, and can, thus, be a general-purpose solution to a large variety of fault diagnosis problems. Figure 4 shows the structure of the above neural networks-based NSA in motor fault diagnosis, in which there are two main phases involved: detector training phase and fault detection phase. In the detector training phase, the feature signals of both healthy and faulty motors are first collected beforehand. They are split into non-overlapping windows in the signal preprocessing unit, denoted by \({{\left[ {x_{1}, x_{2}, \ldots x_{N}} \right]}}\) in Fig. 1, as the input patterns of the NSA detectors. With the negative selection approach and detector training algorithms, the S qualified detectors are next generated. In the fault detection phase, these detectors are used to detect any anomaly in the feature signals acquired from the operating motors. Fault diagnosis results can be obtained based on the statistics of activated detectors. A detector is considered activated, if E < 0. For the motor time series under examination, the numbers of both correctly activated detectors and incorrectly activated detectors, M and L, are counted. The quantitative performance criterion, fault detection rate η, is defined:
In the next section, the effectiveness of this new neural networks-based NSA and motor fault diagnosis scheme is verified using computer simulations.
5 Simulations
Two representative testbeds, anomaly detection in chaotic time series and inner raceway fault detection of motor bearings, are employed in the simulations.
5.1 Anomaly detection in chaotic time series
As a popular benchmark for intelligent data analysis methods, the Mackey–Glass chaotic time series, x(t), can be generated by the following nonlinear differential equation [17]:
where a = 0.1, b = 0.2, and c = 10. Figure 5 a and b illustrate two different Mackey–Glass time series with τ = 30 and τ = 17, respectively. Suppose the normal Mackey–Glass time series result from τ = 30. The goal of utilizing the neural networks-based NSA here is to detect the anomaly caused by τ = 17. Hence, as in the detector training steps in Sect. 4, two 500-sample data sets are collected from the cases of τ = 30 and τ = 17, respectively, and cascaded together as one compact training set of 1,000 samples. The related NSA parameters are given as follows:
-
number of detectors: S = 100,
-
window width: N = 10,
-
matching threshold: λ = 5.25,
-
margin coefficient: γ = 0.05λ = 0.26.
To validate the neural networks-based NSA, the fresh anomalous Mackey–Glass time series shown in Fig. 5c are used, where those samples from 1 to 500 correspond to τ = 30, and the succeeding 500 samples τ = 17. The anomaly detection results acquired from the untrained (before training) and trained (after training) NSA detectors are demonstrated in Fig. 6a and b, respectively. There are only 100 detector instants illustrated in Fig. 6, since the window width N is 10. Note that the untrained detectors are randomly initialized and generated from the normal Mackey–Glass time series, i.e., the first half of the 1,000-sample training data set. Apparently, before training, there are:
and after training there are:
Although the proposed detector adaptation algorithms decrease the numbers of both incorrectly and correctly activated detectors, η actually has grown from 86 to 97%. In other words, an improved anomaly detection performance can be achieved by using this neural networks-based NSA.
5.2 Inner raceway fault detection of motor bearings
Bearings are indispensable components in rotating machinery. Therefore, appropriate monitoring of their conditions is crucial to ensure the normal operating status of motors [18]. Since defect on the inner raceway is a typical bearings fault, the proposed neural networks-based NSA is examined using this fault detection problem. The bearings feature signals are collected at the sampling frequency of 20 kHz from a vibration sensor mounted on top of the NYLA-K eight-ball bearings. The model of the vibration sensor used is IMI Sensors 601A01. The motor is a three-phase industrial motor of 0.5 horsepower manufactured by the Baldor Electric Company, which has the rotation speed at 1,782 rpm.
Figure 7a and b show the vibration signals from the healthy bearings and those bearings with the existing inner raceway fault, respectively. In the preprocessing unit, for the convenient manipulation with detector matching error E in (3), a small constant 0.08 is added to all the signal samples so that both the training and testing data sets contain only positive values. The parameters of the neural networks-based NSA are:
-
number of detectors: S = 100,
-
window width: N = 5,
-
matching threshold: λ = 4.6,
-
margin coefficient: γ = 0.05λ = 0.23.
The relationship between the fault detection rate η and training epochs is depicted in Fig. 8. Fresh feature signals from faulty bearings are deployed as the verification data. It is clearly visible that η increases with moderate oscillations, when the number of training epochs grows. These oscillations are due to the nature of the gradient descent principle. In the simulations, η is only 79% (M = 27 and L = 7) at the beginning of training, but finally reaches 100% (M = 15 and L = 0) after about 800 iteration steps. This indeed demonstrates the neural networks-based NSA can significantly improve the fault detection rate. Nevertheless, the inner raceway fault detection problem is used here only as a simplified testbed, and many practical details are not considered.
6 Conclusions
In this paper, a neural networks-based NSA is first introduced. Its novel structure is described, and the corresponding learning algorithm is also derived. The applications of the new NSA in fault diagnosis are next discussed. Both simulated and real-world data is employed for the validation of the proposed scheme. Improved anomaly detection and bearings fault detection performances are achieved in numerical simulations. The optimization of those NSA detectors with the clonal selection method is at present under investigation [19]. Furthermore, study of time-varying fault diagnosis problems using this approach will be another interesting topic.
References
de Castro LN, Timmis J (2002) Artificial immune systems: a new computational intelligence approach. Springer, London
Dasgupta D, Attoh-Okine N (1997) Immunity-based systems: a survey. In: Proceedings of the IEEE international conference on systems, man, and cybernetics, Orlando, FL, pp 369–374
Garrett SM (2005) How do we evaluate artificial immune systems?. Evol Comput 13(2):145–178
Dasgupta D (2006) Advances in artificial immune systems. IEEE Comput Intell Mag 1(4):40–49
Forrest S, Perelson AS, Allen L, Cherukuri R (1994) Self-nonself discrimination in a computer. In: Proceedings of the IEEE symposium on research in security and privacy, Los Alamos, CA, pp 202–212
Stibor T, Mohr P, Timmis J, Eckert C (2005) Is negative selection appropriate for anomaly detection? In: Proceedings of the genetic and evolutionary computation conference, Washington D.C., pp 321–328
Dasgupta D, González F (2002) An immunity-based technique to characterize intrusions in computer networks. IEEE Trans Evol Comput 6(3):281–291
Dasgupta D, Forrest S (1995) Tool breakage detection in milling operations using a negative selection algorithm. Technical Report CS95-5, Department of Computer Science, University of New Mexico, Albuquerque, NM
Dasgupta D, KrishnaKumar K, Wong D, Berry M (2004) Negative selection algorithm for aircraft fault detection. In: Proceedings of the 3rd international conference on artificial immune systems, Catania, Sicily, Italy, pp 1–13
Ji Z, Dasgupta D (2006) Applicability issues of the real-valued negative selection algorithms. In: Proceedings of the genetic and evolutionary computation conference, Seattle, WA, pp 111–118
González F (2003) A study of artificial immune systems applied to anomaly detection. Ph.D. Dissertation, Division of Computer Science, University of Memphis, Memphis, TN
González F, Dasgupta D, Nino LF (2003) A randomized real-value negative selection algorithm. In: Proceedings of the 2nd international conference on artificial immune systems, Edinburgh, UK, pp 261–272
Stibor T, Timmis J, Eckert C (2005) A comparative study of real-valued negative selection to statistical anomaly detection techniques. In: Proceedings of the 4th international conference on artificial immune systems, Banff, Alberta, Canada, pp 262–275
Haykin S (1998) Neural networks, a comprehensive foundation, 2nd edn. Prentice-Hall, Upper Saddle River
Frank PM (1990) Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy—a survey and some new results. Automatica 26(3):459–474
Chow MY (1997) Methodologies of using neural network and fuzzy logic technologies for motor incipient fault detection. World Scientific Publishing, Singapore
Mackey MC, Glass L (1977) Oscillation and chaos in physiological control systems. Science 197:287–289
Li B, Chow MY, Tipsuwan Y, Hung JC (2000) Neural-network-based motor rolling bearing fault diagnosis. IEEE Trans Ind Electron 47(5):1060–1069
Gao XZ, Ovaska SJ, Wang X, Chow MY (2006) Clonal optimization of negative selection algorithm with applications in motor fault detection. In: Proceedings of the IEEE international conference on systems, man, and cybernetics, Taipei, Taiwan, pp 5118–5123
Acknowledgments
This research work was funded by the Academy of Finland under Grant 214144. The authors would like to thank the anonymous reviewers for their insightful comments and constructive suggestions that have improved the paper.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Gao, X.Z., Ovaska, S.J., Wang, X. et al. A neural networks-based negative selection algorithm in fault diagnosis. Neural Comput & Applic 17, 91–98 (2008). https://doi.org/10.1007/s00521-007-0092-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00521-007-0092-z