Abstract
Intrusion detection is an essential component of securing Information and Communication Technology (ICT) systems. Based on how events are analyzed, two kinds of Intrusion Detection Systems (IDS) have been proposed: anomaly-based and misuse-based. In this paper, a three-layer Recurrent Neural Network (RNN) architecture, with categorized features as inputs and attack types as outputs, is proposed as a misuse-based IDS. The input features are categorized into basic features, content features, time-based traffic features, and host-based traffic features. The attack types are classified into Denial-of-Service (DoS), Probe, Remote-to-Local (R2L), and User-to-Root (U2R). For this purpose, we use the 41 features per connection defined by the International Knowledge Discovery and Data mining group (KDD). The RNN has an extra output which corresponds to the normal class (no attack). The nodes of the two hidden layers of the RNN are only partially connected. Experimental results show that the proposed model is able to improve the classification rate, particularly for R2L attacks. The method also offers a better Detection Rate (DR) and Cost Per Example (CPE) than similar related works and than the simulated Multi-Layer Perceptron (MLP) and Elman-based intrusion detectors. On the other hand, the False Alarm Rate (FAR) of the proposed model is not degraded significantly when compared to some recent machine learning methods.
1 Introduction
In recent decades, malicious behavior by some Internet users has prompted researchers to work on various intrusion detection techniques. Based on the source of information, two kinds of Intrusion Detection Systems (IDS) have been proposed: host-based and network-based [1]. A host-based IDS runs on a host computer, whereas a network-based IDS monitors the data exchanged between computers. Based on how events are analyzed, two further kinds of IDS exist: anomaly-based [2] and misuse-based [3]. An anomaly-based IDS detects activities that differ from the established patterns of users, and a misuse-based IDS compares users’ activities with the known behaviors of attackers.
Many soft computing approaches have been applied to the intrusion detection field. Anomaly-based detection techniques can be classified into three main categories: statistical-based [4], knowledge-based [5], and machine learning (e.g., Bayesian networks [6], Markov models [7], Artificial Neural Networks (ANNs) [8, 9], fuzzy logic [10, 11], genetic algorithms [12], and clustering and outlier detection [13]).
The detection techniques that are used in misuse-based IDS can also be classified into three similar categories: statistical-based [14], knowledge-based [15, 16], and machine learning (e.g., Bayesian networks [17], ANNs [18–22], fuzzy logic [10], genetic algorithms [23], clustering [24], decision trees [25, 26], and hybrid systems [27–30]).
In this paper, a reduced-size Recurrent Neural Network (RNN) structure, based on the grouping of features, is used for misuse detection. Because the RNN is smaller, its training speed and convergence are improved. The result is a fast IDS that is effective in terms of Detection Rate (DR) and Cost Per Example (CPE).
The International Knowledge Discovery and Data mining group (KDD) data set [31] is used for training and testing the proposed model in this study. Each connection in KDD is characterized by 41 features and a label which specifies the status of the connection record (normal or a specific attack type). These features are used as the inputs of the RNN and are grouped into four categories: basic features (B-F), content features (C-F), time-based traffic features (TT-F), and host-based traffic features (HT-F). The RNN has five outputs, one of which indicates the normal class (no attack). The other four outputs represent the type of detected attack: Denial-of-Service (DoS), Probe, Remote-to-Local (R2L), and User-to-Root (U2R). To reduce the size and computational complexity of the RNN-based IDS, the nodes of the layers are partially connected according to these four feature categories.
Experimental results show that the proposed model is able to improve classification rate, particularly in R2L attacks. This method also offers better DR and CPE when compared to similar related works and also the simulated Multi-Layer Perceptron (MLP) and Elman-based IDS with the same number of hidden layer nodes. On the other hand, False Alarm Rate (FAR) of the proposed model is not degraded significantly when compared to some recent machine learning methods.
The remainder of this paper is organized as follows. Section 2 provides the KDD data set details. The architecture of the proposed model is introduced in Sect. 3. Simulations and experimental results are reported in Sect. 4. Conclusions are discussed in Sect. 5.
2 KDD intrusion data
In 1999, recorded network traffic from the Defense Advanced Research Projects Agency (DARPA) data set was summarized into network connections with 41 features per connection [31]. This data set formed the benchmark provided by KDD. There are four main categories of attacks in the KDD data: DoS, Probe, R2L, and U2R. The KDD data set consists of three components: “10% KDD”, “Corrected KDD”, and “Whole KDD” [31] (Table 1).
As is common in the literature, the analysis in this paper is performed on the “10% KDD” data set [21]. Each connection in KDD is characterized by 41 features (listed in Table 2). As mentioned earlier, these features are grouped into four categories: basic features, content features, time-based traffic features, and host-based traffic features.
Basic features can be derived from packet headers without inspecting the payload. For the content features, domain knowledge is used to assess the payload of the original Transmission Control Protocol (TCP) packets. Time-based traffic features are designed to capture properties that mature over a two-second temporal window. Host-based traffic features use a historical window defined over a number of connections instead of over time. They are therefore designed to assess attacks that span intervals longer than 2 s.
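The difference between the two kinds of traffic window can be illustrated with a minimal sketch. It assumes that each connection is reduced to a hypothetical `(timestamp, destination_host)` pair, that the current connection is counted inside its own window, and that the connection-based window holds the last 100 connections (the size usually quoted for KDD; the text above does not state it). The function names are illustrative and are not taken from the paper.

```python
def count_in_time_window(conns, i, window=2.0):
    """Same-host connections within `window` seconds up to connection i."""
    t_i, host_i = conns[i]
    return sum(
        1 for t, host in conns[: i + 1]
        if host == host_i and t_i - t <= window
    )


def count_in_last_n(conns, i, n=100):
    """Same-host connections among the last `n` connections up to i."""
    _, host_i = conns[i]
    recent = conns[max(0, i + 1 - n): i + 1]
    return sum(1 for _, host in recent if host == host_i)


# Hypothetical slow probe: host "A" is contacted roughly every 5 s.
conns = [(0.0, "A"), (5.0, "A"), (10.0, "A"), (10.5, "B"), (11.0, "A")]

time_count = count_in_time_window(conns, 4)  # only sees the last 2 s
host_count = count_in_last_n(conns, 4)       # sees all earlier visits to "A"
```

In this toy trace the two-second window sees only two connections to host "A", while the connection-based window sees all four, which is why the latter is suited to attacks that unfold slowly.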
3 The proposed model
As mentioned before, a partially connected RNN with two hidden layers is used as misuse-based IDS in this work (Fig. 1). The categorized features defined in Sect. 2 are used as the inputs of RNN. As shown in Fig. 1, the connections between 41 input nodes and first hidden layer nodes are based on the categorization of features. The connections between the nodes of two hidden layers are considered partial. The RNN has five output neurons (representing the normal class and four attack types).
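The group-wise connectivity between the inputs and the first hidden layer can be sketched with a binary mask over the weight matrix. This is only an illustration under stated assumptions: the group sizes follow the usual KDD split (9 basic, 13 content, 9 time-based, 10 host-based features), the number of hidden nodes per group (`HIDDEN_PER_GROUP`) is hypothetical, and the recurrent and hidden-to-hidden connections of the proposed model are not reproduced.

```python
import math
import random

# Assumed KDD feature grouping; sizes sum to 41 inputs.
GROUP_SIZES = [("B-F", 9), ("C-F", 13), ("TT-F", 9), ("HT-F", 10)]
HIDDEN_PER_GROUP = 2  # hypothetical, not taken from the paper


def group_mask(group_sizes, hidden_per_group):
    """Build a 0/1 mask: each hidden node sees only its own feature group."""
    n_in = sum(size for _, size in group_sizes)
    mask = []
    start = 0
    for _, size in group_sizes:
        row = [1 if start <= j < start + size else 0 for j in range(n_in)]
        mask.extend(list(row) for _ in range(hidden_per_group))
        start += size
    return mask


def masked_layer(x, weights, bias, mask):
    """Feed-forward pass in which masked-out weights contribute nothing."""
    out = []
    for w_row, m_row, b in zip(weights, mask, bias):
        s = b + sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        out.append(math.tanh(s))
    return out


random.seed(0)
mask = group_mask(GROUP_SIZES, HIDDEN_PER_GROUP)
weights = [[random.gauss(0.0, 1.0) for _ in row] for row in mask]
bias = [0.0] * len(mask)
x = [random.random() for _ in range(41)]
h = masked_layer(x, weights, bias, mask)
```

With these assumed sizes the mask keeps 82 of the 328 weights that a fully connected layer of the same shape would need, which is the kind of size reduction the partial connectivity is meant to achieve.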
The features in the KDD data sets have different forms (discrete, continuous, and symbolic) with significantly varying resolution and ranges. Most pattern classification methods are not able to process data in such a format. Therefore, some preprocessing is required.
Symbolic-valued features, such as protocol_type (with 3 different symbols), service (with 70 different symbols), and flag (with 11 different symbols), are mapped to integer values ranging from 0 to N−1, where N is the number of symbols. Continuous features with smaller integer value ranges, namely wrong_fragment [0,3], urgent [0,14], hot [0,101], num_failed_logins [0,5], num_compromised [0,9], num_root [0,7468], num_file_creations [0,100], num_shells [0,5], num_access_files [0,9], count [0,511], srv_count [0,511], dst_host_count [0,255], and dst_host_srv_count [0,255], are scaled linearly to the [0,1] range.
For the three features that span a very large integer range, logarithmic scaling (base 10) is applied. These features are duration [0,58329], src_bytes [0,1.3 billion], and dst_bytes [0,1.3 billion], whose ranges are thereby reduced to [0,4.77] for duration and [0,9.11] for src_bytes and dst_bytes. The other features are either Boolean, like logged_in, or continuous in the range [0,1], like diff_srv_rate, and need no scaling. Finally, each of the mapped features is linearly scaled to the [0,1] range.
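The three preprocessing steps can be sketched as follows. The helper names, the ordering of the protocol_type symbols, and the use of log10(1 + x) to keep zero-valued inputs defined are assumptions made for this example; the text above specifies only the base-10 logarithm and the value ranges.

```python
import math

# Assumed symbol order for protocol_type; any fixed order gives 0..N-1.
PROTOCOLS = ["tcp", "udp", "icmp"]


def encode_symbol(value, symbols):
    """Map a symbol to its index in 0..N-1, then scale the index to [0, 1]."""
    return symbols.index(value) / (len(symbols) - 1)


def scale_linear(x, lo, hi):
    """Linearly scale x from [lo, hi] to [0, 1]."""
    return (x - lo) / (hi - lo)


def scale_log(x, x_max):
    """Base-10 log scaling followed by linear scaling to [0, 1].

    log10(1 + x) is used instead of log10(x) so that x = 0 is defined
    (an assumption of this sketch).
    """
    return math.log10(1 + x) / math.log10(1 + x_max)


protocol = encode_symbol("udp", PROTOCOLS)  # index 1 of 0..2
count = scale_linear(511, 0, 511)           # top of the count range
duration = scale_log(58329, 58329)          # top of the duration range
```

The reduced ranges quoted above can be checked directly: `math.log10(58329)` is about 4.77 and `math.log10(1.3e9)` is about 9.11.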
4 Experimental results
In this work, 49,402 records from the “10% KDD” data set and 31,104 records from the “Corrected KDD” data set are used as the training and test sets, respectively (Table 3). Except for the U2R test samples, these sets follow the same distribution of attack categories as the corresponding KDD data sets.
The two most common metrics for evaluating an IDS are DR and FAR. DR is computed as the ratio between the number of correctly detected attacks and the total number of attacks, while FAR is computed as the ratio between the number of normal connections that are misclassified as attacks and the total number of normal connections. For the purpose of classifier evaluation, another comparative measure is defined, namely Cost Per Example (CPE) [32]. CPE is calculated using the following relation:
$$\mathrm{CPE} = \frac{1}{T}\sum_{i=1}^{m}\sum_{j=1}^{m} CM(i,j)\,C(i,j)$$
where CM and C are the confusion matrix and the cost matrix, respectively, T is the total number of test instances, and m is the number of classes. CM is a square matrix in which each row corresponds to an actual class and each column to a predicted class. An off-diagonal entry CM(i,j) is the number of instances that belong to class i but are incorrectly identified as members of class j. The entries of the main diagonal, CM(i,i), are the numbers of correctly classified instances. The cost matrix is defined similarly: entry C(i,j) is the cost penalty for misclassifying an instance of class i as class j. The cost matrix values employed for the KDD classifier learning contest are shown in Table 4 [31].
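As a worked illustration of these definitions, the sketch below computes DR, FAR, and CPE from a confusion matrix and a cost matrix, with CPE taken as the cost-weighted sum of all confusion matrix entries divided by T. The three-class matrices are hypothetical toy values, not results or costs from this paper; for the KDD setting the five-class matrices of Tables 4 and 5 would be substituted. The sketch also assumes that an attack counts as detected whenever it is assigned to any attack class, which is one possible reading of "correctly detected".

```python
def ids_metrics(cm, cost, normal=0):
    """Return (DR, FAR, CPE); cm rows are actual classes, columns predicted."""
    m = len(cm)
    total = sum(sum(row) for row in cm)

    # DR: attacks not labelled as normal, over all attacks (assumed reading).
    attacks = sum(sum(cm[i]) for i in range(m) if i != normal)
    missed = sum(cm[i][normal] for i in range(m) if i != normal)
    dr = (attacks - missed) / attacks

    # FAR: normal connections labelled as any attack, over all normals.
    normals = sum(cm[normal])
    far = (normals - cm[normal][normal]) / normals

    # CPE: cost-weighted sum of confusion matrix entries divided by T.
    cpe = sum(cm[i][j] * cost[i][j] for i in range(m) for j in range(m)) / total
    return dr, far, cpe


# Hypothetical toy data: class 0 is normal, classes 1 and 2 are attacks.
toy_cm = [[90, 10, 0],
          [5, 40, 5],
          [10, 0, 40]]
toy_cost = [[0, 1, 2],
            [1, 0, 2],
            [2, 1, 0]]

dr, far, cpe = ids_metrics(toy_cm, toy_cost)
```

For the toy matrices this gives DR = 0.85, FAR = 0.10, and CPE = 45/200 = 0.225.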
The confusion matrix and training time of the proposed RNN model are reported in Table 5. The confusion matrices and training times of MLP and Elman-based neural classifiers, with the same number of nodes in hidden layers, are also reported in Table 5.
The performance of the proposed model has been compared to some other machine learning methods, in terms of DR, FAR, and CPE as well (Table 6). As shown in Table 6, the proposed RNN model performs better in terms of DR and CPE. FAR of the proposed IDS is not degraded significantly when compared to some recent machine learning methods.
5 Conclusions
In this paper, a partially connected RNN model with four groups of input features has been proposed as a misuse-based IDS. Experimental results have shown that the reduced-size neural classifier improves classification rates, especially for the R2L attack category, when compared to other classifiers. The proposed model shows better performance in terms of DR and CPE when compared to some recent related works. In addition, the FAR is not degraded significantly in comparison with some recent machine learning methods.
References
Sabhnani M, Serpen G (2004) Why machine learning algorithms fail in misuse detection on KDD intrusion detection data set. J Intelli Data Anal 6:1–13
Shon T, Moon J (2007) A hybrid machine learning approach to network anomaly detection. J Infor Sci 177:3799–3821
Chen Y, Abraham A, Yang B (2007) Hybrid flexible neural-tree-based intrusion detection systems. Int J Intell Syst 22:337–352
Ye N, Emran SM, Chen Q, Vilbert S (2002) Multivariate statistical analysis of audit trails for host-based intrusion detection. IEEE Trans Comput 51:810–820
Garcia-Teodoro P, Diaz-Verdejo J, Macia-Fernandez G, Vazquez E (2009) Anomaly-based network intrusion detection: techniques, systems and challenges. J Comput Secur 28:18–28
Kruegel C, Mutz D, Robertson W, Valeur F (2003) Bayesian event classification for intrusion detection. In: The proceedings of the annual computer security applications conference, pp 14–23
Yeung DY, Ding Y (2003) Host-based intrusion detection using dynamic and static behavioral models. J Pattern Recognit 36:229–243
Cansian AM, Moreira E, Carvalho A, Bonifacio JM (1997) Network intrusion detection using neural networks. In: The proceedings of the international conference on computational intelligence and multimedia applications, pp 276–280
Ramadas M, Ostermann S, Tjaden B (2003) Detecting anomalous network traffic with self-organizing maps. Recent advances in intrusion detection, RAID, Lecture notes in computer science (LNCS) 2820:36–54
Dickerson JE (2000) Fuzzy network profiling for intrusion detection. In: The proceedings of the North American fuzzy information processing society (NAFIPS) international conference, pp 301–306
Gomez J, Dasgupta D (2002) Evolving fuzzy classifiers for intrusion detection. In: The proceedings of the IEEE workshop on information assurance, pp 68–75
Song D, Heywood MI, Zincir-Heywood AN (2005) Training genetic programming on half a million patterns: an example from anomaly detection. IEEE Trans Evol Comput 9:225–239
Sequeira K, Zaki M (2002) ADMIT: anomaly-based data mining for intrusions. In: The proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, pp 386–395
Biermann E, Cloete E, Venter LM (2001) A comparison of intrusion detection systems. J Comput Secur 20:676–683
Han SJ, Cho SB (2003) Detecting intrusion with rule-based integration of multiple models. J Comput Secur 22:613–623
Novikov D, Yampolskiy RV, Reznik L (2006) Artificial intelligence approaches for intrusion detection. In: The proceedings of the IEEE conference on systems, applications and technology, pp 1–8
Joshi MV, Agrawal RC, Kumar V (2001) Mining needles in a haystack: classifying rare classes via two-phase rule induction. In: The proceedings of the ACM SIGMOD conference on management of data, pp 91–102
Debar H, Dorizzi B (1992) An application of recurrent network to an intrusion detection system. In: The proceedings of the international joint conference on neural networks, pp 478–483
Kayacik G, Zincir-Heywood N, Heywood M (2003) On the capability of an SOM-based intrusion detection system. In: The proceedings of the international joint conference on neural networks, pp 1808–1813
Golovko V, Vaitsekhovich L, Kochurko P, Rubanau U (2007) Dimensionality reduction and attack recognition using neural network approaches. In: The proceedings of the international joint conference on neural networks, pp 2734–2739
Beghdad R (2008) Critical study of neural networks in detecting intrusions. J Comput Secur 27:168–175
Sheikhan M, Sha’bani AA (2009) Fast neural intrusion detection system based on hidden weight optimization algorithm and feature selection. World Appl Sci J 7(Special Issue of Computer & IT):45–53
Lin Y, Chen K, Liao X (2004) A genetic clustering method for intrusion detection. J Pattern Recognit 37:924–927
Denning DE (1987) An intrusion-detection model. IEEE Trans Softw Eng 13:222–232
Pfahringer B (2000) Winning the KDD 99 classification cup: bagged boosting. J SIGKDD Explor 1:65–66
Levin I (2000) KDD classifier learning contest: LLSoft’s results overview. J SIGKDD Explor 1:67–75
Mukkamala S, Janoski G, Sung AH (2002) Intrusion detection using neural networks and support vector machines. In: The proceedings of the international joint conference on neural networks, pp 1702–1707
Abadeh MS, Habibi J, Lucas C (2005) Intrusion detection using a fuzzy genetic–based learning algorithm. J Netw Comput Appl 30:414–428
Tajbakhsh A, Rahmati M, Mirzaei A (2009) Intrusion detection using fuzzy association rules. J Appl Soft Comput 9:462–469
Sheikhan M, Jadidi Z (2009) Misuse detection using hybrid of association rule mining and connectionist modelling. World Appl Sci J 7(Special Issue of Computer & IT):31–37
KDD Cup 1999 Data. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html. Accessed July 2008
Agrawal R, Joshi MV (2000) PNrule: a new framework for learning classifier models in data mining (a case-study in network intrusion detection). IBM research division, report no. RC-21719
Beghdad R (2007) Training all the KDD data set to classify and detect attacks. Neural Netw World 17:81–91
Cite this article
Sheikhan, M., Jadidi, Z. & Farrokhi, A. Intrusion detection using reduced-size RNN based on feature grouping. Neural Comput & Applic 21, 1185–1190 (2012). https://doi.org/10.1007/s00521-010-0487-0
DOI: https://doi.org/10.1007/s00521-010-0487-0