
1 Introduction

Network security refers to the protection of network resources and data from all kinds of malicious activity. It enforces security policies, analyzes threats, and stops them from entering the network. Network security consists of many layers, such as firewalls and network intrusion detection systems. A firewall blocks access between networks, but it neither inspects the traffic nor alerts the administrator. An intrusion detection system studies traffic patterns and compares them against known attack patterns. It can inspect packet contents deeply and thereby protect against network threats.

Software-based network intrusion detection systems are common and easy to deploy. Snort is an example of an open-source, lightweight intrusion detection system [1] that uses signatures to compare traffic against thousands of attack patterns. At gigabit speeds, however, it becomes difficult for software alone to keep up. Hardware solutions address this by converting the rules into a hardware description language. A field programmable gate array (FPGA) is one such device, designed to be configured with a hardware description language. It consists of many high-speed logic blocks capable of parallel processing, yielding a significant performance gain. The general functional model consists of three sections: buffering, packet analysis and rule matching. The gigabit network is connected to the hardware through Ethernet interfaces. Packets are queued in the buffer section to balance between the hardware and the network. They are then forwarded to the packet analyzer, which extracts information from each packet. This information is compared against a set of rules in the rule matcher, and the alerting and logging mechanisms act according to the rule matcher's output.

In this paper, a gigabit network intrusion detection system based on Bloom filters is designed to identify attack patterns in the network. The paper designs an efficient Bloom filter algorithm for the string matching engine (SME). A Bloom filter tests the membership of an element in a set of elements. In this approach, each element of the set is hashed with multiple hash functions and the results are stored in memory. This stored information can then be used to check whether a given element belongs to the set or not. The major advantage of the Bloom filter is that the amount of memory needed to store the hashed values is constant, irrespective of the length of the input element. The amount of computation needed for the hash functions and memory lookups is also constant, which makes the process fast. The designs are based on the concepts of the spectral Bloom filter [2], which is an extended version of the naive Bloom filter [3] and an optimized version of the counting Bloom filter [4]. These variants solve problems such as multi-set queries and real-time insertion and deletion, which are not possible with the original Bloom filter. In the spectral Bloom filter [2], the bit vector is replaced by an array of counters that are incremented or decremented on corresponding insertions and deletions.
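As a point of reference, the following is a minimal software sketch of a naive Bloom filter showing only the insert and query operations described above; the vector size and the SHA-256-derived indices are illustrative stand-ins, not the H3-based hardware design proposed later in the paper.

```python
# Minimal Bloom filter sketch (illustrative parameters, not the hardware design).
import hashlib

class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k        # vector size and number of hash functions
        self.bits = [0] * m

    def _indices(self, item: bytes):
        # Derive k indices from one digest; a stand-in for k independent hash functions.
        digest = hashlib.sha256(item).digest()
        return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m
                for i in range(self.k)]

    def insert(self, item: bytes):
        for idx in self._indices(item):
            self.bits[idx] = 1

    def query(self, item: bytes) -> bool:
        # True means "possibly a member"; False means "definitely not a member".
        return all(self.bits[idx] for idx in self._indices(item))

bf = BloomFilter(m=1024, k=4)
bf.insert(b"/etc/passwd")
print(bf.query(b"/etc/passwd"))     # True
print(bf.query(b"harmless text"))   # almost certainly False
```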

The rest of the paper is organized as follows: Sect. 2 reviews the background and related work; Sect. 3 describes the design and its implementation; Sect. 4 presents the results and discussion; and Sect. 5 concludes the paper.

2 Related Works

High-speed intrusion detection is an area of great opportunity, especially for hardware-based NIDS. Increasing data rates have created the need for dedicated hardware components, and their improvement is necessary for a near-perfect product. Roesch [1] developed a tool in 1999 called Snort that can rapidly find possible holes in network security. Sidhu and Prasanna [5] focused on methods for converting regular expressions to hardware circuitry quickly. They skipped the conversion to a deterministic finite automaton (DFA) and compiled regular expressions directly to a nondeterministic finite automaton (NFA). This implementation was extended in the work of Hutchings and Franklin [6] in 2002, which proposed a high-speed network intrusion detection system. Their system extracted regular expressions from Snort with the help of Java code, which was then processed by the Xilinx software to create the FPGA input. They proposed an automated compiler that could perform this conversion of regular expressions automatically.

The Bloom filter, introduced by Bloom [3], was mainly used for string checking and database applications. Broder and Mitzenmacher [7] surveyed the various applications of Bloom filters in networking, their modern variants, and the mathematics behind them. The main advantage of the Bloom filter is that it takes only a constant amount of memory to hash each element, irrespective of its length, and the computation involved in detecting an element is also constant. Dharmapurikar and Krishnamurthy [8], in 2003, proposed a Bloom-filter-based technique to verify the membership of queries. The focus was to implement fast string matching in a hardware-based intrusion detection system. The design consists of one Bloom filter for each string length, ranging from a minimum value up to the window size. In 2004 [9] they analyzed its performance against finite automata solutions. The universal hash function known as H3 was found to be suitable for implementing hash tables in hardware in the study conducted by Ramakrishna et al. [10].

Song and Lockwood [11], in 2005, suggested a long string matching method to reduce the supported signature length. A three-bit extended Bloom filter was chosen because of its scalability and fast incremental updating ability. Fan et al. [4] proposed an extension to the Bloom filter that can insert into and delete from the Bloom vector in real time. Cohen and Matias [2] optimized this work for multi-set queries and introduced two new algorithms. Pontarelli and Bianchi [12] proposed a system in which, instead of simply distributing packets across the modules, similar traffic packets are grouped and dispatched to blocks of different capabilities. The design mainly uses the header information to classify traffic into different categories, and each hardware module processes a disjoint rule set. They followed the shift-and-compare architecture presented by Baker and Prasanna [13].

3 Design and Implementations

Let a string be processed by multiple hash functions, each of which produces a value smaller than the size of the vector to which the hash functions map the string. The bits at those positions of the vector are set. The query procedure is the same as the programming procedure, except that the string is checked for membership: the bits at the positions given by the hash values of the queried string are examined, and if at least one of them is not set, the string is declared a non-member. The false positive probability f is given by [4]

$$ f = \left( {1 - \left( {1 - \frac{1}{m}} \right)^{nk} } \right)^{k} \approx \left( {1 - e^{{ - \frac{nk}{m}}} } \right)^{k} $$

where m is the size of the vector, n the number of members and k the number of hash functions. For given values of m and n, the number of hash functions that minimizes the false positive probability is

$$ k = \left( {\frac{m}{n}\ln 2} \right). $$
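For illustration only (the numbers below are examples, not values taken from the experiments), the two expressions can be evaluated as follows:

```python
# Evaluate the false positive probability and the optimal k for sample m and n.
from math import exp, log

def false_positive(m: int, n: int, k: int) -> float:
    return (1 - exp(-n * k / m)) ** k

m, n = 15_000, 1_000                       # m/n = 15
k_opt = round((m / n) * log(2))            # k = (m/n) ln 2, about 10
print(k_opt, false_positive(m, n, k_opt))  # ~10, f on the order of 7e-4
```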

The extended Bloom filter replaces the bit vector with an array of counters. The counter corresponding to a hash value is increased when a string is inserted and decreased when it is deleted. When an item is queried or deleted, only the minimum counter value is checked, which must be at least one if the item exists in the Bloom filter. When an item is inserted, only the minimum counter is increased, which is equivalent to increasing all the counters. The hash function implemented in this paper is the universal hash function H3. This class of functions is found to be suitable for hardware implementation [14] since it fits within limited memory resources. For any bit string X with bits represented as

$$ X = {<}x_{1} ,x_{2} ,x_{3} , \ldots ,x_{b}{>} $$

the ith hash function over X is calculated as,

$$ h_{i} = d_{i1} \cdot x_{1} \oplus d_{i2} \cdot x_{2} \oplus d_{i3} \cdot x_{3} \oplus \cdots \oplus d_{ib} \cdot x_{b} $$

where ‘·’ is a bitwise AND operator and ‘\( \oplus \)’ is a bitwise XOR operator, and \( d_{ij} \) is a predetermined random number in the range [0 … m − 1].
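A minimal software sketch of one H3-class hash function is given below; the seed values d_j, the bit widths and the table size are illustrative, and a hardware realization would evaluate the AND/XOR tree in parallel rather than in a loop.

```python
# Sketch of a single H3 hash: h(x) = d_1·x_1 XOR d_2·x_2 XOR ... XOR d_b·x_b,
# where x_j are the input bits and d_j are predetermined random values in [0, m-1].
import random

def make_h3(b: int, m: int, seed: int = 1):
    rng = random.Random(seed)
    d = [rng.randrange(m) for _ in range(b)]  # one random d_j per input bit
    # With m a power of two, the XOR of values below m also stays in [0, m-1].

    def h(x: int) -> int:
        result = 0
        for j in range(b):
            if (x >> j) & 1:                  # the AND with bit x_j selects d_j
                result ^= d[j]                # the XOR accumulates the selected values
        return result

    return h

h1 = make_h3(b=32, m=1024)                    # hashes 32-bit inputs into [0, 1023]
print(h1(0xDEADBEEF))
```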

A second Bloom vector of the same size, known as the reference vector, is introduced in this paper. The hash values produced by the multiple hash functions of the first vector are hashed together into a single value, and the corresponding bit is set in the reference vector. The reference vector is consulted only when there is a chance of a false positive; it is ignored when the first vector finds the item to be a non-member. This further reduces the false positive rate. To keep lookups fast, the combining hash function is a simple XOR of the hash values of the first vector. Figure 1 shows the implementation of the reference vector in the Bloom filter, and the combining hash function is given as

$$ h_{ref} (x) = h_{1} (x) \oplus h_{2} (x) \oplus \cdots \oplus h_{k} (x) $$

where x is the string and k is the number of hash functions used by the first vector.
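The following sketch shows the intended interplay between the first Bloom vector and the reference vector; the vector size and function names are illustrative, and the k hash values are assumed to come from H3 functions such as the one sketched above.

```python
# Sketch of programming and querying with a reference vector of the same size M.
from functools import reduce
from operator import xor

M = 1024
bloom_vector = [0] * M
reference_vector = [0] * M

def program(hash_values):
    for hv in hash_values:
        bloom_vector[hv % M] = 1
    reference_vector[reduce(xor, hash_values) % M] = 1   # h_ref = h1 ^ h2 ^ ... ^ hk

def query(hash_values) -> bool:
    if not all(bloom_vector[hv % M] for hv in hash_values):
        return False          # definite non-member: the reference vector is ignored
    # Possible false positive: consult the reference vector to confirm the result.
    return bool(reference_vector[reduce(xor, hash_values) % M])
```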

Fig. 1 Reference vector

3.1 Architectural Model

The architectural model of the network intrusion detection system based on the reference Bloom vector is shown in Fig. 2. It consists of an Ethernet interface to which the network is connected. The queue manager records the packets and puts them in order, providing synchronization between the hardware and the network. The packet classifier consists of a header classifier and a payload extractor. The header classifier extracts the header through the protocol layers and passes control to the payload extractor. The information from the payload extractor is given to the dispatcher unit, which decides how the packets are distributed. It forwards each payload to one of the multiple Bloom engines such that the load is uniformly distributed. A Bloom engine consists of many Bloom filters, each with a different input window size. At each clock cycle, one byte is shifted into the window. Each Bloom filter compares the hash values of its window with the values stored in the hash table. The number of parallel Bloom filters depends on the maximum length of the strings to be compared, so the window sizes range from the minimum to the maximum signature length.
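The scanning scheme of one Bloom engine can be summarized by the following software sketch; `filters` maps each supported window length to a membership tester such as the Bloom filter sketched earlier, and a hardware engine would check all windows in parallel rather than in a loop.

```python
# Sketch of one Bloom engine: at every byte shift, windows of all supported
# lengths are checked against length-specific Bloom filters.
def scan_payload(payload: bytes, filters: dict):
    candidates = []
    for offset in range(len(payload)):           # one byte shifted per clock cycle
        for length, bf in filters.items():       # one Bloom filter per window size
            window = payload[offset:offset + length]
            if len(window) == length and bf.query(window):
                candidates.append((offset, length))   # may still be a false positive
    return candidates
```

Any candidate reported here is then handed to the false positive verifier described below.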

Fig. 2 Architectural model

Multiple such Bloom engines are used to increase the throughput of the system. The Bloom engines are connected to hash tables stored in memory. The hash tables hold the stored hashes of the attack patterns, which are compared with the hashed inputs in the Bloom filters. If an item is found to be a member, its hash values are forwarded to the false positive verifier. The reference vector is similar to a Bloom vector, except that its hash function takes the Bloom filter hash values as input. The reference vector is already programmed with the values corresponding to each string. If the comparison with the Bloom vector gives a positive result, which carries a certain error probability, the reference vector confirms the result and thereby reduces the error rate. The alerting and logging systems are activated in case the output of the reference vector is positive. The alerting mechanism notifies the administrator of the event, and the logging system saves the collected information to a file.

4 Results and Discussions

An FPGA is the most feasible solution for wire-speed implementations. The design mainly focused on minimizing resource utilization in hardware. Different hardware-implementable string matching algorithms were selected and analyzed based on their complexity and features. Table 1 gives a brief comparison of the different hardware string matching algorithms. The Bloom filter was found to be the most suitable algorithm among them: it remembers only the flipped bits in the bit array and does not store any hashed keys. The selection of the hash function therefore directly impacts the performance of the hardware. Five different non-cryptographic hash functions were chosen for investigation: the H3 universal hash function, CRC-32, Pearson hashing, bit-extraction hashing and XOR hashing were subjected to a uniformity test and an avalanche test. The uniformity test analyzes the distribution of the hash values, while the avalanche test checks whether a small change in the input produces a large change in the output. Considering the results of these tests, the H3 class of universal hash functions was found to be better suited for hardware implementation than the rest.
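A simple software version of the avalanche test is sketched below; the input value and bit widths are illustrative, and it accepts any integer-valued hash function, such as the H3 sketch from Sect. 3.

```python
# Flip each input bit once and count how many output bits change on average;
# a well-mixing hash changes roughly half of the output bits.
def avalanche_score(hash_fn, x: int, in_bits: int, out_bits: int) -> float:
    base = hash_fn(x)
    changed = 0
    for j in range(in_bits):
        changed += bin(base ^ hash_fn(x ^ (1 << j))).count("1")
    return changed / (in_bits * out_bits)   # ideal value is close to 0.5

# make_h3 is the H3 sketch from Sect. 3; the input 0x12345678 is arbitrary.
print(avalanche_score(make_h3(b=32, m=1024), 0x12345678, in_bits=32, out_bits=10))
```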

Table 1 Study of different string matching algorithms

The design is implemented on a Xilinx Virtex II Pro with 4,176 Kbit of block RAM [15]. The hash tables are stored in SRAM with a total capacity of 4.5 Mbytes. With multiple engines, 20 distinct string lengths can be scanned at a time. While the throughput is kept at the same level as in comparable works, the error rate is reduced to a small value. The system is designed such that the reference vector is queried only after a positive outcome in the Bloom filter. The value \( \frac{m}{n} \) gives the ratio between the vector size and the number of elements, and the proper selection of m, n and k can reduce the error probability.

The relationship between them clearly shows that the size of the bit vector (m) should be greater than the number of elements (n), and that larger k values reduce the error. Figure 3 compares the false positive probability of the original and the proposed system for different values of k with \( \frac{m}{n} = 15 \). The graph shows a large reduction in the error probability, making the system highly efficient.
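As a rough, back-of-the-envelope illustration only (an idealized independence model, not the exact analysis behind Fig. 3), the effect of the reference vector can be approximated by multiplying f by the fill fraction of the reference vector, since a false positive must additionally hit a set bit there:

```python
# Approximate error rates with and without the reference vector for m/n = 15.
from math import exp

def f_bloom(r: float, k: int) -> float:           # r = m/n
    return (1 - exp(-k / r)) ** k

def f_with_reference(r: float, k: int) -> float:
    fill = 1 - exp(-1 / r)                         # fraction of set bits in the reference vector
    return f_bloom(r, k) * fill

r = 15
for k in (2, 4, 8, 10):
    print(k, f_bloom(r, k), f_with_reference(r, k))
```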

Fig. 3 Comparison of the error probability for various k values with \( \frac{m}{n} = 15 \)

5 Conclusion

The paper proposed a network intrusion detection system on a hardware platform to meet current network requirements. The system adopts the Bloom filter algorithm along with the H3 hash function for the generation of the bit vector. The paper presents an architecture for fast string matching with the help of multiple Bloom engines. Furthermore, the design includes a reference vector that reduces the error rate of the system. An analysis of the trade-off between the number of hash functions and the false positive probability has been presented. The FPGA-based implementation was carried out on a Xilinx Virtex II Pro board to support gigabit speeds.