Keywords

1 Introduction

Network plays an increasingly important role in people’s daily life with the rapid development of network technology. At the same time, the type of network attacks and the number of vulnerabilities are increasing rapidly. Many network and information systems are facing security threats. As more and more privacy are faced with the risk of being leaked, the responsibility of network security is more and more important accordingly. It is needed to evaluate the network security to measure the situation of the network. Evaluation of network security can help security professionals to optimize security configurations. Thus, an evaluation system is needed to solve all of the above problems.

In this paper, we construct a network security evaluation model for network based on attack graph. Firstly, the model is based on attack graph [1,2,3]. Attack graph can generate attack paths to analyze the network vulnerability. It shows users the weak point in the network analysis process for network security risk analysis. Secondly, in this model, the value of security risk is calculated by a function which is based on two parameters that are importance of nodes and the maximum reachable probability of nodes. The nodes in the attack graph have different effects on the security of the network. The higher the importance of node, the greater the impact on the network security of the node. Thus, we quantify the importance of nodes according to the major factors. The maximum reachable probability of nodes is an important factor of network security as well. Then, we construct a security evaluation function which is based on the above two parameters and calculate the security risk value of the network. Finally, security professionals could distinguish the level of security according to the risk value and formulate their countermeasures. The parameters required by our approach are convenient to collect. The computational complexity of our model is relatively low. Thus, our method can be generalized to any network for measuring network security.

The organization of the paper is as follows. We discuss related works in Sect. 2. Then, we describe our model of evaluating the network security in Sect. 3. At Sect. 4, our model is tested based on a simple attack graph. We then present the conclusions in Sect. 5 and acknowledgement respectively.

2 Related Works

2.1 Attack Graph Generation

Various kinds of approaches have been proposed to generate attack graph automatically. Early approaches use network states, which result in the graphs growing exponentially.

Sheyner et al. uses model checking techniques to compute attack graphs [2]. Phillips and Swiler [3] developed a tool for generating attack graphs. Ritchey and Ammann [4] use model checking for vulnerability analysis of networks. X. Ou et al. [5] tried to generate logical attack graph and developed a tool named MulVAL. Now, it is an open source project in Kansas State University. Sheyner et al. [6] use a modified model checker, NuSMV, to produce attack graphs. Although their model could generate all the attack paths, the scalability problem is more serious. Then, P. Ammann et al. [7] introduce the monotonicity assumption into generation process, and reduce the computational cost to polynomial. Jajodia et al. [8] develop a tool named TVA (the Topological Vulnerability Analysis tool). It can analyze network vulnerability automatically and mine the weakness to generate the attack graph.

2.2 Network Security Analysis

Some researchers are also trying to evaluate network security quantitatively based on attack graphs [9,10,11,12]. Many mathematical algorithms have been applied in the field of network security evaluation.

J. Pamula et al. [9] describe a method to measure network security. Their method expresses the targets as the minimal sets of required initial attributes, and the security metric is the strength of the weakest adversary who can successfully penetrate the network. Wang et al. [10] make a further analysis on network metric with attack graphs, and they propose a simple security metric framework, which mainly describes the basic principles and the basic requirements of operators. Then, Wang et al. [11] give a metric example with probability of success, discussing the processing methods on cycles in attack graphs. But unfortunately, their method is suitable for single target, and is hard to describe a network’s security as a whole. M. Frigault et al. [12] interpret attack graphs as special Dynamic Bayesian networks, and their outstanding contribution is considering the effect between the vulnerabilities in a dynamic environment.

The existing methods of network security analysis provide ideas for our work in combining attack graph and mathematical methods. Attack graphs provide the necessary context for correlating and prioritizing intrusion alerts, based on known paths of vulnerability through the network. The method of mathematics can make the evaluation of security risk more accurate.

3 Model Description

3.1 Description of Attack Graph

Our work is not focus on how to construct an attack graph automatically since there are so many articles devoted to this issue. We just analyze network security situation on the basis of the assumption that we have already gotten an attack graph. Here we generate our attack graph by MulVAL [4]. In this paper, the attack graph that we discuss refers to the acyclic attack graph. The definition of attack graph [1,2,3] is as follows.

Definition 1.

The structure of attack graph is a directed graph. It can be defined as \( G = \left( {V_{o} \cup V_{d} ,T,E} \right) \), where the set \( V_{o} \cup V_{d} \) of nodes represent vulnerable system and network configurations, the set T of nodes represent target nodes, the element \( v_{ij} \in E \) in set E is a transition relation from \( v_{i} \) to \( v_{j} \).

Furthermore, attack graph should meet the following conditions: an exploit cannot be realized until all of its previous conditions have been satisfied. A reachable condition can be satisfied if any of its previous exploits are realized.

3.2 The Node’s Importance

The nodes in the attack graph have different effects on the security of the network. Standing on the shoulders of the meaningful results brought by previous works, we access the factors that may impact the node’s importance from the view of attacker and propose a metric named \( TNI \) to measure the importance of the node’s importance level.

Mehta et al. proposed using Google PageRank algorithm to assess importance of nodes in the attack graph [13], which considers mainly the topology and link relations. The PageRank algorithm is a link analysis algorithm and it assigns a numerical weighting to each element of a hyperlinked set of documents. Since the more steps the attack sequence is, the harder to attack success. Attackers prefer to choose the shortest attack path in an attack. Considering the shortest paths, we use betweenness centrality [16] to evaluate the node’s importance level. Betweenness centrality is a measure of centrality in a graph based on shortest paths.

The mathematical descriptions about \( TNI \) are as follows. Firstly, we need to calculate the node’s PageRank value and betweenness centrality respectively. Then, we normalize the above two value in (0, 1) and get the average value of the above two parameters. The \( TNI \) of node \( v_{i} \) is denoted as \( TNI\left( {v_{i} } \right) \):

$$ TNI\left( {v_{i} } \right) = \frac{{PR\left( {v_{i} } \right) + BC\left( {v_{i} } \right)}}{2} $$
(1)

In this paragraph, we will discuss how to calculate PageRank value in detail. We use iteration algorithm to calculate PageRank value. The PageRank value of node \( v_{i} \) at time \( t \) is denoted as \( PR\left( {v_{i} ,t} \right) \):

$$ PR\left( {v_{i} ,t + 1} \right) = \frac{1 - d}{N} + d\sum\limits_{{}} {\frac{{PR\left( {v_{j} ,t} \right)}}{{L\left( {v_{i} } \right)}}} $$
(2)

In (2), \( d \) is a constant; \( M\left( {v_{i} } \right) \) is the set of node that links to node \( v_{i} \); \( L\left( {v_{i} } \right) \) is the number that node links to other nodes. Firstly \( t = 0 \), initialize the PageRank value of each node as \( PR\left( {v_{i} ,0} \right) = \frac{1}{N} \). Then, iterative as (2) until (3) is satisfied. The values of the last iteration are the node’s PageRank value.

$$ \left| {PR\left( {v_{i} ,t + 1} \right) - PR\left( {v_{i} ,t} \right)} \right| \le \varepsilon $$
(3)

In (3), \( \varepsilon \) is a constant that can be adjusted in different situations. The smaller \( \varepsilon \) is, the harder formula convergences.

For every pair of nodes in a graph, there exists a shortest path between the nodes such that the number of edges that the path passes through is minimized. The betweenness centrality for each node is the number of shortest paths that pass through the node. The betweenness centrality of node is denoted as \( BC\left( {v_{i} } \right) \):

$$ BC\left( {v_{i} } \right) = \sum\limits_{s \ne i \ne t} {\frac{{\sigma_{st} \left( {v_{i} } \right)}}{{\sigma_{st} }}} $$
(4)

In (4), \( \sigma_{st} \left( {v_{i} } \right) \) is the number of shortest paths that pass through node \( v_{i} \). \( \sigma_{st} \left( {v_{i} } \right) \) is the total number of shortest paths from node to \( s \) node \( t \). We use the Dijkstra algorithm to deal with the single-source shortest path and calculate the number \( \sigma_{st} \left( {v_{i} } \right) \) and \( \sigma_{st} \).

3.3 The Maximum Reachable Probability of Nodes

In the attack graph, there may be multiple paths from the initial node to the target node. For different attack sequences, attack difficulty is also different. Attacker prefer to choose the easiest path to attack, so the security of the network depends on the safety of its weakest part. We define the probability when attackers choose the easiest path to attack a node as the maximum reachable probability.

If node \( v \in V_{d} \), the maximum reachable probability of node \( v \) can be calculated by its parent node. If node \( c \) is the one of the parent node of node \( v \) and \( P\left( c \right) \) is the maximum reachable probability of node \( v \), then we can get the reachable probability of node from node \( c \) is \( P\left( v \right) = d\left( v \right) * P\left( c \right) \). Here \( d\left( v \right) \) is the access probability of node \( v \). We get the data of access probability from the CVSS based database. For node \( v \), the maximum reachable probability is \( P\left( v \right) = d\left( v \right) * Max\left. {\left\{ {P\left( c \right)\left| {c \in Pre\left( v \right)} \right.} \right.} \right\} \).

If node \( v \in T \), all conditions of it’s parent node must be qualified according to the definition of attack graph. We can deduct the formula contrasting to the initial nodes. The maximum reachable probability is that \( P\left( v \right) = d\left( v \right) * \prod\limits_{c \in Pre\left( v \right)} {P\left( c \right)} \).

For initial nodes and target nodes, we define their maximum reachable probability as follows. The initial nodes represent the initial conditions that an attacker can exploit. The target nodes are the target of attackers attacking the network. It is necessary to calculate the probability of initial nodes for calculating the maximum reachable probability of target nodes.

Definition 2.

In an attack graph \( G = \left( {V_{o} \cup V_{d} ,T,E} \right) \), \( d\left( {v_{i} } \right) \) is the access probability of its own, \( Pre\left( {v_{i} } \right) \) are the parent nodes of node \( v_{i} \). The maximum reachable probability of node \( v_{i} \) is defined as \( P\left( {v_{i} } \right) \): if node \( v_{i} \in V_{d} \), the maximum reachable probability of node \( v_{i} \) is \( P\left( {v_{i} } \right) = d\left( {v_{i} } \right) * Max\left. {\left\{ {P\left( {c_{{}} } \right)\left| {c \in Pre\left( {v_{i} } \right)} \right.} \right.} \right\} \); if node, the maximum reachable probability of node \( v_{i} \) is \( P\left( {v_{i} } \right) = d\left( {v_{i} } \right) * \prod\limits_{{c \in Pre\left( {v_{i} } \right)}} {P\left( c \right)} \).

We can calculate the maximum reachable probability of all nodes according to the Definition 2. To describe the attacker’s multistep attack process, we use the breadth first search algorithm. In order to simulate the attacker’s choice of multiple paths and the limit of longest attack path, we define the longest attack length L and search the best path step by step in our algorithm. The length L represent an attacker’s ability to attack the network and should be revised according to the actual situation. Meanwhile, we define a data structure to record the trace of the attack sequence to avoid the loop path. When a loop path appears, we should find the equivalent path of the loop path instead of simply cancel the loop path. When there are multiple paths, our algorithm will choose the one with the highest success rate according to Definition 2.

The probability of the node own can be queried from the standard CVE (Common Vulnerabilities and Exposures) name [14]. We adapt different methods according to the Definition 2 to get the node’s maximum reachable probability after its parent nodes are calculated. Finally we get the probability of each nodes as all the procedures finished.

3.4 Construct Evaluation Model

To sum up the opinions above, two parameters decide the network security situation: the node’s importance and the maximum reachable probability of nodes in the attack graph. Let \( V\left( {v_{i} } \right) \) be a function to calculate the security risk score of node \( v_{i} \), and then we can see that:

When \( TNI\left( {v_{i} } \right) \) do not change, the higher \( P\left( {v_{i} } \right) \) is, the higher \( V\left( {v_{i} } \right) \) becomes. \( V\left( {v_{i} } \right) \) is proportional to \( P\left( {v_{i} } \right) \).

When \( P\left( {v_{i} } \right) \) do not change, the \( TNI\left( {v_{i} } \right) \) higher is, the higher \( V\left( {v_{i} } \right) \) becomes. \( V\left( {v_{i} } \right) \) is proportional to \( TNI\left( {v_{i} } \right) \).

When \( P\left( {v_{i} } \right) \) and \( TNI\left( {v_{i} } \right) \) do not change, the higher \( \lambda \) is, the more obvious the change of \( V\left( {v_{i} } \right) \) is.

Based on the above judgment and the monotone character of the function, we could construct a security risk function:

$$ V\left( {v_{i} } \right) = \lambda * TNI\left( {v_{i} } \right) * P\left( {v_{i} } \right) $$
(5)

We can also assess the security situation of the whole network. Let V represents a function that calculates security risk score of the whole network. We just add up the risk score to get the risk score of the whole network. Then, we normalize the score in (0, 1) for comparison and evaluation. According to expressions (2), (4) and (5), we conclude that the mathematical model describing network security risk is as follows:

$$ V = log_{N} \left( {\sum\limits_{i \in [1,N]} {e^{{\lambda * TNI\left( {v_{i} } \right) * P\left( {v_{i} } \right) - 1}} } } \right) $$
(6)

The security risk score distinguishes the levels of network security. Using our method, managers could check the network regularly and collect the security risk information. According to the security risk value, administrators are aware of the situation of the whole network, and then take up the corresponding measures, to ensure the safety of the network relatively.

The safe intervals of different network are also different. What is more, calculating different network security scores need different scoring criteria. So, our approach is not suitable for comparing security between different networks directly. But, our method is very suitable for security monitoring and evaluation for the same network. Using our method to deal with the security of the same network, changes of the security risk value represent changes in the level of network security. Managers can judge whether the network is safe after many evaluations.

4 Experiments

In this section, we study a relatively realistic network to validate the rationality and feasibility of the algorithm we propose. The experimental environment are Intel Pentium E5400 (2.70 GHz), 2 GB of memory, Window XP. The algorithm is implemented in Eclipse 3.2.

In the experimental network shown in Fig. 1, there are two hosts in this mini-network, which are a web server and an Apache server. The link-layer connectivity between the two hosts is provided by a switch. In addition, an attacker’s client, is connected to the switch directly. Two firewalls connecting to the switch are used to protect the network respectively. The vulnerabilities which would be exploited by attackers are shown in Table 1. For each vulnerability, the CVE name, CVSS score, vulnerability location, and exploiting probability of success are also described as follows. The CVSS score represent the risk level of single vulnerability. The access complexity represent the success rate of attackers to penetration the corresponding vulnerability.

Fig. 1.
figure 1

Experimental network topology

Table 1. Vulnerability information

The information in Table 1 are obtained from the National Vulnerability Database published by National Institute of Science and Technology (NIST). The vulnerabilities are stored using the standard Common Vulnerabilities and Exposures (CVE) name [15]. For each vulnerability in the database, NVD provides CVSS [16] scores in the range 1 to 10. We use specific numerical values to characterize the access complexity of vulnerabilities. In detail, we use 0.37 to represent the High level of access complexity, 0.61 to represent the Medium level and 0.71 to represent the Low level and undefined level.

Figure 2 shows the attack graph we generated using MulVal. Since the semantics of each node in the graph is too long, we replace them with different letter number. In this attack graph, there are 12 intermediate node and 6 target nodes. We use the algorithm we proposed in this paper to calculate the two parameters and then get the security risk score. The result of TNI and the maximum reachable probability are shown in Table 2.

Fig. 2.
figure 2

Attack graph of the network

Table 2. Node information

The security value of this network is 0.62, which is relatively high. According to the maximum reachable probability of different nodes, administrator know which nodes are easy to be leaked, and then take up corresponding measures, to ensure the safety of the network. When some vulnerability are fixed, the security value will decline sharply. Our methods can be used to check the network regularly and collect the security risk information. By comparing the historical security risk information and the risk value, network administrators are able to understand the security situation of the network.

5 Conclusion

We propose a quantitative method for evaluating network security based on attack graph. We analyze the host information, topology information and vulnerability information of the network, get all possible attack paths and generate the attack graph. In this paper, we construct a network security evaluation model for network based on attack graph. Then, we define and calculate the importance of nodes and the maximum reachable probability of nodes in an attack graph. Finally, we construct the model based on the above two parameters. The approach provides a method to analyze attack paths, compute the security risk value of the network and help security professionals to choose appropriate countermeasures based on conditional decision preferences of relevant factors. Our future work is to optimize the algorithms of attack probability assessment and to test our method on large scale networks.