1 Introduction

Researchers and industry practitioners use threat models [1], vulnerability assessments [2, 3], and asset-oriented risk assessments [4] both in IT and OT enterprise networks to model adversary behavior and predict malicious activities [5]. These tools are often used together or in isolation. Efficient threat models reflect multiple, complex attacks and produce meaningful results to be used by security officers to secure enterprise networks and mitigate existing cybersecurity risks.

A major issue when creating threat models is the inherent complexity of interactions between company services and systems. Modern companies utilize complex system interconnections that include cloud systems, microservices and virtual services (e.g., Kubernetes/Docker in cloud environments [6]) to support everyday business. Therefore, it is important to utilize efficient tools and techniques able to automatically model and analyze such enterprise networks, risks and configurations and highlight potential risks and underlying vulnerabilities.

Modern research has mostly focused on automatic tools that model networks and detect vulnerabilities. Proposed software automatically ties potential malicious actions to detected vulnerabilities and connected systems to form attack paths. Such models put IT staff in the adversary’s position and have proven crucial for security officers during risk mitigation [5, 7].

Still, threat models and attack graphs find it difficult to interpret and adequately analyze attacks on modern Industry 4.0 systems. Modern enterprise networks have more complex connections and corporate networks combine multiple systems and service environments with seemingly disjoint configuration issues [8]. Microservices and service-oriented architectures structure corporate systems as collections of loosely coupled services over virtual systems and Docker images. Other implementations structure enterprise networks over multi-cloud environments, with dedicated tunneling between each cloud.

In this context, security officers struggle to prioritize and remediate security findings provided by the assessments and solutions mentioned above. Existing automatic threat model tools and attack graph generators have come a long way toward automatically modeling and mapping complex enterprise networks, tackling scalability and detecting causality relations between adversary actions and system states, but suffer when interpreting produced models and prioritizing targets and mitigation solutions.

1.1 Contribution

This paper extends previous work on automatic attack graph generation and modeling to provide a new mathematical approach in analyzing and understanding attack graphs. Our methodology does not focus on improving upon attack graph generation research, but it picks up from where previous studies have ended to analyze already-generated attack graphs. We provide a novel solution that automatically proposes network security solutions for risk mitigation and prioritizes security control implementation on large-scale and complex networks in Industry 4.0. Tested systems include modern Docker and cloud environments. The proof-of-concept tool detects the highest risk attack paths and offers a metric analysis of existing vulnerability effects on the overall enterprise network. To our knowledge, no similar research can automatically analyze complex attack graphs and offer prioritization solutions for risk mitigation.

The idea is to use Edmonds’ algorithm, graph centrality metrics and clustering on attack graphs weighted with risk assessment calculations to provide automated prioritization of systems and detected vulnerabilities. Logical attack graphs model potential attacks over interconnected system configuration states and their derived events. We produce a spanning arborescence and groups of tightly interdependent malicious events and states. This way, we can extract data to:

  1. Prioritize detected attack paths based on their overall risk to the enterprise network,

  2. Pinpoint present system configurations that introduce the highest risk on the overall system, and

  3. Use clustering techniques to detect system states with attack patterns that often contain functionally related vulnerabilities.

We use this output to propose which system states, vulnerabilities and configurations have the biggest overall risk to the ecosystem while considering every potential sub-attack path and subliminal path on an attack graph. We develop a standalone software and test its efficiency on two real-world use cases: one multi-cloud enterprise network and a NetFlixOSS microservices Docker architecture.

1.2 Structure

Section 2 briefly presents research publications that are relevant to our work and compares our contributions to existing literature. Section 3 introduces the main building blocks used in our methodology. Section 4 presents the algorithmic steps of our methodology, along with input and output for each step. In Sect. 5 we discuss our experiments and present our findings to validate the methodology. Section 6 concludes our work and discusses the limitations of our current contribution and potential future challenges.

2 Related work

Various forms of attack graphs have been proposed for analyzing and evaluating the security of industry and corporate networks [7, 9, 10, 11, 12, 13]. Attack graphs can be categorized into two major types. In Phillips and Swiler [12] and Sheyner et al. [13], each node represents an entire system state and edges represent state transitions caused by a propagating attack. These types of attack graphs are often identified as state enumeration attack graphs [14]. In Ammann et al. [9], Ingols et al. [10], Noel et al. [11], and Noel and Jajodia [14], each node is a system condition expressed as some form of logical sentence, rather than an entire system state, and edges are the causality relations between system conditions. These types of attack graphs are recognized as dependency attack graphs [15].

A broad range of publications exists that deals with attack paths and attack graph generation [9, 11, 12, 16, 17]. Advances on the sector enabled processing of attack graphs for complex networks of thousands of machines [7, 10]. While some focus on graph scalability issues [9, 11, 18], others focus on automatically generating detailed attack scenarios to depict potential adversary routes inside enterprise networks [19].

Sheyner et al. [13] utilize a finite-state machine model checker to calculate multi-stage, multi-host attack paths on a network in the form of a scenario graph. Still, their approach scales poorly against graph generation time and graph size [20]. Following a similar approach, authors in Phillips and Swiler [12] applied model checking with a custom search engine to conduct the analysis of attack graphs, facing the same scalability problems.

Ammann et al. [9] address the scalability problem of model checking-based attack graph methodologies by utilizing the monotonicity characteristic, where an attacker never needs to relinquish privileges already gained, since the ability to attack does not diminish. They implemented their algorithm in the Topological Vulnerability Analysis tool and provided a tangible understanding of how individual and combined vulnerabilities impact overall network security [16].

Musa et al. [21] use organizational vulnerability assessments to model and produce attack graphs that quantitatively assess and analyze attacks on computing networks. Ivanov et al. [22] present an automated, attack graph-based system for assessing security risks in smart infrastructure and choosing protective measures; their method includes calculation of security indicators, risk assessment and selection of protective measures. Al Ghazo et al. [23] propose a model-checking-based automated attack graph generator and visualizer to analyze how an adversary may exploit interdependencies among existing vulnerabilities to stitch together an attack that can compromise a system. Ibrahim et al. [24] present the Attack Scenarios Generation and Filtration Tool (ASGFT), which automatically generates all possible attack scenarios from the description of an industrial control system.

In Ou [25], the authors present the MulVal framework, which utilizes a different attack tree mapping based on logical statements. They named their models “logical attack graphs” and utilize Datalog (a subset of Prolog) to express system configuration information as Datalog tuples and attack techniques and OS security semantics as Datalog rules. Their model uses two kinds of nodes: derivation nodes and fact nodes. Fact nodes are further divided into primitives and derived facts, while edges in their tree represent dependencies between these logical constructs. However, their approach fell short in terms of scalability even in medium-sized networks [7]. Building on the work of [25], the authors in Ou [7] proposed a tool that can generate complete attack graphs for networks with thousands of machines by utilizing the monotonicity characteristic to validate the polynomial time of similar attack trees that use system configuration information. More recent research on “logical attack graphs” presents several improvements on the MulVal framework, addressing several of its assumptions and limitations [26].

Recent advances have enabled computing attack graphs for large and complex networks [9, 10, 26]. However, even when attack graphs can be efficiently computed, the resulting graphs are still too large and complex for a human to fully comprehend [27, 28]. Even for relatively small networks, produced attack graphs remain difficult for network administrators and security officers to understand and analyze.

While network administrators and security officers utilizing attack graphs will quickly understand that attackers can penetrate the network, it is extremely challenging to identify which privileges, vulnerabilities, and assets are the most critical to the attacker’s success. Network administrators and security officers require a tool that can automatically process the immense amount of information into a simple list of priorities and propose specific risk mitigation actions that help them secure the network at hand quickly, making efficient use of often limited human and financial resources [29].

Our proposed method utilizes multiple algorithms to achieve the analysis of such attack graphs (both logical and automatically generated): We implement (i) a previous methodology on automatic attack graph generation and modeling [7, 19, 25, 26], (ii) the risk dependency analysis for attack paths from [30, 31], and (iii) the clustering concept from [32], utilizing centrality metrics [33, 34].

While the solutions proposed in Ammann et al. [9], Ibrahim et al. [24], Ingols et al. [10], Ivanov et al. [22], Lippmann and Ingols [20], Musa et al. [21], and Ramadhan et al. [26] provide automatic modelling, mapping, and analysis of complex networks through attack path generation, they lack the ability to automatically suggest mitigation solutions and prioritization. Our solution also automatically analyzes attack graphs but goes further: it provides solutions for risk mitigation and prioritization, detects the highest risk attack paths, and offers a metric analysis of existing vulnerability effects on the overall enterprise network, addressing issues and limitations that network administrators and security officers face [29]. In addition, our implementation is capable of handling and analyzing large and complex attack graphs, addressing issues of scalability [9, 10, 26].

3 Building blocks

3.1 CVE metrics for risk estimation

Common Vulnerabilities and Exposures (CVE) is a database of entries that contains information on publicly known cybersecurity vulnerabilities on vendor systems and services. According to CVE, “CVE Entries are used in numerous cybersecurity products and services from around the world, including the U.S. National Vulnerability Database (NVD)” (Common Vulnerability and Exposures [35], 2020). The NVD is “the U.S. government repository of standards-based vulnerability management data” (National Vulnerability Database [36], 2020).

NVD entries on publicly recorded CVEs utilize the CVSS 2.0 Severity and Metrics scoring system. The Common Vulnerability Scoring System (CVSS) provides a quantitative algorithm that captures key characteristics of a vulnerability and produces numerical scores that reflect each vulnerability’s Severity (i.e. impact on a system) and Exploitability (ease of use from malicious users). These metrics are in line with international and industry standards on measuring the risk of a cybersecurity attack and underlying vulnerability [4, 37]. The common reference of risk as a cybersecurity assessment metric is the following Eq. 1:

The risk of a cybersecurity attack is calculated as follows:

$$ {\text{Risk}} = {\text{Likelihood}} * {\text{Severity}} = \left( {{\text{Threat}} * {\text{Vulnerability}}} \right) * {\text{Severity}} $$
(1)

In our case, we model attack paths based on CVE vulnerabilities that can be exploited in each step until the attacker reaches his goal. By applying risk measurements on CVE paths, we can calculate the risk of each attack path step using the aforementioned definition, where the feasibility of a threat and vulnerability is reflected by the CVE’s Exploitability Subscore and Severity of an attack is reflected by CVE’s Impact Subscore. This way, each attack graph connection that depicts an attacker’s use of an exploit can have a quantifiable Risk metric.
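As a minimal sketch of how Eq. 1 could be applied to a single attack step, the following Python fragment derives an edge risk from the two CVSS v2 subscores. The names `CvssV2Subscores` and `edge_risk` are hypothetical, and rescaling the Exploitability Subscore to a likelihood in [0, 1] is our assumption for illustration; the text does not state which normalization the tool uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CvssV2Subscores:
    """CVSS v2 subscores of one CVE entry; both lie in the range [0, 10]."""
    exploitability: float
    impact: float


def edge_risk(scores: CvssV2Subscores) -> float:
    """Risk of one attack step as likelihood * severity (Eq. 1).

    Assumption: likelihood is the Exploitability Subscore divided by 10,
    so the resulting risk stays on a 0-10 scale.
    """
    for value in (scores.exploitability, scores.impact):
        if not 0.0 <= value <= 10.0:
            raise ValueError("CVSS v2 subscores must lie in [0, 10]")
    likelihood = scores.exploitability / 10.0
    return likelihood * scores.impact


# Hypothetical subscores, not taken from a specific CVE entry.
example_risk = edge_risk(CvssV2Subscores(exploitability=10.0, impact=6.4))
```

Keeping the likelihood in [0, 1] also lets the same value be reused when likelihoods are multiplied along a path.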

3.2 Attack graphs

Attack path mapping (APM) is a methodology that identifies the highest risk assets in a corporate network and prioritizes controls, mitigations, and remediations by “mapping and validating all routes an attacker could use to reach a target” [8]. Attack paths depict information flow on a company’s interdependent assets. Together they create an attack graph. Attack graphs map potential attacks in a system based on its configuration and detected vulnerabilities. They are a concise representation of all possible attack paths through a system that end in a state where an intruder/attacker has successfully achieved his goal [38]. They depict ways that an adversary can exploit system vulnerabilities to achieve a desired state.

NIST proposes the use of attack graphs for forensic analysis and to aid investigators and security officers in identifying attack scenarios and pinpointing necessary countermeasures for mitigation (pre-attack) and evidence acquisition (post-attack) [19]. Attack graphs identify vulnerabilities in a network and how potential attackers can exploit them.

Following the conceptual modeling of [18], our attack risk graphs utilize derivation nodes and graph edges. In the presented methodology, each node represents a potential system state; i.e., a malicious action that produces a specific system state proven able to happen, since previous logical dependencies leading up to that derived fact are true. Nodes are the result of applying interaction rules iteratively on facts (represented by edge attributes).

A directed edge illustrates the dependency of a system state (node) \( V_{j} \) on another \( V_{i} \), i.e. \( V_{i} \to V_{j} \). Edge dependencies effectively construct attack traces with information to construct a logical dependency path [7]. Each edge depicts a different derivation, so the number of edges is equal to the number of all possible state derivations from observed system configurations [7, 18, 19]. Edges represent logical dependencies between potential system states and contain logical requirements as attributes. These attributes reflect the preconditions for an attacker to realize a step or achieve a system state. Attributes can either be configuration primitives (an implemented system configuration state) or derived facts detected during the analysis of primitives (e.g., vulnerability CVE-2019 exists on a web server). Primitives are generally configuration information of systems, as reported by the host and network scanners (e.g., “access control list granted” that indicates that a firewall permits access to a server).
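The node and edge model described above can be sketched as a small data structure. This is an illustrative layout only: the class names `Derivation` and `AttackGraph`, the field names and the example states are hypothetical and are not taken from the tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Derivation:
    """Directed edge V_i -> V_j carrying its preconditions as attributes."""
    source: str                          # system state V_i
    target: str                          # system state V_j
    risk: float                          # edge weight (dependency risk)
    primitives: Tuple[str, ...] = ()     # configuration facts from scanners
    derived_facts: Tuple[str, ...] = ()  # facts inferred from primitives


@dataclass
class AttackGraph:
    """Attack multigraph: parallel edges between two states are allowed."""
    outgoing: Dict[str, List[Derivation]] = field(default_factory=dict)

    def add(self, edge: Derivation) -> None:
        self.outgoing.setdefault(edge.source, []).append(edge)
        self.outgoing.setdefault(edge.target, [])  # register the target state

    def derivations(self, source: str, target: str) -> List[Derivation]:
        return [e for e in self.outgoing.get(source, []) if e.target == target]


# Hypothetical states and facts, for illustration only.
graph = AttackGraph()
graph.add(Derivation("netAccess(webServer)", "execCode(webServer)", 6.4,
                     primitives=("access control list granted",),
                     derived_facts=("vulnerability exists on web server",)))
graph.add(Derivation("netAccess(webServer)", "execCode(webServer)", 2.9))
```

Storing every derivation separately keeps the multigraph intact, so that a later reduction step can decide which of the parallel edges to keep.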

3.2.1 Attack graph reduction

A graph reduction is used on the modeled attack multigraph to produce a simple graph. A weighted directed multigraph is a graph that may contain multiple edges with the same start and end nodes. We reduce an attack multigraph by replacing all the edges \( E_{i,j} \), \( i,j \in V \), between each pair of nodes with the respective optimum one, thus producing a simple weighted graph \( G^{\prime} \) with \( V^{\prime} = V \) nodes and \( E' \subseteq E \) edges. In a simple weighted graph, each edge connects two distinct nodes, and no two edges connect the same pair of nodes. Also, the sum of the weights of all the edges in the reduced simple weighted graph should be optimum. Depending on the problem at hand, maximum or minimum weight defines the aforementioned optimum edge or graph. Based on risk assessment consensus and the fact that edge (attack) weights depict risk, we consider optimum the edge with the maximum weight (worst-case scenario).

In case multiple maximum weight edges between two nodes exist, the reduction process can produce multiple alternate simple graphs with the same overall weight. Each produced graph represents possible attacks with the highest (maximum) risk, so we choose one of them for further analysis. We utilize graph reduction to detect and remove low-risk attacks (minimum weight edges) between connected assets, thus producing disjunctive attack tree(s) with the higher overall risk.

For our implementation of the proposed reduction process, as presented in Algorithm 1, we utilize a simple iterative algorithm, which iterates over all graph edges and uniquely stores the edge with the maximum weight for each pair of connected nodes. This way, we remove low-risk attacks (minimum weight edges) between connected assets. The running time of our implementation of the suggested graph reduction algorithm is \( O\left( {\left| E \right|^{2} } \right) \), where \( \left| E \right| \) is the number of graph edges. After finding a reduced graph with the maximum weight, we can find all the alternative ones by simply swapping edges with edges from the original graph that have the same source and target node pair and the same weight.
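The reduction step can be illustrated with a short Python sketch. It is not the listing of Algorithm 1: the function name `reduce_multigraph` and the edge-triple layout are assumptions, and this version uses a dictionary keyed by node pair, so it makes a single pass over the edges rather than following the implementation whose running time is reported above.

```python
def reduce_multigraph(edges):
    """Keep only the maximum-weight edge for every ordered node pair.

    `edges` is an iterable of (source, target, weight) triples that may
    contain parallel edges. When several parallel edges share the maximum
    weight, the first one encountered is kept.
    """
    best = {}
    for source, target, weight in edges:
        key = (source, target)
        if key not in best or weight > best[key]:
            best[key] = weight
    return [(s, t, w) for (s, t), w in best.items()]


# Hypothetical multigraph with three parallel edges from "a" to "b".
multigraph = [("a", "b", 2.0), ("a", "b", 7.5), ("b", "c", 1.0), ("a", "b", 3.0)]
simple_graph = reduce_multigraph(multigraph)
```

The alternative reduced graphs mentioned above differ only in which of the equally weighted parallel edges is kept, so the total weight of the result is the same for all of them.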

3.3 Attack paths and risk chains

A multi-risk dependency analysis algorithm [39, 40] is used on the graph model. In our implementation, an edge denotes a derivation, i.e.,\( V_{i} \to V_{j} \); thus it inherits a risk relation that is derived from a dependence of state \( V_{j} \) on an accessible/available vulnerability provided by state \( V_{i} \). Based on risk assessment standards [4, 37], the methodology quantifies the risk of each graph edge using the impact \( I_{i,j} \), and the likelihood \( L_{i,j} \) of a vulnerability being exploited. The product of these two values is defined as the dependency risk \( {\text{R}}_{i,j} \) of system state \( V_{j} \) due to its dependence on state \( V_{i} \). The numerical value of each edge is the level of the cascade risk between the receiver and the sender node. This risk is depicted using a risk scale [1–10] where 10 is the most severe risk.

The algorithm assesses the nth-order cascading risks or attack paths using a recursive algorithm based on [31, 39]. If \( S_{1} \to S_{2} \to \cdots \to S_{n} \) is an nth-order dependency between n system states \( S \), with weights \( R_{i,i + 1} = L_{i,i + 1} I_{i,i + 1} \) corresponding to each first-order dependency of the attack path, then the cascading risk \( R_{1, \ldots ,n} \) exhibited by \( S_{n} \) for this state dependency path is computed as shown in Eq. 2.

The presented method calculates the Cascading risk of a system state dependency path as follows:

$$ R_{1, \ldots ,n} = L_{1, \ldots ,n} I_{n - 1,n} = \left( {\mathop \prod \limits_{i = 1}^{n - 1} L_{i,i + 1} } \right)I_{n - 1,n} $$
(2)

The cumulative dependency risk (Eq. 3) is the overall risk exhibited by all the system states in the sub-chains of the nth-order dependency. If \( S_{1} \to S_{2} \to \cdots \to S_{n} \) is a chain of system states dependencies of length n, then the cumulative dependency risk, denoted as \( CR_{1, \ldots ,n} \), is defined as the overall risk produced by an nth-order dependency:

The presented method calculates the Cumulative dependency risk of an nth-order dependency as follows:

$$ CR_{1, \ldots ,n} = \mathop \sum \limits_{i = 1}^{n} R_{1, \ldots ,i} = \mathop \sum \limits_{i = 1}^{n} \left( {\mathop \prod \limits_{j = 1}^{i - 1} L_{j,j + 1} } \right)I_{i - 1,i} $$
(3)

Equation 4 computes the overall dependency risk as the sum of the dependency risks of the affected nodes in the chain due to a system state realized in the source node of the dependency chain. Using the total number \( n \) of all system state sub-chains (possible attack paths) and their cumulative dependency risks, the methodology can calculate the graph’s overall risk \( G_{r} \) as the sum of the cumulative dependency risk for each nth-order dependency in the graph:

The presented method calculates the Overall attack graph risk as follows:

$$ G_{\text{r}} = \mathop \sum \limits_{i = 1}^{n} CR_{1, \ldots ,n} $$
(4)
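To make Eqs. 2 to 4 concrete, the following sketch evaluates them for a single chain given as two parallel lists of per-step likelihoods and impacts. The function names are hypothetical, likelihoods are assumed to lie in [0, 1], and the \( i = 1 \) term of Eq. 3, which involves no edge, is treated as contributing nothing.

```python
from math import prod


def cascading_risk(likelihoods, impacts):
    """Eq. 2: product of all step likelihoods times the impact of the last step."""
    if not likelihoods or len(likelihoods) != len(impacts):
        raise ValueError("need one likelihood and one impact per step")
    return prod(likelihoods) * impacts[-1]


def cumulative_risk(likelihoods, impacts):
    """Eq. 3: sum of the cascading risks of every prefix of the chain."""
    return sum(cascading_risk(likelihoods[:k], impacts[:k])
               for k in range(1, len(likelihoods) + 1))


def overall_graph_risk(chains):
    """Eq. 4: sum of the cumulative risks of all chains in the graph."""
    return sum(cumulative_risk(l, i) for l, i in chains)


# Hypothetical two-step chain S1 -> S2 -> S3.
step_likelihoods = [0.5, 0.5]
step_impacts = [4.0, 8.0]
chain_risk = cumulative_risk(step_likelihoods, step_impacts)
```

For this chain the first step contributes 0.5 * 4.0 = 2.0 and the full path contributes 0.5 * 0.5 * 8.0 = 2.0, so the cumulative dependency risk is 4.0.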

3.3.1 Risk chains calculation

An attack graph analysis needs to consider all potential chains. To calculate the risk chains of an attack graph, we must first find all its simple non-cyclic paths. Finding all such paths in any graph is a costly process, considering that in a fully connected graph of order \( \left| V \right| \), where every node connects to every other node, there are \( \left( {\left| V \right|!} \right) \) possible paths.

As presented in Algorithm 2, our implementation utilizes a modified DFS algorithm. The algorithm starts with any input graph node, and at each recursive call, it attempts to extend a path (saving visited nodes to an array) by visiting nodes (traversing the graph) until reaching a dead end. If the visited node does not have any outgoing edges (dead end), the algorithm calculates the cumulative dependency risk and outputs the result for the calculated path (the array containing the visited nodes). It then removes the last stored node from the array and continues from the \( \left( {n - 1} \right){\text{th}} \) node. This repeats until all dead ends have been reached or the search returns to the first node. Note that in our study we were only interested in paths (chains) up to a predefined length of 6. The running time complexity of our implementation is \( O\left( {\left| V \right|^{3} *\log \left( 6 \right)} \right) \) for chains of length 6 nodes, where \( \left| V \right| \) is the number of nodes of the input graph.
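The traversal described above can be sketched as a bounded depth-first search. This is an illustration rather than the listing of Algorithm 2: the function name `enumerate_chains` and the adjacency-dictionary input are assumptions, and a node whose successors are all already on the current path is treated as a dead end so that paths stay simple.

```python
def enumerate_chains(adjacency, start, max_len=6):
    """Return all simple paths from `start` that stop at a dead end
    or after `max_len` nodes, as tuples of node names."""
    chains = []
    path = [start]  # array of visited nodes on the current path

    def visit(node):
        successors = [n for n in adjacency.get(node, []) if n not in path]
        if not successors or len(path) == max_len:
            if len(path) > 1:           # a single node is not a chain
                chains.append(tuple(path))
            return
        for nxt in successors:
            path.append(nxt)            # extend the path
            visit(nxt)
            path.pop()                  # backtrack to the previous node

    visit(start)
    return chains


# Hypothetical attack graph with two routes from "a" to "d".
example_adjacency = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
example_chains = enumerate_chains(example_adjacency, "a")
```

Each returned chain can then be scored with the cumulative dependency risk of Eq. 3; the length bound keeps the enumeration tractable on dense graphs.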

3.4 Graph arborescences

In graph theory, Edmonds’ algorithm is an algorithm for finding a spanning arborescence of minimum weight (sometimes called an optimum branching). An arborescence is the directed analog of the minimum spanning tree. The algorithm was proposed independently first by Yoeng-Jin Chu and Tseng-Hong Liu [41] and then by Jack Edmonds [42]. It takes a graph and a selected root node as input and creates a tree with directed edges in which there is exactly one directed path from the root to any other node. Only the root node has no edge directed toward it. An arborescence can either be a Minimum Weight Arborescence, i.e., a directed spanning tree with the minimum total weight, or a Maximum Weight Arborescence, which connects the root node with all other nodes opting for the maximum possible total weight [43].

In our methodology, we produce Maximum and Minimum Weight Arborescence on automatically generated attack graphs and define the attacker’s end goal (attack result) as the root node. This way, we produce the maximum (or minimum) weighted spanning trees from the attacker’s end goal to potential attack surfaces. This modeling approach is particularly useful for understanding complex configurations and system states, and extracting the highest (or lowest) risk attack scenarios and corresponding system states.

Minimum weight arborescence algorithms are often used to find approximate solutions for complex problems such as the bottleneck Traveling Salesman [44], and network flow/reliability optimizations [45, 46].

A spanning arborescence is a directed graph (digraph) in which, for a node \( V_{i}^{'} \) called the root and any other node \( V_{i} \), there is exactly one directed path from \( V_{i}^{'} \) to \( V_{i} \). An arborescence \( T \) of a weighted directed graph \( G \) is thus the directed-graph \( G' \) form of a rooted tree such that (i) \( T \) contains every node \( V \) of graph \( G \), and (ii) \( T \) does not contain any cycle. A cycle is a graph path in which the first node corresponds to the last. A spanning arborescence of minimum weight can be perceived as the directed equivalent of the minimum spanning tree (MST) problem [47].

The problem of finding an optimum arborescence is trickier than its undirected version. For any cut \( C = \left( {Q,W} \right) \) of a graph \( G = \left( {V,E} \right) \), with \( Q \subseteq V \), \( W \subseteq V \) and crossing edges \( \left\{ {\left( {x,y} \right) \in E \mid x \in Q, y \in W} \right\} \), a least-cost edge \( \left( {x',y'} \right) \), \( x',y' \in V \), crossing that cut may not belong to every optimum arborescence of \( G \); hence the cut property does not apply. A minimum weight arborescence of a weighted directed graph can be found by algorithms such as those described in Bock [48], Chu and Liu [41], Edmonds [42].

The running time of Edmonds’ branching algorithm is \( O\left( {\left| V \right| \left| E \right|} \right) \), where \( \left| E \right| \) is the number of edges and \( \left| V \right| \) the number of nodes [47, 49]. In our implementation, the branching algorithm is based on the work of [50, 51] and utilizes a Fibonacci heap [52], resulting in a running time of \( O\left( {\left| E \right| + \left| V \right|\log \left| V \right|} \right) \) [47, 49]; it is constructed by design to find both maximum and minimum weight arborescences of an input graph.

We apply our implementation of Edmonds’ algorithm on the reduced weighted simple graph produced by the reduction process, producing a tree with a single root node. The resulting graph \( G' = \left( {V,E'} \right) \), where \( E^{'} \subseteq E \), is directed and acyclic, since it is produced by Edmonds’ algorithm.
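For illustration, the sketch below computes the total weight of a maximum weight spanning arborescence with the classic cycle-contraction form of the Chu-Liu/Edmonds algorithm, which runs in \( O\left( {\left| V \right| \left| E \right|} \right) \). It is not the Fibonacci-heap variant used in our implementation. The function name, the integer node numbering and the choice to return only the weight (recovering the selected edges would require expanding the contracted cycles) are simplifying assumptions.

```python
import math


def max_arborescence_weight(num_nodes, edges, root):
    """Total weight of a maximum spanning arborescence rooted at `root`.

    Nodes are integers 0..num_nodes-1 and `edges` holds (source, target,
    weight) triples. Returns None if some node is unreachable from the root.
    """
    # Maximizing weight equals minimizing the negated weight.
    edges = [(u, v, -w) for u, v, w in edges]
    n, total = num_nodes, 0.0
    while True:
        # Step 1: pick the cheapest incoming edge of every node.
        best_in = [math.inf] * n
        parent = [-1] * n
        for u, v, w in edges:
            if u != v and w < best_in[v]:
                best_in[v], parent[v] = w, u
        best_in[root] = 0.0
        if any(math.isinf(w) for w in best_in):
            return None
        # Step 2: add the chosen edges and look for cycles among them.
        comp = [-1] * n   # id of the contracted node, -1 = not assigned yet
        seen = [-1] * n   # start node of the walk that last visited a node
        count = 0
        for v in range(n):
            total += best_in[v]
            u = v
            while u != root and seen[u] != v and comp[u] == -1:
                seen[u] = v
                u = parent[u]
            if u != root and comp[u] == -1:  # the walk closed a new cycle at u
                x = parent[u]
                while x != u:
                    comp[x] = count
                    x = parent[x]
                comp[u] = count
                count += 1
        if count == 0:
            return -total
        # Step 3: contract each cycle and re-weight the remaining edges.
        for v in range(n):
            if comp[v] == -1:
                comp[v] = count
                count += 1
        edges = [(comp[u], comp[v], w - best_in[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        root, n = comp[root], count


# Hypothetical graph: the heaviest incoming edges of nodes 1 and 2 form a cycle.
example_edges = [(0, 1, 5), (0, 2, 1), (1, 2, 4), (2, 1, 6), (2, 3, 2), (1, 3, 3)]
example_weight = max_arborescence_weight(4, example_edges, root=0)
```

In this example, greedily taking the heaviest incoming edge of every node selects 2→1 and 1→2, which form a cycle unreachable from the root. After contraction the algorithm settles on 0→1, 1→2 and 1→3, with total weight 12, which shows why a purely greedy, cut-based choice is not sufficient for directed graphs.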

3.5 Closeness centrality clustering

We use Clustering to build groups of system states reached by attack paths. Grouped states have related attack patterns and often contain functionally related vulnerabilities, such as remote code executions (RCEs) for a specific system, or vulnerabilities that are commonly exploited by previous steps. Such group clusters can be a powerful tool for vulnerability prioritization and understanding of influence on an overall network.

Centrality metrics are used in network models and quantify the influence of nodes and their relative importance within a graph [53,54,55]. Nodes with high centrality values have increased influence on other graph nodes and are, thus, good candidates for implementing risk mitigation controls [33, 34]. A network asset group or cluster can be defined as a subset of nodes [56].

We use centrality metrics on graphs and their arborescences to quantify the importance of each system state (i.e. a node) within the context of a cyber-attack scenario (i.e. an attack path). Affected system states (nodes) with high centrality values pinpoint vulnerabilities with the highest impact in an enterprise network. Different centrality metrics capture different aspects of network topology. In this methodology, we tested and adopted the Closeness centrality metric for attack state analysis and clustering. Closeness centrality is based on the average shortest path distance between node x and every other node in the graph [57]. Closeness centrality captures the average distance between every pair of nodes in a graph and assumes that nodes only affect nodes to which they are directly connected through graph edges.

By experimenting with all potential centrality metrics, we found that Closeness provides the most realistic results when quantifying vulnerability influences. Degree centrality is not relevant, since it is useful in multipath graphs where some nodes have numerous incoming or outgoing connections (more than four), and attack graphs rarely include such connections. Betweenness centrality offers results similar to Closeness. In contrast, Eigenvector centrality quantifies the importance of neighboring nodes, which is irrelevant to attack graphs: during attacks, adjacent system states have no direct relation to current states besides their direct dependency, and these direct dependencies are already modeled through edge connections.
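A minimal sketch of a Closeness computation is given below. It assumes the common definition of closeness as the number of reachable nodes divided by the sum of shortest-path distances to them (the reciprocal of the average distance), counts distances in hops along outgoing edges, and assigns 0 to nodes that reach nothing. Whether the tool uses weighted or unweighted distances is not stated here, so these choices and the function name are assumptions.

```python
from collections import deque


def closeness_centrality(adjacency):
    """Closeness of every node from breadth-first hop distances."""
    nodes = set(adjacency) | {n for nbrs in adjacency.values() for n in nbrs}
    result = {}
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # plain BFS from `source`
            current = queue.popleft()
            for nxt in adjacency.get(current, ()):
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        total = sum(dist.values())
        # reachable nodes (excluding the source) over their total distance
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


# Hypothetical three-node path a - b - c, stored with edges in both directions.
example_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
example_closeness = closeness_centrality(example_graph)
```

The middle node of the path has the highest closeness, which matches the intuition that states lying close to many other states are good candidates for mitigation.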

3.5.1 Cluster formation

The Closeness centrality metric with midpoint on extreme values works best when identifying high influence nodes and clusters with a satisfying number of attack states and derivations [32]. Nodes with higher Closeness values are better candidates for system state (i.e., node) cluster generators. As presented in Algorithm 3, we use the midpoint of the calculated closeness centrality values from all nodes as a decision boundary. Nodes with influence greater than or equal to the midpoint are considered high-risk and are candidates for cluster generators, whereas nodes with less influence are marked as low-risk. Cluster generators are then used over partitioning methods as key points to divide the population of system states into groups with similarities. For each pair of nodes in the set of high-risk nodes, we identify a single acyclic path that connects them, and we remove all its edges, thus splitting the graph and creating clusters. Possible orphan assets (single asset clusters) are assigned to the nearest cluster, based on the initial graph topology. The running time of our implementation is \( O\left( {\left| V \right| + \left| E \right|} \right) \), where \( \left| E \right| \) is the number of edges and \( \left| V \right| \) the number of nodes of the input graph.
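The decision boundary described above can be sketched as follows. The sketch covers only the classification into high-risk and low-risk nodes, not the path removal or orphan assignment of Algorithm 3. It interprets the "midpoint on extreme values" as the mean of the smallest and largest closeness value, which is our reading of the text; the function name is hypothetical.

```python
def split_by_midpoint(centrality):
    """Classify nodes around the midpoint of the extreme centrality values.

    Returns (midpoint, high_risk_nodes, low_risk_nodes). Nodes whose value
    is greater than or equal to the midpoint are cluster-generator candidates.
    """
    if not centrality:
        raise ValueError("centrality mapping must not be empty")
    values = centrality.values()
    midpoint = (min(values) + max(values)) / 2.0
    high = {node for node, value in centrality.items() if value >= midpoint}
    low = set(centrality) - high
    return midpoint, high, low


# Hypothetical closeness values for three system states.
example_values = {"s1": 0.25, "s2": 0.625, "s3": 1.0}
example_midpoint, example_high, example_low = split_by_midpoint(example_values)
```

Here the midpoint is 0.625, so s2 and s3 become cluster-generator candidates and s1 is marked as low-risk.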

4 Methodology

The presented approach utilizes numerous techniques to achieve its goals. Each step of the presented methodology utilizes a distinct set of algorithms, where each one provides some insight on an IoT network under analysis and outputs information to be used as input by a following step. This process uses four fundamental building blocks:

Step 1: Attack graph modelling. All potential attack paths that exist and can be exploited by adversaries are mapped onto a graph. In our experiments, we use the tools, enterprise networks and research models from previous research to automatically generate attack graphs and run our algorithm on their output [18, 19].

Step 2: Risk chains calculation. We compute the reduced attack graph and then compute all n-order attack paths as chains of nodes. This step outputs the cumulative dependency risk of each attack path and calculates the overall risk of all potential attack scenarios that exist (i.e. the entire network/graph risk). This step also sorts attack paths per risk and prioritizes them based on their influence on the overall enterprise network.

Step 3: Spanning arborescence creation. In this step, we take the reduced attack graph as input and create an arborescence for each of the attacker’s end goals. For example, if attack paths map scenarios toward two different attacker end states (e.g. steal data from database server and access files in admin server), then two different arborescences will be created. This step outputs removed edges and arborescence paths, from attack surfaces to end goals. Full attack paths in the Max arborescence introduce the highest risk on the overall system, while derivations (edges) removed by the Min arborescence must be considered for mitigation or removal from system preferences.

Step 4: Clustering of system states and output. For each arborescence, the algorithm pre-computes the centrality metric values for each attack state (node) and creates clusters and rankings of system states.

The input and output for each step of the algorithm are summarized in Table 1. The overall methodology flow and the dependencies between its different building blocks are depicted in Fig. 1.
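As an illustration of the risk chain step (Step 2), the sketch below enumerates simple chains of bounded length on a toy weighted graph and sorts them by risk. It is only a sketch under stated assumptions: the cumulative risk of a chain is taken here to be the plain sum of its edge risks, which stands in for the dependency risk formula used by the methodology, and the function name `risk_chains`, the node labels, and the edge risks are hypothetical.

```python
def risk_chains(edges, sources, max_nodes=6):
    """Enumerate simple chains of at most max_nodes nodes from each source.

    edges: {(u, v): risk}. Returns [(cumulative_risk, chain)] sorted by
    descending risk. ASSUMPTION: cumulative risk = sum of edge risks.
    """
    adjacency = {}
    for (u, v), risk in edges.items():
        adjacency.setdefault(u, []).append((v, risk))

    chains = []

    def extend(chain, total):
        if len(chain) > 1:
            chains.append((total, tuple(chain)))
        if len(chain) == max_nodes:
            return
        for nxt, risk in adjacency.get(chain[-1], ()):
            if nxt not in chain:  # keep chains simple (no repeated state)
                chain.append(nxt)
                extend(chain, total + risk)
                chain.pop()

    for source in sources:
        extend([source], 0.0)
    # Highest risk first; ties broken by chain for a stable order.
    chains.sort(key=lambda item: (-item[0], item[1]))
    return chains


# Hypothetical reduced attack graph: one worst-case edge per node pair.
toy_edges = {
    ("s1", "s2"): 7.5,
    ("s1", "s3"): 5.0,
    ("s2", "s4"): 10.0,
    ("s3", "s4"): 9.0,
}

ranked = risk_chains(toy_edges, sources=["s1"])
for risk, chain in ranked:
    print(risk, " -> ".join(chain))
```

Bounding `max_nodes` keeps the enumeration tractable; the default of six nodes is chosen only to mirror the chain length discussed in the evaluation.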

Fig. 1 Graphical representation of the overall methodology flow

Table 1 Input/output data for each step of the methodology

5 Evaluation

5.1 Tool implementation

The framework was developed as a client–server web application. The front end is implemented using technologies such as HTML and JavaScript, while the back-end server and algorithm components are developed in Java Spring using the MySQL database.

5.2 Use case 1: multi-cloud enterprise network

To validate our methodology, we use a multi-cloud network topology consisting of two cloud infrastructures connected to the Internet through an external firewall.

5.2.1 Use case architecture

The first cloud server hosts three virtual machines, Mail server, Web server, and DNS server connected to a virtual switch. The second cloud server consists of two networks: public and private. The public network hosts two VMs; the first one hosts an SQL server; the second one hosts a NAT gateway server. The private network hosts one Admin server and three VMs (called VMs Group). Also, outside users can access the Web Server, and employees can access the SQL server through workstations inside the same LAN. Figure 2 depicts the enterprise network diagram for the use case 1 ecosystem.

Fig. 2 Tool graphical representation of examined Netflix attack graph

Each server reflects an actual vendor-specific system and is vulnerable to a set of real-world CVE vulnerabilities. The impact of exploiting each vulnerability has been extracted from CVE (Common Vulnerabilities and Exposures) [35] (2020). The overall attack graph is depicted in Tables 1 and 2. Figure 2 shows the attack graph, generated from the vulnerabilities that exist on the services in the second scenario. The attacker’s goal is to compromise one of the VMs in the VMs Group in the private network, and/or compromise the database in the public network, by obtaining root access. As seen, the attacker may take different routes through this attack graph.

Table 2 System states/nodes and their ID

5.2.2 Tool analysis

Tables 2 and 3 present the system states (nodes with their IDs) that can be reached by attack path scenarios, along with the relevant derivation steps (edges). The presented edges are only worst-case scenario edges, resulting from the reduction of the original multigraph from [18] to the reduced worst-case attack graph. Figure 3 provides a visual representation of the generated reduced graph (step 1).

Table 3 Attack graph derivations and their risk

Fig. 3 A graphical representation of examined attack graph

Table 4 depicts the worst-case attack paths detected by our tool, ranked according to their cumulative risk from the used CVE vulnerabilities.

Table 4 Top 6 derivatives dependency paths output from the risk analysis step (descending)

Figure 3 presents the attack graph as generated from the use case testbed and relevant vulnerabilities, while Fig. 4 depicts the produced arborescence from running the methodology on the reduced worst-case attack graph.

Fig. 4 Graphical representations of the max (left) and min (right) weight arborescence attack graphs

5.2.3 Results and prioritization

Clustering results indicate that there exist system states and functionally related vulnerabilities with common attack steps that can be mitigated all at once by applying controls to their related attack origin states. Specifically, accessing the VMGroups LICQ using user access privileges (CVE-2001-0439) seems to have the highest overall influence in all possible attack scenarios, closely followed by the attack state where adversaries have root access to the VMGroups Active Template Library (CVE-2008-0015) (see Table 5). Prioritizing these two vulnerabilities will have the highest cumulative impact on all possible attack paths.

Table 5 Top 4 nodes/system states closeness centrality values for max and min weight arborescence (descending)

System states most frequently detected in highest risk attack paths involve. By combining the clustering and arborescence results with the weighted risk ranking of attack paths, we see that the highest-risk states for the overall network that can be reached by attackers are (i) VmGroupsLICQ (User access privilege) and (ii) WebServer (User access privilege). Related vulnerabilities (CVE-2001-0439 and CVE-2009-1535) should be heavily prioritized during remediation and risk mitigation.

The OpenSSH1 access vulnerability on the NAT server has the least influence, and only in the lower-risk (min arborescence) attack scenarios. This, coupled with the fact that this vulnerability is detected in both of the fastest attack paths (those with the fewest steps from start to finish), means that it is key to performing the easiest possible attacks, albeit not the most influential ones on the enterprise network (see Table 6).

Table 6 Key system vulnerabilities with highest overall impact and fastest attacks in all attack scenarios

5.3 Use case 2: Netflix OSS microservice system

The second testbed we used is a combination of containers provided by Netflix. The NetflixOSS high-level architecture of the testbed used in the Use case 2 experiments is depicted in Fig. 5.

5.3.1 Use case architecture

The testbed is a public repository that realizes the spring cloud ecosystem. Figure 5 shows the different components of the system and Table 7 lists the main container parts and relevant repositories.

Fig. 5 NetflixOSS high-level architecture

Table 7 Netflix OSS microservices testbed components

We used the tool from [18] to generate the NetflixOSS attack graph that is fed into our system. As presented in Ibrahim et al. [18], the Netflix OSS graph has linear attack dependencies, and each node connects to a small set of nodes. Containers have limited connections and keep outgoing dependencies to a minimum. In this example, each attack path can reach its end goal through multiple intermediate steps. No directed edge (attack step) leads from system states with higher privileges to states with lower privileges. Also, no duplication of nodes exists (attacks that require actions in the same asset/service), which is in line with the monotonicity property [18].

This specific use case utilizes microservices. Based on the presented architecture, each component is built as a set of services, and each service runs its own processes and communicates through APIs. Each microservice may run in a virtual machine (hardware and OS virtualization) or a container (only OS virtualization). Either way, the attacker’s goal is to compromise one of the available servers (virtual machines or containers) by obtaining root access (admin privilege).

Netflix’s original attack graph is a multigraph, and as such, a pair of nodes can be connected by more than one edge, resulting in an overly complex graph. On the original Netflix attack graph, each different edge between a pair of connected nodes corresponds to a mapped CVE with a different weight/risk. On the reduced attack graph, each pair of connected nodes is joined by exactly one edge, the one with the maximum potential weight/risk. Table 8 depicts the potential system states (nodes) that can be reached from all different scenarios and attack paths of the reduced graph we created using the highest-risk CVE between each pair of connected nodes.
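The reduction from multigraph to worst-case graph can be sketched as follows. This is a minimal illustration, not the tool's code: the function name `reduce_multigraph`, the tuple layout, and the placeholder labels such as `CVE-A` are hypothetical and do not correspond to entries in the NetflixOSS graph.

```python
def reduce_multigraph(multi_edges):
    """Keep only the highest-risk parallel edge for every ordered node pair.

    multi_edges: iterable of (source, target, cve_label, risk).
    Returns {(source, target): (cve_label, risk)}.
    """
    worst = {}
    for source, target, cve_label, risk in multi_edges:
        key = (source, target)
        # Replace the stored edge only when a strictly riskier one appears.
        if key not in worst or risk > worst[key][1]:
            worst[key] = (cve_label, risk)
    return worst


# Hypothetical multigraph: two parallel edges between s1 and s2.
toy_multigraph = [
    ("s1", "s2", "CVE-A", 7.5),
    ("s1", "s2", "CVE-B", 10.0),
    ("s2", "s3", "CVE-C", 5.0),
]

reduced = reduce_multigraph(toy_multigraph)
print(reduced)
```

Because edges are keyed by ordered pairs, edges in opposite directions between the same two states are kept separately, which preserves the directed nature of attack steps.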

Table 8 System states/facts and node IDs’ association

5.3.2 Tool analysis

The attack graph is generated by [18] and modeled by our tool. The overall graph and its edges are not presented in detail due to size restrictions. Figure 6 depicts the graphical representations of the maximum (left) and minimum (right) weight arborescence attack graphs for the NetFlixOSS graph (the entire preliminary input graph is omitted due to size). For information on attack step derivations, preconditions and underlying CVE vulnerabilities present in NetFlixOSS components, please refer to “Appendix 1: NetflixOSS CVE and edge derivations” at the end of this document.

Fig. 6 Graphical representations of the max (left) and min (right) weight arborescence attack graph for NetFlixOSS

Attack paths that exist on the graph have an order of at most 6 (Table 9). The table sorts the highest-risk system state dependency paths according to their total cumulative risk. The top paths all have the highest possible cumulative risk, because all of their CVEs are ranked at maximum risk (10.0).

Table 9 Top 5 derivatives dependency paths output from the risk analysis step

5.3.3 Results and prioritization

Accessing the “Eureka” and “Service a” microservices seems to have the highest overall influence in all possible attack scenarios on the NetFlixOSS, closely followed by the attack state where adversaries have user access to the Zuul microservice environment (Table 10). Prioritizing vulnerabilities that affect these system states will have the highest cumulative impact on all possible attack paths.

Table 10 Top 7 nodes/system states closeness centrality values for max and min weight arborescence (descending)

System states most frequently detected in highest risk attack paths involve turbine (User privilege) (S9) and rabbitmq (User privilege) (S19). Still, by combining the clustering and arborescence results with the weighted risk ranking of attack paths, we see that these two may be included in most high-risk attacks, but are not key attack steps when considering all possible attacks on NetFlixOSS.

Clustering relations show that both max and min arborescence have the same most influential nodes, which means that these system states are indeed the most influential states for all attacks, specifically: (i) Eureka (ADMIN), (ii) Service a (User privilege), (iii) Zuul (User privilege), and (iv) Service a (ADMIN). Their related vulnerabilities (CVE-2005-2541, CVE-2017-7376, CVE-2017-1000116, and CVE-2016-2108) should be heavily prioritized during remediation and risk mitigation.

The above analysis implies that some vulnerabilities are more important when we are trying to secure specific end services (Turbine (User privilege) (CVE-2011-2895), Rabbitmq (User privilege) (CVE-2016-2108)), while others are more important when trying to secure the overall network from as many attacks as possible (see Table 11).

Table 11 Key system vulnerabilities with highest overall impact and highest influence in all attack scenarios

5.4 Efficiency and scalability

Analyzing a graph based on the proposed methodology and algorithms is a complex and computationally intensive task that raises concerns about efficiency and scalability. The overall complexity of a single run is equal to the worst-case algorithm complexity. Among all algorithms used in our approach, the worst performance is attributed to risk chain calculation. Risk chain calculation requires us to detect all \( \left( {\left| V \right|!} \right) \) simple paths in an attack graph of order \( \left| V \right| \). Finding all possible paths is an NP-hard problem, since there is an exponential number of simple paths. The running time complexity of the algorithm is \( O\left( {\left| V \right|^{3} *\log \left( 6 \right)} \right) \) for chains of length 6 nodes, where \( \left| V \right| \) is the number of nodes.
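To give a feel for how quickly the number of simple paths grows, and for the effect of capping chains at six nodes, the following sketch counts simple paths on a complete directed graph, which is the densest case. It is an illustration only and does not measure the tool itself; the helper names `count_simple_paths` and `complete_digraph` are hypothetical.

```python
def count_simple_paths(adjacency, max_nodes=None):
    """Count simple paths with at least one edge.

    If max_nodes is given, only paths visiting at most that many nodes
    are counted.
    """
    def walk(node, visited):
        if max_nodes is not None and len(visited) == max_nodes:
            return 0
        count = 0
        for nxt in adjacency[node]:
            if nxt not in visited:
                visited.add(nxt)
                count += 1 + walk(nxt, visited)  # this path plus extensions
                visited.remove(nxt)
        return count

    return sum(walk(start, {start}) for start in adjacency)


def complete_digraph(n):
    """Worst-case topology: every node has an edge to every other node."""
    return {u: [v for v in range(n) if v != u] for u in range(n)}


for n in (4, 6, 8):
    graph = complete_digraph(n)
    print(n, count_simple_paths(graph), count_simple_paths(graph, max_nodes=6))
```

On eight fully connected nodes the unbounded count is 109592 simple paths, against 28952 when chains are capped at six nodes; the gap widens rapidly as the graph grows, which is why a bounded chain length matters for the risk chain step.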

To cope with such a computationally intensive problem, we selected the Neo4J graph database as the main building block of our tool implementation. Graph databases are storage systems optimized for graph calculations and provide index-free adjacency. They model data more effectively than relational databases, especially when relationships between elements are the driving force for data model design [58, 59]. In a graph database, every node only needs to know the nodes to which it is connected (i.e., its edges). This allows a graph database system to use graph theory to analyze the edges of a graph effectively. Graph databases scale naturally to large data sets and/or to data sets with frequently-changing or on-the-fly schema [59].

We used the Neo4j graph database (Neo4j 2020) to implement the attack graph analysis tool due to its scalability, efficiency and built-in functionality for analyzing graph models with multiple attributes. According to research, Neo4j outperforms other systems and alternative libraries in (a) load time for millions of elements as well as in (b) time required to compute the total paths and detect the shortest path of an examined graph [60, 61, 62, 34, 58, 63]. In addition, Neo4j greatly outperforms relational databases such as MySQL in traversal tests like the one needed for our risk chain calculation. Also, Neo4j was shown to achieve the best performance among popular graph databases in most benchmarks for graphs with over 20 M nodes and an approximate mean node degree equal to 10 [64].

Table 12 shows part of the results of a comparative analysis presented in Kolomičenko et al. [64], which demonstrate low execution times for extremely large graphs and thus the efficiency and scalability of graph theory algorithms in Neo4j. These graph sizes are at least three orders of magnitude larger than what is needed for our experiments (see below).

Table 12 Neo4j execution times for pertinent graph theory algorithms (graph nodes have an approximate mean node degree equal to 10)

Neo4j uses property graph models, where nodes can have numerous labels as attributes and nodes and relationships can hold arbitrary properties (key-value pairs) [62, 58]. Neo4j facilitates the modeling of dependent attack path states in weighted, directed graphs and can work efficiently for up to millions of nodes without significant delays in everyday PC systems.

The tool was developed and tested on an Intel Core-i7 with 16 GB of RAM and an SSD. The number of nodes and edges greatly affects the execution time of any algorithm that analyzes graph models. Even though attack graphs and attack multigraphs differ in how they model system states and steps, we still believe that comparing scalability performance is feasible, since all graphs utilize similar concepts. Table 13 shows the execution times for both use cases, and the corresponding times for each methodology step.

Table 13 Methodology steps execution times

6 Conclusions

This paper presented a method for automatic analysis of generated attacks using a set of different mathematical models and algorithms. Experiments included two real-world microservices environments that realize complex, modern enterprise networks with seemingly disjoint configurations and dependencies.

To our knowledge, current literature on attack graphs mostly focuses on automatic generation and scalability issues, rather than on analyzing the complex information within them and extracting useful information for security control prioritization and proper risk mitigation.

The proposed framework utilizes graph modeling algorithms such as mathematical series analysis, clustering and optimization to extract information about the effect of vulnerabilities and attack steps on enterprise networks. We conceptualize attack scenarios and extract the impact of each step from all possible attacks on the system. To this end, we have extended previous automatic generation methodologies with two features: prioritizing detected vulnerabilities, and analyzing the effect of system states on the overall network in order to propose which system states, vulnerabilities, and configurations pose the biggest overall risk to the ecosystem. Our approach takes into consideration every potential sub-attack path and subliminal path on an attack graph.

Preliminary tests on actual microservices infrastructures and multi-cloud environments show that the presented approach looks efficient and trustworthy for attack graphs of several thousands of interconnections.

Finally, in order to validate the analysis and prioritization results we compared the output results with the corresponding practices and procedures that can actually be applied to the infrastructure and relevant networks, as they emerged from the manual analysis by the industrial cyber security experts. The comparison verified the accuracy of the analysis and the results for the two demonstrated Use Cases.

6.1 Restrictions

The presented approach must meet a number of requirements to be effective. First, information dissemination between asset owners, security officers and consultants is of utmost importance.

A white-box approach is needed to map attack paths, which requires considerable employee time. Also, it takes for granted that a risk assessment or a business impact assessment exists, from which to draw information on asset risk and potential consequences from security incidents. Without prior work, this approach will not have the necessary input to produce any results.

Finally, the attack path analyzer provides a comprehensive set of attack paths and proposes possible minimizations, but does not output an exhaustive set of every possible combination of attacks and potential mitigation schemas. Instead, it focuses only on worst-case scenarios that exploit the worst CVE available between every service or microservice connection and dependency. In other words, detections of high-risk attack paths and high-risk connection removal proposals are always true positives in terms of high risk. Still, these findings are not necessarily the result of exhaustive state analysis.