1 Introduction

Attack trees are one of the most popular graphical models for information security assessment. Proposed originally by Bruce Schneier in 1999 [57], they are intuitive and relatively easy to master, yet they enjoy well-studied formalizations and quantitative analysis techniques [4, 5, 31, 39]. Security risk assessment in industry has long appreciated attack trees as a means to solve cognitive scalability issues related to securing large systems [46], and as a tool to enable communication among different stakeholders and facilitate brainstorming [15, 59].

From the graphical perspective, an attack tree resembles a mind map [11, 59]: a powerful cognitive tool often used in Psychology and Education. It has a single root node, representing the attacker’s main goal, which is subsequently refined into sub-goals captured by child nodes. But this is where the analogy ends, as attack trees have a precise mathematical interpretation based on well-defined refinement operators [39]. The most frequently used refinement operators are the conjunctive (AND) and disjunctive (OR) refinements. The AND-refinement states that all child nodes need to be achieved to achieve the parent node, while the OR-refinement states that if the attacker can achieve any of the child nodes, then the parent node will also be achieved [39]. The leaf nodes, i.e., the nodes that do not have any children, represent atomic or self-evident attack steps.
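The informal reading of the two operators can be illustrated with a minimal Python sketch. The `Node` class, the `achieved` function and the password example are illustrative assumptions introduced here, not part of any cited formalization; the sketch only mirrors the Boolean reading of AND and OR given above.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """A node of a hypothetical attack tree: a leaf, or an AND/OR refinement."""
    label: str
    op: str = "LEAF"  # one of "LEAF", "AND", "OR"
    children: List["Node"] = field(default_factory=list)


def achieved(node: Node, performed: Set[str]) -> bool:
    """Return True if the goal of `node` is achieved by the performed leaf steps."""
    if node.op == "LEAF":
        return node.label in performed
    results = [achieved(child, performed) for child in node.children]
    if node.op == "AND":
        return all(results)  # every child must be achieved
    if node.op == "OR":
        return any(results)  # one achieved child suffices
    raise ValueError(f"unknown operator: {node.op}")


# Hypothetical example: obtain a password by guessing it,
# or by installing a keylogger AND retrieving its log.
tree = Node("obtain password", "OR", [
    Node("guess password"),
    Node("use keylogger", "AND", [
        Node("install keylogger"),
        Node("retrieve log"),
    ]),
])

print(achieved(tree, {"install keylogger"}))
print(achieved(tree, {"install keylogger", "retrieve log"}))
```

In this sketch, installing the keylogger alone does not achieve the root goal, whereas performing both children of the AND node does.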

Attack trees are traditionally regarded as both a formal framework and a tool to communicate security risks, but these two aspects are rarely considered together in the literature. For example, research on attack trees focuses strongly on improving the expressiveness of the formalism through new refinement operators (e.g., sequential AND [3, 26]), proposing new flavors of attack trees (e.g., attack-defense trees [31] or attack-countermeasure trees [55]), and developing various quantitative analysis methods for attack trees [7, 8, 19, 28, 39]. Quantitative analysis techniques give rise to optimization problems, such as cost-effective countermeasure selection [4, 18, 55] and ranking of attacks [19]. In parallel, new attack tree semantics are being developed that translate attack trees into other formalisms, such as timed automata [17, 32], stochastic games [5] and Markov chains [25], which offer advanced computational capabilities.

Largely separately from those efforts, researchers have looked into the integration of attack trees with security risk management methodologies [18, 47, 49] and the application of attack trees to model threat scenarios in a large variety of domains, e.g., SCADA systems [41, 63], RFID systems [7], and ATM systems [15].

More recently, researchers have started to look into the questions that emerge when attack trees are applied in industry. Indeed, practical applications of attack trees pose many challenges, including a noticeable time investment in tree design, a significant cognitive burden on the analyst when dealing with large trees, and a multitude of possible interpretations for even the most basic refinement operators, AND and OR, which may lead to conceptual misalignments within a single tree created and read by a group of people. This is why attack trees are sometimes mentioned in security threat assessment guidelines as an “advanced” and “alternative” model [40, 44, 59]. In this position paper we overview the emerging research directions for practical applications of attack trees and identify the gaps that are still to be investigated by the research community interested in attack trees. We argue that there are many exciting research problems that can contribute to better acceptance of attack trees in practice and to a better synergy between the academic and industry worlds.

We start by reviewing some practical challenges with attack trees and lessons we learned while working with attack trees (Sect. 2). Then we overview the emerging research directions in attack trees that focus on improving the acceptance of the formalism in practice (Sect. 3). Subsequently, we propose additional research themes that are currently not so active, but will be appreciated by practitioners (Sect. 4).

2 Challenges with Attack Tree Design in Practice

Attack trees make it possible to structure quite diverse threat scenarios (e.g., attacks occurring at physical, digital or social levels, or a combination of those) and to reason about these scenarios at different levels of abstraction. Yet, this strength comes at the cost of a substantial investment of time and effort in the design of a comprehensive tree. As with any other security assessment methodology, a thorough attack tree model requires a diverse set of skills from its authors. Typically, an attack tree design exercise requires domain expertise (i.e., stakeholders providing in-depth knowledge of the system) and security expertise (persons providing security knowledge and experience in the methodology). For example, the attack-defense tree for the relatively small ATM case study reported in [15] required a team of two domain experts and two security analysts working jointly for 6 days. Furthermore, one of the stakeholders spent 10 days before the case study meeting preparing the documentation, reviewing ATM crime reports, outlining the scenario, and collecting statistical data for quantitative analysis. This amount of effort can be prohibitive for small organizations.

A delaying factor in designing attack trees is that, although they may seem quite simple and intuitive [57, 58, 59], there is still room for misconceptions and a multitude of interpretations of the meaning of refinement operators or tree semantics. These discrepancies are often neglected in the literature, but they become apparent when a group of people with diverse backgrounds starts working together on an attack tree. For example, in the case of the attack-defense trees applied in [15], the meaning of defense nodes was problematic. For the stakeholders, the defense nodes represented anything that is a security control, irrespective of its type (preventive, detective or reactive). However, the attack-defense tree formalism provides a uniform meaning to defense nodes through semantics and attribute domain specification [31], and it does not allow different countermeasure types to be specified explicitly.

Attribute domains can introduce confusion even for pure attack trees without countermeasure nodes. For example, the traditional AND operator specifies that all child nodes need to be fulfilled to fulfil the parent node. Its interpretation in the minimal attack time attribute domain depends on whether the actions of the child nodes can be performed in parallel or must be performed sequentially. The ADTool software, for example, includes both options, and so practitioners have to choose which one to work with [30]. This means that they need to be aware of this choice, and they need to interpret it correctly and consistently for the whole tree.
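The difference between the two readings of AND can be made concrete with a small sketch. It assumes one common way of defining the minimal attack time domain: OR takes the minimum over its children, a parallel AND takes the maximum, and a sequential AND takes the sum. The tuple-based tree encoding, the function name `min_time`, the node labels and the time values are all hypothetical, and the sketch does not claim to reproduce how ADTool computes this attribute.

```python
from typing import Dict, List, Tuple, Union

# A tree is either a leaf label, or a pair (operator, list of subtrees).
Tree = Union[str, Tuple[str, List["Tree"]]]


def min_time(tree: Tree, leaf_time: Dict[str, int], sequential_and: bool) -> int:
    """Bottom-up minimal attack time under an assumed attribute domain."""
    if isinstance(tree, str):
        return leaf_time[tree]
    op, children = tree
    times = [min_time(child, leaf_time, sequential_and) for child in children]
    if op == "OR":
        return min(times)  # the attacker picks the fastest alternative
    if op == "AND":
        # sequential steps add up; parallel steps take as long as the slowest
        return sum(times) if sequential_and else max(times)
    raise ValueError(f"unknown operator: {op}")


# Hypothetical tree and leaf values (minutes).
tree: Tree = ("OR", [
    "bribe guard",
    ("AND", ["disable alarm", "pick lock"]),
])
leaf_time = {"bribe guard": 60, "disable alarm": 10, "pick lock": 5}

print(min_time(tree, leaf_time, sequential_and=False))  # parallel reading: 10
print(min_time(tree, leaf_time, sequential_and=True))   # sequential reading: 15
```

The same tree with the same leaf values yields different results under the two readings, which is why the choice has to be made explicitly and applied consistently across the whole tree.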

In fact, the attack tree guidelines available in the literature today are quite vague, and they usually follow the top-down approach. The team needs to start with the top node representing the attacker’s main goal, which is then refined into subgoals and more concrete steps until very precise attack actions are found [1, 39, 57, 59, 62]. The guidelines do not specify how best to structure the tree, how to deal with repeating nodes, how best to label the nodes, or how to organize the work on the attack tree so that everybody has the same understanding of what the tree elements mean. This means that, in the best case, these choices are made strategically by the most experienced team members, who, however, do not share them with the wider community, or they are made ad hoc or even post hoc. In the worst case, these aspects are not agreed upon at all, and, therefore, the resulting tree can be inconsistent. Furthermore, this tree will likely be less comprehensible, due to the absence of empirically founded best practices for tree structure and comprehensible tree design.

Absence of errors is another big concern for practitioners when designing attack trees [57, 59]. Errors can go in both directions, and, therefore, the designed tree should ideally be both complete (no attacks are omitted) and sound (it does not contain attacks that do not exist in the actual system). Practitioners can apply some tree validation techniques to ensure this. For example, in the ATM case study, semantics-based validation (checking that attack bundles in the multiset semantics [31] represented meaningful attacks), data-based validation (investigating any discrepancies between the expected attribute value and the value computed in the quantitative analysis), and catalogue-based validation (ensuring that all attacks collected by an industry-specific catalogue are captured) were applied [15]. However, these validation techniques are limited when applied by human analysts, because it is impractical to check by hand all possible attack bundles or data value discrepancies.
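To see why checking attack bundles by hand quickly becomes impractical, consider the following sketch. It assumes one common reading of the multiset semantics for pure attack trees: a leaf denotes a single one-step bundle, OR takes the union of its children's bundles, and AND combines one bundle from each child. The tuple-based tree encoding, the function name `bundles` and the labels are hypothetical and are used for illustration only.

```python
from itertools import product
from typing import List, Set, Tuple, Union

# A tree is either a leaf label, or a pair (operator, list of subtrees).
Tree = Union[str, Tuple[str, List["Tree"]]]
# A bundle is a multiset of leaf labels, encoded as a sorted tuple.
Bundle = Tuple[str, ...]


def bundles(tree: Tree) -> Set[Bundle]:
    """Enumerate attack bundles under the assumed multiset reading."""
    if isinstance(tree, str):
        return {(tree,)}
    op, children = tree
    child_sets = [bundles(child) for child in children]
    if op == "OR":
        return set().union(*child_sets)
    if op == "AND":
        # pick one bundle per child and merge them into one multiset
        return {tuple(sorted(sum(combo, ()))) for combo in product(*child_sets)}
    raise ValueError(f"unknown operator: {op}")


# Small hypothetical tree: two bundles.
small: Tree = ("OR", ["skim card", ("AND", ["install malware", "withdraw cash"])])
print(bundles(small))

# An AND over ten binary OR choices already yields 2**10 bundles.
large: Tree = ("AND", [("OR", [f"a{i}", f"b{i}"]) for i in range(10)])
print(len(bundles(large)))
```

A tree with only twenty leaves can thus describe over a thousand distinct bundles, each of which would have to be inspected in a semantics-based validation.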

Certainly the attack tree construction process is an excellent opportunity for brainstorming about potential security threats and cost-effective countermeasures. But its main value comes from post-analysis and subsequent communication with other stakeholders. We observe that in practice analysis and comprehensibility of attack trees are in conflict. On the one hand, fine-grained analysis benefits from large trees describing all attacks on a concrete system. But on the other hand, large models can strain analysts’ cognitive capabilities, and the practitioners may find it difficult to comprehend all described attack scenarios, especially after a certain time.

Acquisition of input data is a challenge in itself in risk assessment methodologies [67], and attack trees magnify it due to their large number of leaf nodes [7, 45]. The standard approach for attack trees, in which only leaf nodes are annotated with values, can be too restrictive, as data for intermediate nodes is often more readily available than data for leaves. This observation is further reinforced if we consider the costs of data collection in an organization (in terms of effort, time, etc.). Sometimes, more generic data than expected is available, e.g., from historical databases, multiple surveys and empirical results [15]. In this case, correlating and normalizing the data to fit the attack tree methodology can become a challenge.

We notice that there exists a conceptual mismatch between research on attack trees and practical applications of attack trees. As we mentioned, in practice, attack trees are constructed by following the top-down approach. Yet, academic papers on attack trees define semantics and quantitative analysis techniques for these models via the leaf nodes, i.e., the lowest-level events (bottom-up approach). This makes it hard to implement a consistent feedback loop between design and analysis.

3 Research Trends in Attack Tree Applications

Given the challenges summarized in Sect. 2, several promising research directions have emerged recently to address the needs of practitioners.

Attack tree generation. Manual design is the state of practice for attack trees [56, 59]. However, this exercise is time-consuming and error-prone. Automated tree generation techniques have emerged very recently, and only a few approaches have been reported in the literature so far [16, 20, 24, 50, 66]. These works provide means to generate attack trees from some system model, under the assumption that it is easier for the team to design a good system model than a good attack tree. However, there is still some way to go before generated attack trees can be used in security risk assessment practice. First of all, the techniques reported in [24, 66] generate refinement-unaware trees, i.e., trees that do not support the user in understanding the various levels of abstraction. In line with the propositional semantics of attack trees [39], but in contrast with the expectations of a security analyst, [66] interprets each intermediate node only as a combination of its child nodes. The techniques in [16, 24] offer trees with meaningful intermediate nodes, yet these trees still lack a proper refinement structure, in which more abstract subgoals are refined into more precise attack steps.

The earliest approach capable of generating refinement-aware trees is ATSyRA [50, 51]. It extracts the refinement structure from a hierarchy of actions in the model defined by the expert. More recently, Gadyatskaya et al. [20] showed that both the semantical domain and the refinement structure of a tree can be obtained from a system model, without the need for expert intervention. Although the approach is promising, as it allows for fully automatic attack tree generation, it still cannot produce proper labels for the refinement structure.

Other attack tree generation approaches work with established security catalogues and knowledge bases, and attempt to construct attack trees from them. Knowledge bases and catalogues that systematize information on attacks, vulnerabilities and countermeasures are a trusted source of information for security risk practitioners, and many security risk management techniques include one or more catalogues [2, 9, 42]. Suggestions to apply established catalogues of threats to facilitate manual tree design have been voiced in [15, 59]. Furthermore, for some knowledge bases, it is straightforward to transform certain attack scenarios described in them into attack trees. An analyst can then manually produce more complex threat scenarios from these attack trees [21, 64]. Techniques to automate attack tree generation using security catalogues and libraries have been reported in [46, 52]. The TREsPASS project [53, 65] has applied a security knowledge base to attack trees generated from a system model in order to refine leaf nodes of particular types, mainly for precise attacks on human agents and processes, such as social engineering or hacking.

Attack tree generation makes it possible to reason about formal properties of the obtained models. Indeed, for manually designed trees, it is understood that these models are only as complete as the knowledge and experience of the experts who designed them [57, 59]. When attack trees are produced from an underlying system model or a knowledge base, it is possible to define the notion of completeness with respect to the model, and one can investigate whether an approach generates complete trees. For example, completeness with respect to a knowledge base is established as a desired property in [46], and completeness of a generated tree with respect to a system model is established in [20, 24]. Completeness is especially critical for risk managers and security consultants, as they want to be reassured that no important attacks are missed.

Another interesting property of generated attack trees is soundness with respect to a system model, i.e., whether all attack scenarios captured by a tree are valid attacks in the model. Audinot and Pinchinat defined the soundness property for generated attack trees [6]. Soundness is critical for generating refinement-aware trees. Indeed, refinement establishes how abstract actions can be represented as combinations of more precise ones. Yet, not all combinations of precise actions can result in a valid attack in the system model.

Attack tree visualization. The TREsPASS project has proposed means for visualizing large attack trees [22, 37, 48]. This visualization portfolio strives to hide the complexity of the tree by removing the node labels, arranging the tree circularly, linearizing complex attack scenarios, and supporting zooming in and out (at the visualization level). These methods reduce the cognitive effort needed to process a complex tree, yet they contrast with traditional manually designed attack trees, where meaningful node labels are essential, trees are arranged vertically to keep labels readable, and the non-linearity of attack scenarios gives an opportunity to reason about complex attacks [58, 59].

Empirical studies with attack trees. To the best of our knowledge, Opdahl and Sindre [43] and Karpati et al. [27] have been the only ones reporting on empirical studies with attack trees. These studies compared attack trees with misuse cases in the context of threat assessment, and they have reported that attack trees allowed the participants to find more threats.

4 Next Steps and Conclusions

Comparing the challenges enumerated in Sect. 2 and research results summarized in Sect. 3, we can see that some challenges are addressed by an ongoing or past research effort. Indeed, generation techniques strive to reduce the time and effort required to produce attack trees, and to provide a framework for guaranteeing absence of errors in the obtained model. Visualization approaches are helping the analysts to better comprehend attack trees and to improve the cognitive scalability of the method. Yet, these results can still be strengthened and extended towards more user-friendly models.

In particular, the generation techniques can be improved by working on the refinement-awareness of the produced models. To achieve this, we propose to establish a new refinement-aware semantics for attack trees that will make it possible to assign meaning to intermediate nodes independently of their child nodes. The generated trees will need to be correct with respect to this semantics. The refinement relationship can either be defined by an expert as in [50], or be extracted from the system model itself [20] or from an appropriate knowledge base. Furthermore, the generated trees can be transformed into semantically equivalent forms that have fewer nodes [29, 39], which could potentially improve comprehensibility.
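As an illustration of such a size-reducing transformation, the following sketch merges nested refinements that use the same operator and collapses refinements with a single child. This is a simple rewriting chosen here for illustration, assuming that AND and OR are associative in the chosen semantics; it is not the specific transformation of the cited works. The tuple-based encoding and the function names `flatten` and `count_nodes` are hypothetical. Note that the encoding carries no labels on intermediate nodes, so the sketch says nothing about preserving a refinement structure.

```python
from typing import List, Tuple, Union

# A tree is either a leaf label, or a pair (operator, list of subtrees).
Tree = Union[str, Tuple[str, List["Tree"]]]


def count_nodes(tree: Tree) -> int:
    """Count all nodes (leaves and refinements) of a tree."""
    if isinstance(tree, str):
        return 1
    return 1 + sum(count_nodes(child) for child in tree[1])


def flatten(tree: Tree) -> Tree:
    """Merge nested same-operator refinements; collapse single-child ones."""
    if isinstance(tree, str):
        return tree
    op, children = tree
    flat: List[Tree] = []
    for child in (flatten(c) for c in children):
        if not isinstance(child, str) and child[0] == op:
            flat.extend(child[1])  # OR under OR, or AND under AND: lift children
        else:
            flat.append(child)
    if len(flat) == 1:
        return flat[0]  # a refinement with one child adds nothing
    return (op, flat)


# Hypothetical tree with redundant nesting.
nested: Tree = ("OR", ["a", ("OR", ["b", ("AND", ["c", ("AND", ["d", "e"])])])])
compact = flatten(nested)

print(compact)
print(count_nodes(nested), count_nodes(compact))
```

On this example the node count drops from 9 to 7 while the set of leaf combinations that achieve the root stays the same.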

Comprehensibility and readability of graphs and the limits of human cognitive capabilities while reading and analyzing graphical data have been explored in, e.g., [13, 54]. Information visualization challenges related to usability and scalability were highlighted in [12]. It will be interesting to see the findings of these works applied in the attack tree domain.

In the security risk assessment area, comprehensibility studies of visual and textual security risk models were reported in, e.g., [23, 35, 38]. A classification of scenarios for empirical studies in information visualization was proposed by Lam et al. [36], and visualization evaluation for cyber security was discussed by Staheli et al. [61]. To the best of our knowledge, there have been no empirical studies of attack tree comprehensibility yet, and this could be a promising research direction. Indeed, outside the topic of attack trees, there is a rich empirical research literature on security modeling and assessment [10, 33, 34], software engineering [68], and requirements engineering [60]. The attack tree community can build upon this literature.

The challenges of input data acquisition, the absence of empirically grounded best practices, the trade-off between analysis and comprehensibility, and the conceptual mismatch between the top-down manual tree design process and the bottom-up formal semantics have not yet been addressed in the attack tree community.

Data availability for quantitative analysis is a complex problem, because the quality and quantity of available data strongly depend on the application domain. In the quantitative risk analysis domain, data-related challenges are known, and there exist methodologies for validating the data [67]. The attack tree community may thus strive to devise new methodologies for data validation and data-based tree validation.

We observe that the tension between detailed analysis, which requires large-scale trees, and comprehensibility, which tends to drop with the size of the tree, can be mitigated by means of model transformation techniques. Model transformation is fundamental in Computer Science and key in model-driven software development [14], as it provides models at different levels of abstraction in a synchronized way. In that regard, attack tree generation can be seen as a model transformation approach, from a system model to an attack tree model. It would be interesting to see other types of transformations, e.g., from an attack tree to an attack tree, that could yield more condensed yet human-readable trees.

We argue that the misconceptions and the multitude of interpretations of attack trees can be addressed by establishing a more rigorous methodology for the practical application of attack trees, which will include an initial phase in which interpretations of the tree semantics and refinement operators are agreed upon. This methodology needs to be grounded in empirical studies with attack tree practitioners, in which they could report on the most frequent communication pitfalls they face and on how they interpret different attack tree-related aspects, such as operators and semantics.

Overall, we can conclude that the attack tree research community has made substantial progress in developing the formal framework underpinning the model. We as researchers have a huge choice of attack tree semantics, quantitative analysis techniques, software tools, and means to apply attack trees in security assessment case studies. It is also exciting to see that the research community has started to focus on the practical needs of security analysts working with manually designed attack trees in organizational threat modeling and security risk management. We believe that this synergy between research and industry can further enhance the attack tree formalism and open new horizons in attack tree research.