
1 Introduction

Secure HTTPS connections rely on the users’ browsers to obtain authentic domain-to-key bindings during set-up. With this in mind, trusted third parties called certificate authorities are used to vouch for the integrity of public keys by issuing X.509 certificates. Though the initial problem of establishing trust might appear to be solved, several new complications arise. Considering that there are hundreds of certificate authorities, all of which are capable of issuing certificates for any domain, it is challenging to concisely observe what has been issued for whom [11]. As such, a misissued or maliciously issued certificate could remain unnoticed forever, or more likely until an attack against a domain has taken place. Naturally this raises an important question: who watches the watchmen?

Google’s Certificate Transparency (CT) project proposes public logs based on append-only Merkle trees [18]. The basic idea is that an SSL/TLS certificate must be included in some log to be trusted by a browser, and because the infrastructure is public anyone can audit or monitor these logs to ensure correct behavior [6, 16]. Thus, CT allows clients to determine whether a certificate was valid at some point in time, but inclusion in a log cannot guarantee that it is current. For instance, what if a certificate has to be revoked due to a compromised private key or an entire certificate authority [15, 29]? Since the log is both chronological and append-only, affected certificates can neither be removed nor can the absence of a revocation certificate be proven efficiently [12].

Revocation Transparency (RT) is a proposed extension to CT by Laurie and Kasper [17]. The aim is to provide a separate mechanism that proves certificates unrevoked, which requires an authenticated data structure supporting efficient non-membership proofs [34]. As is, there are at least two approaches towards such proofs. One is based on sorted Merkle trees, and the other on tuple-based signed statements of the form “Key \(k_{i}\) has the value \(v_{i}\); there are no keys in the interval \((k_{i}, k_{i+1})\)” [9, 17]. We consider the former approach in terms of a sparse Merkle tree (SMT), whose scope goes far beyond RT. For example, an SMT can serve as a key building block in a wide range of applications, from persistent authenticated dictionaries to secure messaging applications [10, 20, 30, 32].

After introducing some necessary preliminaries (Sect. 2) and the approach taken here (Sect. 3), our contributions are as follows. First, building on an interesting proposal started by Laurie and Kasper [17], we define efficient caching strategies and complete recursive definitions of an SMT (Sect. 4). Second, we evaluate the security of our definitions in the multi-instance setting, comparing our design decisions with those made in CONIKS [20] (Sect. 5). Third, we examine three caching strategies experimentally for an SMT, showing different space-time trade-offs (Sect. 6). Finally, we discuss related work (Sect. 7) and end with conclusions (Sect. 8).

2 Preliminaries

We start by describing background on Merkle trees and audit paths; thereafter we present the cryptographic assumptions that our security evaluation relies on.

2.1 Merkle Trees

A Merkle tree [21] is a binary tree that incorporates the use of cryptographic hash functions. One or more attributes are inserted into the leaves, and every node derives a digest that is recursively dependent on all attributes in its subtree. That is, leaves compute the hash of their own attributes, and parents derive the hash of their children’s digests concatenated left-to-right. As further described in Sect. 5, certain digests must also be encoded with additional constants so that different types of nodes cannot be mistaken for one another [8, 20].

Figure 1 illustrates a Merkle tree without a proper encoding. It contains eight attributes \(\rho {}\)–\(\omega {}\), and the root digest \(r{} \leftarrow {} d{}^{3}_{0}\) serves as a reference to prove membership by presenting an audit path [18]. For instance, dashed nodes are necessary to authenticate the third left-most leaf containing attribute \(\tau {}\). More generally, an audit path comprises all siblings along the path down to the leaf being authenticated. Combined with a retrieved attribute, this forms a proof of membership which is valid if it reconstructs the root digest \(r{}'\) such that \(r{}' = r{}\). Note that a proof is only as convincing as \(r{}\), but trust can be established using, e.g., digital signatures or by periodically publishing roots in a newspaper.

Fig. 1. A Merkle tree containing attributes \(\rho {}\)–\(\omega {}\). The digest rooted at height \(h{}\) and index \(i{}\) is denoted by \(d{}^{h{}}_{i{}}\).
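To make the audit-path mechanics concrete, the following Go sketch (our own illustration, not the paper’s implementation) verifies a membership proof by reconstructing the root digest. It uses SHA-512/256, the hash function later selected in Sect. 6, and for brevity omits the node-type encodings discussed in Sect. 5.

```go
package merkle

import "crypto/sha512"

// hash derives the SHA-512/256 digest of its concatenated inputs.
func hash(data ...[]byte) []byte {
	h := sha512.New512_256()
	for _, d := range data {
		h.Write(d)
	}
	return h.Sum(nil)
}

// VerifyAuditPath reconstructs a candidate root digest r' from a leaf's
// attribute, the leaf's index, and the sibling digests ordered from the
// leaf up to the root. The proof is accepted iff r' equals the trusted
// root digest r.
func VerifyAuditPath(root, attr []byte, index uint64, siblings [][]byte) bool {
	d := hash(attr) // the leaf derives the hash of its own attribute
	for _, s := range siblings {
		if index&1 == 1 {
			d = hash(s, d) // current node is a right child
		} else {
			d = hash(d, s) // current node is a left child
		}
		index >>= 1 // move one level up
	}
	return string(d) == string(root)
}
```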

2.2 Setting and Cryptographic Assumptions

Inspired by Katz [13] and Melara et al. [20], we consider a computationally bounded adversary in the multi-instance setting. This means that there are many distinct SMTs, and the adversary should gain no advantage in terms of necessary computation by attacking all SMTs at once. In other words, despite the adversary’s multi-instance advantage, the goal is to provide full \(\lambda \)-bit security for each SMT. For security we rely on a collision and pre-image resistant hash function \(\mathsf {H}{}\) with digests of size \(N{}\) bits, and on Lemma 1.

Lemma 1

The security of an audit path reduces to the collision resistance of the underlying hash function \(\mathsf {H}{}\).

Proof

This follows directly from the work of Merkle [21] and Blum et al. [5].

3 Sparse Merkle Trees

First we introduce non-membership proofs that are based on sorted Merkle trees; then the notion of an SMT and our approach are incrementally described.

3.1 Non-Membership Proofs and High-Level Properties

In RT and similar applications, it is crucial to prove certain values absent [17, 31, 32]. Efficient construction of such non-membership proofs can be enabled by viewing balanced binary search trees, e.g., treaps and red-black trees, as Merkle trees. Balancing rules that rotate nodes upon insertion and removal prevent proofs from degenerating into an enumeration of all nodes, the lexicographic ordering makes absence checkable by binary search, and the structure of the tree can be fixed by a trustworthy root because it is a Merkle tree. We prove non-membership by generating an audit path through binary search, and a verifying party accepts the proof as valid if there is no evidence that the tree structure is unsorted or that the root is improperly reconstructed. In other words, the absence of a value is efficiently proven due to a balanced search tree, and the proofs are convincing because the structure of the tree is fixed by a cryptographically derived root.

While an SMT also relies on the structure of the tree together with being a Merkle tree, it is different in that it requires neither balancing techniques nor certain constants when encoding digests. This is due to an intractably large Merkle tree that reserves a unique leaf \(\ell {}{}\) for every conceivable key digest. The hash of a key \(k_{}\) determines \(\ell {}{}\), and \(k_{}\) is a member or a non-member if the attribute \(a_{} \in \ell {}{}\) is set to \(a_{1}\) or \(a_{0}\), respectively. Hence, the resulting tree structure contains \(2^{N{}}\) leaves at all times, and (non-)membership can be proven by presenting an audit path for leaf \(\mathsf {H}{}(k_{})\). This set-up also implies history independence [25]: a unique set of keys produces a deterministic root digest, regardless of the order in which keys have been inserted or removed. Notably, history independence is not necessarily provided by a sorted Merkle tree (e.g., it does not hold for a red-black tree).
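For illustration, the mapping from keys to leaves can be sketched as follows in Go (the names are ours, reusing the hash helper from the earlier sketch): the digest of a key is read as an \(N\)-bit path from the root, where bit \(i\) selects the left (0) or right (1) child at depth \(i\).

```go
// leafFor maps a key to its reserved leaf: with SHA-512/256 as H,
// H(k) is an N = 256 bit index into the 2^256 conceivable leaves.
func leafFor(key []byte) []byte {
	return hash(key)
}

// bit returns the i-th left-most bit (i >= 0) of an index or base.
func bit(b []byte, i int) byte {
	return (b[i/8] >> (7 - uint(i%8))) & 1
}
```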

3.2 Tractable Representations

Considering the intractable size of an SMT, it is not without challenges to define an efficient representation. To begin with, the only reason why this is feasible traces back to the key observation that an SMT is sparse. This means that the vast majority of all leaves represent non-members, as indicated by a shared attribute \(a_{0}\), resulting in a construction where the empty subtrees rooted at height \(h{}\) derive identical default digests. The basic principle is as follows. An empty leaf computes \(d{}^{0}_{*{}} \leftarrow {} \mathsf {H}{}(a_{0})\), a node rooted at an empty subtree with height one derives \(d{}^{1}_{*{}} \leftarrow {} \mathsf {H}{}( d{}^{0}_{*{}} \Vert {} d{}^{0}_{*{}} )\), and so forth. Since these default digests can be precomputed, they need neither be associated with explicit nodes nor be derived recursively by visiting all leaves. Instead, referring to Fig. 2, it suffices to process the filled nodes whose digests depend on existing keys.

Fig. 2. An illustration of how subtrees with default digests can be discarded to attain a tractable representation of an SMT.
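Since default digests depend only on the height, they can be tabulated once and for all. Below is a minimal Go sketch under the plain encoding of this section (the secure \(\texttt {LH}\)/\(\texttt {IH}\) encodings of Sect. 5 are omitted, and the helpers from the earlier sketches are assumed):

```go
// defaultDigests precomputes the digest of an empty subtree for every
// height 0..n: the empty leaf derives H(a0), and each level above
// derives the hash of two identical default digests one level below.
func defaultDigests(n int, a0 []byte) [][]byte {
	ds := make([][]byte, n+1)
	ds[0] = hash(a0)
	for h := 1; h <= n; h++ {
		ds[h] = hash(ds[h-1], ds[h-1])
	}
	return ds
}
```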

3.3 Earlier Proposals

Different approaches can be used to provide efficient representations of an SMT. Bauer [3] has proposed an explicit pruned tree structure where all the non-empty attributes are elevated upwards through their ancestors. The elevation stops when the root of a subtree containing a single non-empty leaf is reached, and all descendants of such roots are discarded. The original SMT can be reconstructed by recording indices for the non-empty leaves in each subtree, but this requires excessive amounts of memory unless the non-empty leaves are evenly spread out. Hence, while the proposal is neat, we find the approach started by Laurie and Kasper [17] more generally applicable. It is based on maintaining a collection of keys \(\mathcal {K}{}\), and the collection is authenticated by simulating an SMT. As is, however, their proposal is incomplete and cannot, e.g., derive (non-)membership proofs efficiently, because subtrees’ digests are derived over and over again. We solve this issue in the following sections by introducing relative information.

3.4 Our Approach

We approach the SMT in terms of a simulation (Definition 1). Let us start by considering the simplest case of no relative information, and then explain why relative information is necessary.

Definition 1

A simulated SMT is the composition of (i) a data structure \(\mathcal {D}{}\) containing unique keys \(k_{}\), and (ii) a collection of cached digests, referred to as the relative information \(\delta {}\). Both structures define operations for insertion, removal, and look-up; \(\mathcal {D}{}\) also supports splitting, i.e., dividing it in two based on a key.

Our SMT is simulated in the sense that there is no explicit tree structure, which is possible because every \(k_{} \in {} \mathcal {D}{}\) can be mapped to its associated subtrees recursively. For example, as shown in Fig. 3, a root digest can be obtained by simulating a traversal from the root down to all the non-empty leaves. The base is initially set to all zeros and refers to the left-most leaf in a subtree. It remains the same on left traversals, must be updated by setting the appropriate bit to one on right traversals, and is used to determine the split index. The split index is the key on which \(\mathcal {D}{}\) is divided, and it refers to the left-most leaf in the right subtree. Thus, as formalized in Sect. 4.3, it is an upper exclusive and a lower inclusive bound for the keys in the left and right subtrees, respectively.

Fig. 3. An illustration of a recursive traversal to obtain the root digest; \(k_{1} = 000\), \(k_{2} = 010\), and \(k_{3} = 111\).
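A Go sketch of the traversal in Fig. 3 follows (identifiers are ours, not the paper’s implementation). It assumes \(\mathcal {D}\) is kept as a lexicographically sorted slice of key digests, so dividing on the split index is a binary search, and it reuses the helpers from the earlier sketches.

```go
import (
	"bytes"
	"sort"
)

const N = 256 // digest size in bits; the tree has 2^N leaves

var a1 = []byte{0x01} // attribute of a non-empty leaf (assumed encoding)

// rootDigest simulates the traversal of Fig. 3. An empty subtree yields
// its precomputed default digest, a non-empty leaf derives H(a1), and an
// interior node recurses with the keys divided on the split index.
func rootDigest(h int, base []byte, keys [][]byte, defaults [][]byte) []byte {
	if len(keys) == 0 {
		return defaults[h]
	}
	if h == 0 {
		return hash(a1)
	}
	// The split index is the base with bit N-h set to one: the
	// left-most leaf of the right subtree.
	split := setBit(base, N-h)
	i := sort.Search(len(keys), func(j int) bool {
		return bytes.Compare(keys[j], split) >= 0
	})
	left := rootDigest(h-1, base, keys[:i], defaults)   // keys < split
	right := rootDigest(h-1, split, keys[i:], defaults) // keys >= split
	return hash(left, right)
}

// setBit returns a copy of base with its i-th left-most bit set to one.
func setBit(base []byte, i int) []byte {
	c := append([]byte(nil), base...)
	c[i/8] |= 1 << (7 - uint(i%8))
	return c
}
```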

Clearly, it is inefficient to obtain a subtree’s digest by repeatedly visiting all the non-empty leaves. Therefore relative information is necessary: a collection of cached digests with the sole purpose of preventing such inefficiency. For instance, a naïve caching strategy could record every digest that is non-default. Although that requires excessive amounts of memory, it would ensure that all siblings’ digests are available upon generating audit paths. Consequently, the number of splits will be constant, and (non-)membership can be proven with the same time complexity as the underlying split operation. Our aim when defining caching strategies is to preserve this property while reducing memory requirements.

4 Efficient Representations

First we define caching strategies that are based on capturing branches; then we formalize our proposal by presenting complete recurrences for an efficient SMT.

Definition 2

A branch is an interior node in a Merkle tree for which both children derive non-default digests [27].

4.1 Caching Strategies

During the design of a caching strategy it is important to consider expected and worst case scenarios. The former is somewhat straightforward since the output of \(\mathsf {H}{}\) is uniformly distributed, whereas the latter is both strategy and use-case dependent. That is, the non-empty leaves will be evenly spread out in the average case, and a cluster of non-default digests will therefore be formed at the higher \(\left\lceil \log {n}\right\rceil +1\) layers. If these digests are captured by the relative information, the traversals down to the leaves can be prevented. The digests rooted at layers below the dense threshold are of lesser importance due to the sparse property, but can be vital if a worst case ever occurs. For example, an intuitive caching strategy that we omit is to record the higher \(\left\lceil \log {n}\right\rceil +1\) layers of the SMT. Although the dense part would be captured in the average case, forcing leaves to clump in some subtree is trivial for an adversary that selects the keys. Hence, a large majority of the non-default digests cannot be captured, and the resulting cache will be useless if (non-)membership proofs are issued for the clumped subtree. This is the reason why our caching strategies revolve around capturing branches (Definition 2), aiming to bound the number of recursive traversals down to the leaves by a constant. As desired, it then follows that the time necessary to generate an audit path, or equivalently the time necessary to update the status of a single key, reduces to the underlying split operation.

B Cache. Figure 4a depicts the \(\text {B}{}\) cache, which captures every digest rooted at a branch. It contains \(n-1\) digests at all times, and requires at most \(N{}\) traversals down to either a branch or a leaf upon generating audit paths. The former follows from the observation that every insertion but the first yields a single new branch, and the latter (i.e., the worst case) is discussed in Sect. 5.3.

Fig. 4. Captured digests as the circled subtrees contain a single non-empty leaf.

B \(^{-}\) Cache. By discarding \(\text {f}(n){}\) branches from the \(\text {B}{}\) cache, memory requirements can be reduced at the cost of additional computation. This forms the notion of \(\text {B}{}^{-}_{}\), which provides trade-offs depending on how \(\text {f}(n){}\) is implemented. We examine a probabilistic approach where a branch is captured with probability p, meaning \(\text {f}(n){}\) is roughly \(n(1-p)\). Other variations of \(\text {f}(n){}\) include ignoring every other layer, as well as defining an upper bound for how many branches to ignore.

B \(^\mathbf{+ }\) Cache. The drawback of using a \(\text {B}{}\) cache is that, in the average case, only the higher \(\left\lceil \log {n}\right\rceil \) layers will be captured. In other words, since the dense part also spans layer \(\left\lceil \log {n}\right\rceil +1\), some performance is left on the table. \(\text {B}{}^{\text {+}}{}\) aims to solve this issue by capturing branches together with their children. The resulting cache covers the entire dense part of the SMT, but for the sake of efficiency we also limit the worst case memory requirements to \(2n\) by discarding branches (Fig. 4b). The difference is negligible with regard to time, considering that a branch can derive its digest in constant time from its cached children.

4.2 The Cache Routine

Implementation-wise our caching strategies are convenient. To process an interior digest, a cache function that accepts the left and right child digests can be used. Upon invocation it computes the interior digest \(d{}^{}_{}\), examines if both children are non-default, deletes the previous branch if applicable, caches in case of a new branch, and outputs \(d{}^{}_{}\). While this algorithm merely concerns the \(\text {B}{}\) cache, it extends perfectly to \(\text {B}{}^{-}_{}\) and \(\text {B}{}^{\text {+}}{}\). Therefore these caching strategies are practical to mix: start off with \(\text {B}{}^{\text {+}}{}\), switch to \(\text {B}{}\) as memory requirements grow larger, and finally migrate to \(\text {B}{}^{-}_{}\) with shrinking probability p. For instance, this could be interesting in real-world scenarios where memory is a limited resource.
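A hedged sketch of such a cache function for the \(\text {B}{}\) cache follows; the map-based layout and the names are assumptions on our part, and the imports and helpers are those of the earlier sketches plus fmt.

```go
import (
	"bytes"
	"fmt"
)

// Cache maps a node's location, identified by base and height, to its
// non-default digest.
type Cache map[string][]byte

func location(base []byte, h int) string {
	return fmt.Sprintf("%x/%d", base, h)
}

// C computes the interior digest of the node at (base, h) and keeps the
// B cache up-to-date: the node is a branch iff both children derive
// non-default digests, in which case the digest is cached; otherwise a
// stale entry (a node that ceased being a branch after removals) is
// evicted.
func (c Cache) C(base []byte, h int, left, right []byte, defaults [][]byte) []byte {
	d := hash(left, right)
	branch := !bytes.Equal(left, defaults[h-1]) && !bytes.Equal(right, defaults[h-1])
	if branch {
		c[location(base, h)] = d
	} else {
		delete(c, location(base, h))
	}
	return d
}
```

Under the same sketch, \(\text {B}{}^{-}_{}\) would additionally skip the caching step with probability \(1-p\), and \(\text {B}{}^{\text {+}}{}\) would store the children’s digests alongside the branch.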

4.3 Recurrences

Let \(h{}\) be the height of a subtree, \(b{}\) the base of a node, and \(\mathcal {D}{}\) a data structure containing unique keys’ digests \(\mathsf {H}{}(k_{})\). Further denote by \(\alpha {}_{i}\) the \(i\)th (\(i \ge {} 0\)) left-most bit in \(\alpha {}\), by \(\alpha {}_{i=\beta {}}\) the assignment of that bit to \(\beta {} \in {} \{0,1\}\), and by colon (:) list concatenation. Finally, define the bit in the base that is set on right traversals as \(b{}_{N{}-h{}}\), the split index as \(s{} \leftarrow {} b{}_{N{}-h{}=1}\), and \(\mathcal {D}{}\) divided on \(s{}\) for relation \(R{}\) as \(\mathcal {D}{}_{R{}s{}}\) (e.g., \(\mathcal {D}{}_{<s{}}\) for the left and \(\mathcal {D}{}_{\ge {}s{}}\) for the right subtree). Our recurrences are shown in Fig. 5:

  • Given a height \(h{}\), (1) derives the default digest \(d{}^{h{}}_{*{}}\). The leaf hash (\(\texttt {LH}{}\)) and interior hash (\(\texttt {IH}{}\)) functions serve the purpose of encoding digests securely, as further described in Sect. 5.

  • Given a height \(h{}\), a base \(b{}\), and a collection of keys \(\mathcal {D}{}\), (2) derives the digest \(d{}^{h{}}_{b{}}\). The base case occurs if there is relative information available, if a default digest is applicable, or if a non-empty leaf is reached. Otherwise, (2) performs two recursive calls with \(\mathcal {D}{}\) divided on \(s{}\), \(b{}\) updated in the event of a right traversal, and \(h{}\) reduced by one.

  • Given a height \(h{}\), a base \(b{}\), a collection of keys \(\mathcal {D}{}\), and a key \(k_{}\) for leaf \(\ell {}{}\), (3) generates an audit path for \(\ell {}{}\). Note that the siblings’ digests are gathered by list concatenation, repeatedly invoking (2) after reaching \(\ell {}{}\).

  • Given a height \(h{}\), a base \(b{}\), an audit path \(P\) for key \(k_{}\), and an attribute \(a_{} \in {} \{{a_{0}, a_{1}}\}\), (4) reconstructs the root digest by traversing the tree down to the leaf being authenticated. Every sibling’s digest is obtained from \(P[j{}]\).

  • Given a height \(h{}\), a base \(b{}\), a collection of keys \(\mathcal {D}{}\), a subset of keys \(\mathcal {K}{} \subset {} \mathcal {D}{}\) where \(\mathcal {K}{} \ne {} \emptyset {}{}\), and an attribute \(a_{} \in {} \{{a_{0}, a_{1}}\}\), (5) outputs the new root digest and updates the relative information. This is achieved by visiting all leaves \(\ell {}{} \in {} \mathcal {K}{}\), also invoking the cache function (\(\texttt {C}{}\)) to compute the interior digest \(d{}^{h{}}_{b{}}\) and ensure that the relative information is up-to-date.

Fig. 5. Recurrences that derive default digests (\(\xi {}{}\)), root digests (\(\texttt {R}{}\)), audit paths (\(\texttt {A}{}\)), reconstructed root digests (\(\texttt {B}{}\)), and relative information (\(\texttt {U}{}\)).

The size of an audit path is \({\mathcal {O}}(N{})\), but can be further reduced by discarding default digests. This yields a sparse audit path, and necessitates encoding of an \(N{}\)-bitmap to determine whether each digest is default or non-default. We omit the details of such a recurrence since it is trivially added when (3)–(4) are provided.
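As an illustration of recurrence (3), the following Go sketch gathers sibling digests from the root down to a leaf. For brevity it invokes the uncached rootDigest from the Sect. 3.4 sketch for every sibling subtree; the full construction would consult the relative information first, as described above. Conventions and imports are those of the earlier sketches.

```go
// auditPath collects the sibling digests for the leaf reserved by the
// key digest kd, ordered leaf-to-root to match VerifyAuditPath.
func auditPath(kd []byte, keys, defaults [][]byte) [][]byte {
	path := make([][]byte, N)
	base := make([]byte, N/8) // the all-zero base of the root
	for h := N; h >= 1; h-- {
		split := setBit(base, N-h)
		i := sort.Search(len(keys), func(j int) bool {
			return bytes.Compare(keys[j], split) >= 0
		})
		if bit(kd, N-h) == 0 { // the leaf lies in the left subtree
			path[h-1] = rootDigest(h-1, split, keys[i:], defaults)
			keys = keys[:i]
		} else { // the leaf lies in the right subtree
			path[h-1] = rootDigest(h-1, base, keys[:i], defaults)
			keys = keys[i:]
			base = split
		}
	}
	return path
}
```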

5 Security

Consider a single SMT and assume that the hash function is fixed. Then it follows that the size of an audit path is fixed by \(N{}\) due to the structure of the tree, and consequently we can distinguish between leaves and interior nodes. This means that, for the case of a single SMT with a fixed hash function, no special encoding is necessary to distinguish between nodes, and that the security of an audit path reduces to the collision resistance of the underlying hash function (Lemma 1).

Next, to prevent an adversary from gaining any advantage when attacking several SMTs in parallel, we consider the full (concrete) security of an audit path in the multi-instance setting. Thereafter we relate our encoding of nodes to CONIKS [20], and examine the impact of caching strategies for security.

5.1 The Merkle Prefix Tree in CONIKS

As described more broadly in Sect. 7, CONIKS is a key verification service that uses a Merkle prefix tree (MPT) to authenticate the users’ key bindings [20]. An MPT can be seen as a dynamically sized and explicit SMT where empty subtrees are replaced with empty nodes. Key-bindings are mapped by a hash function \(\mathsf {H}{}\) to unique indices i, and every (non-)empty leaf in the tree is associated with a depth \(\ell \) as well as an \(\ell \)-bit unique prefix j of i. The encoding of an empty node is defined in (6).

\(d_{\mathrm{empty}} \leftarrow {} \mathsf {H}{}(C_{\mathrm{empty}} \Vert {} C_{\mathrm{tw}} \Vert {} j \Vert {} \ell {})\)   (6)

\(C_{\mathrm{empty}}\) is a constant for empty leaves and \(C_{\mathrm{tw}}\) a tree-wide constant. The encoding of a non-empty node is defined in (7).

\(d_{\mathrm{leaf}} \leftarrow {} \mathsf {H}{}(C_{\mathrm{leaf}} \Vert {} C_{\mathrm{tw}} \Vert {} i \Vert {} \ell \Vert {} p)\)   (7)

\(C_{\mathrm{leaf}}\) is a constant for non-empty leaves and p a payload. Finally, the encoding of an interior node is defined in (8).

\(d_{\mathrm{int}} \leftarrow {} \mathsf {H}{}(d_{\mathrm{left}} \Vert {} d_{\mathrm{right}})\)   (8)

The constants \(C_{\mathrm{empty}}\) and \(C_{\mathrm{leaf}}\) ensure that empty and non-empty leaves cannot be mistaken for one another, and the tree-wide constant \(C_{\mathrm{tw}}\) provides protection against an adversary in the multi-instance setting. In other words, if all MPTs use distinct tree-wide constants, no node’s pre-image can be valid across different trees. Similarly, no node’s pre-image can be valid across multiple locations because the leaves’ digests are uniquely encoded by \(j\Vert \ell \) and \(i \Vert \ell \) (the location of an interior node is implicit due to the children it commits to). Thus, as opposed to searching for collisions across different trees and locations in parallel, an adversary must target a particular tree and location.

We also need to consider different versions of the trees that are generated by updates. To accomplish full \(\lambda \)-bit security for an instance of an MPT, a new tree-wide constant must be selected after each update to prevent parallel attacks through past versions of the same tree structure. This means that for all updates, the entire MPT has to be recomputed from scratch.

5.2 A Secure Encoding for Sparse Merkle Trees

Figure 6 defines a secure encoding for an SMT in the multi-instance setting. We prevent attacks across distinct trees by introducing a tree-wide constant \(C_{\text {tw}{}}\), but we do not protect against attacks on different versions of the same tree structure because \(C_{\text {tw}{}}\) is reused between updates. For attacks within a particular tree, we include unique identifiers in every non-empty subtree. This differs from MPTs, but is necessary to preserve the sparse property of an SMT: if unique prefixes were included in the empty subtrees, there would no longer be any default digests. As shown in (10), we solve this issue and retain security by moving the encoding of an empty node into the non-empty parent. An interior node that is non-default will still commit properly to a certain location encoded by base and height, and since the digest of an empty node is publicly known even for an MPT, no security is lost. Furthermore, note that we do not encode the attributes \(a_{0}\) and \(a_{1}\) explicitly in (9); inclusion of the base suffices to distinguish between empty and non-empty leaves, considering that the height of an SMT is implicit.

Fig. 6. Secure node encodings for an SMT.

5.3 Security Aspects of Caching Strategies

Generally speaking, we often distinguish between best, worst, and average case complexities. For instance, a hash table has amortized constant look-up time, but can degrade to a linear construction if all entries hash to the same bucket. Likewise, a binary search tree that is probabilistically balanced is in danger of breaking down into a linked list. Though critics might claim that attacks based on such degradations are of theoretical interest alone, Crosby and Wallach [7] have already presented denial of service attacks that exploit algorithmic complexities. Thus, within security, it is of great importance to evaluate worst case behavior.

Let us consider the \(\text {B}{}\) cache. In the worst case, if there are merely \(N{}\) keys, an adversary could force an almost perfect spine of branches as depicted in Fig. 7. Whenever membership proofs are issued for the leaves on that spine, the large majority of all the non-default digests must be computed because the siblings’ digests are not captured by the cache. While this is not an issue for a small SMT, the worst case efficiency actually improves as the tree grows: new insertions yield additional branches, and it is more efficient to stop traversals at a branch than at a leaf. In other words, there are two scenarios each time a sibling’s digest is requested. First, the digest is default and can be requested in constant time. Second, the digest is non-default and can be derived by traversing the tree down to a branch or leaf. In either case, regardless of how an adversary selects the keys, at most \(N{}\) traversals are necessary (one per layer). A similar analysis applies to \(\text {B}{}^{\text {+}}{}\), considering that the children of all branches are captured by the relative information. For \(\text {B}{}^{-}_{}\), one can show that the number of traversals is bounded by \(\text {f}(n){}\). As such, to prevent an adversary from causing inefficiency, \(\text {f}(n){}\) must be either constant or unpredictable to the adversary.

Fig. 7. A branch spine, potentially caused by an adversary.

An almost identical analysis applies to worst case behavior during updates. This follows from the observation that (3) and (5) traverse the tree down to the leaves, invoking (2) on each layer.

6 Performance

We examined performance and space-time trade-offs experimentally using a proof-of-concept implementation in Go, selecting SHA-512/256 as the hash function, a data structure \(\mathcal {D}{}\) that supports splitting in logarithmic time, and relative information that is maintained in constant time (a hash table). Our experiments were executed on an Intel(R) Core(TM) i7-4790 CPU at 3.60 GHz with 2\(\,\times \,\)8 GB DDR4 RAM, and they utilized Go’s built-in benchmarking tool. Furthermore, the \(\text {B}{}^{-}_{}\) cache was implemented probabilistically such that a branch is captured with probability p. We tested \(\text {B}{}^{-}_{}\) for \(p \in \{0.5\ldots 0.9\}\), denoted by \(\text {B}{}^{-}_{p}\), and included \(\text {B}{}\), \(\text {B}{}^{\text {+}}{}\), and a hash treap in our experiments. For the relevant operations, i.e., insertion, removal, and look-up, the expected logarithmic time complexity of a hash treap makes it a good representative of other authenticated data structures that are explicitly stored in memory.
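For reference, a benchmark of audit-path generation under Go’s built-in tool might look as follows; newSMT and AuditPath are hypothetical constructors over the sketches above, not the paper’s published API.

```go
package smt

import (
	"crypto/rand"
	"testing"
)

// BenchmarkAuditPath measures the time to generate one audit path in a
// simulated SMT over 2^15 random keys; run with `go test -bench AuditPath`.
func BenchmarkAuditPath(b *testing.B) {
	const n = 1 << 15
	keys := make([][]byte, n)
	for i := range keys {
		keys[i] = make([]byte, 32)
		if _, err := rand.Read(keys[i]); err != nil {
			b.Fatal(err)
		}
	}
	tree := newSMT(keys) // hypothetical: sorts the digests, builds the cache
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		tree.AuditPath(keys[i%n])
	}
}
```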

Figure 8a shows the size of the authenticated data structure as a function of the size of the data structure being authenticated. There is essentially no distinction between the two for a hash treap, and in the case of an SMT this is the relation between \({\delta {}^{}_{}}\) and \(\mathcal {D}{}\). For \(2^{20}\) keys, the hash treap needs 960 MiB, the \(\text {B}{}^{\text {+}}{}\) cache 512 MiB, the \(\text {B}{}\) cache 256 MiB, and the \(\text {B}{}^{-}_{0.5}\) cache 128 MiB. It is evident that the different caches double in size, and that the size of a hash treap is roughly eight times larger than that of a \(\text {B}{}^{-}_{0.5}\) cache. Furthermore, it should be noted that the \(\text {B}{}^{-}_{p}\) caches with \(p \in \{0.6\ldots 0.9\}\) have sizes evenly distributed in \([\text {B}{}^{-}_{0.5},\text {B}{}]\).

Figure 8b shows the time required to generate an audit path. Since the full structure is in memory for the hash treap, it is just a matter of copying the nodes along the path in negligible time (0.003 ms). Similarly, for \(\text {B}{}^{\text {+}}{}\) and \(\text {B}{}\), we see consistent results that are less than 1 ms regardless of how large \(\mathcal {D}{}\) is. This is because both caching strategies ensure that the vital non-default digests are cached, whereas additional recursive traversals down to either branches or leaves are necessary for \(\text {B}{}^{-}_{p}\). Finally, we observe the impact of selecting p. While \(p>0.6\) gives an expected time that is less than 4 ms, \(p=0.5\) behaves erratically. This follows from the high probability that a sibling’s digest must be derived instead of being found in the cache, as is also evident to a smaller extent for \(p=0.6\).

Fig. 8. Space-time trade-offs for caching strategies and a hash treap (HT).

Figure 8c shows the time it takes to update m keys in a data structure containing \(n=2^{15}\) keys. All approaches scale as \(\mathcal {O}({m\log {n}})\), with the hash treap being the fastest. Similarly, Fig. 8d shows the time it takes to update \(m=256\) keys as a function of the size n. The \(\text {B}{}^{\text {+}}{}\) cache consistently needs less than 20 ms, as opposed to the hash treap which needs 9.5 ms for \(n=2^{20}\). Considering that a hash treap consumes twice as much memory, this is indeed an interesting trade-off. For the remaining caching strategies, p together with the relation between n and m determines the probability of cache misses. Put simply, a larger p yields less variance and greater efficiency in terms of time.

7 Related Work

Google considers three categories of authenticated data structures when adding transparency to a trust model: verifiable logs, maps, and log-backed maps [12]. While CT relies on verifiable logs to support efficient consistency and membership proofs, verifiable maps based on SMTs are proposed in RT to prove non-membership. This is not without issues, however. All operations must be enumerated to determine whether a map’s state is correct. The former two categories are therefore combined into a verifiable log-backed map where consistency issues can be detected by the verifiable log, (non-)membership can be proven by the verifiable map, and full audits can ensure complete correct behavior. As such, using an efficient verifiable map based on our extension of an SMT, the combination of CT and RT can prove whether a certificate’s status is current. Other CT-like proposals that an SMT could be applicable to include Distributed Transparent Key Infrastructure [35] and Enhanced Certificate Transparency [32].

Verifiable maps are closely related to persistent authenticated dictionaries (PADs) [10]. While both are dynamic, the difference is that a PAD supports (non-)membership queries to current and past versions of the data structure. By extending our representation of an SMT to a secure key-value store, adding some form of persistency yields a PAD. Crosby and Wallach [9] investigated caching strategies for tree-based PADs in conjunction with Sarnak and Tarjan versioned nodes [33]. Before that, Anagnostopoulos et al. [1] considered another technique known as path copying. We could use similar approaches for the cache in our SMT, relying entirely on existing persistent data structures to yield a PAD.

CONIKS is a privacy-preserving key-management service that allows clients to monitor their own key-bindings efficiently [20]. An MPT (see Sect. 5.1) is used for the purpose of verifiability, but prior to deriving a unique index i the key-bindings are first transformed by a verifiable unpredictable function [22]. While that prevents audit paths from leaking user information, it cannot conceal the total number of users. CONIKS solves this issue and others, e.g., ensuring fork consistency [19], by defining a protocol on top of an MPT. It appears that an SMT could be a viable and attractive replacement if viewed as a dictionary.

The issue of proving non-membership is not only evident in CT and RT. For instance, in the context of privacy-preserving transparency logging [31], Balloon plays an integral part as a provably secure append-only data structure [30]. This is accomplished using an approach towards authenticated data structures defined by Papamanthou et al. [28], as well as combining a history tree [8] and a hash treap [10, 30]. The former is essentially a verifiable log, and the latter a treap [2, 4] viewed as a Merkle tree. While hash treaps and SMTs share many properties, including efficient (non-)membership proofs and history independent representations, there are some striking differences. To begin with, hash treaps store attributes in each node. Unlike in an SMT, information regarding these attributes must be provided in an audit path due to encoding digests differently (possibly leaking valuable information). There will also be exactly n nodes at all times, and efficiency relies on a probabilistic balance. In these regards an SMT is flexible: the variable parameters \(\mathcal {D}\) and \(\delta \) determine if/when efficiency is provided, and memory requirements can be reduced to less than n if need be.

More generally we could compare an SMT to any lexicographically sorted data structure viewed as a Merkle tree, e.g., including certificate revocation trees [14] and subsequent approaches based on 2–3 trees [24]. An SMT is superior to a certificate revocation tree because the update process cannot cause the entire tree structure to be recomputed. When compared to 2–3 trees and other balanced binary search trees, the analysis is similar to that of a hash treap. Note, however, that an SMT should not be confused with authenticated data structures that are unable to prove non-membership efficiently. This means that an SMT is not intended for applications such as Bitcoin [23]: the transactions of separate blocks are grouped together in Merkle trees for the purpose of efficient integrity guarantees, not the ability to prove certain transactions absent.

Finally, this work is an extension of the Bachelor’s thesis by Dahlberg [27]. Apart from improving terminology, we defined recursions for batch updates and reconstruction of root digests, as well as caching strategies based on branches. We also added a security evaluation for full (concrete) security in the multi-instance setting, provided a publicly available implementation that uses a memory safe language, and compared our results with a related authenticated data structure.

8 Conclusion

Our definition of an SMT builds upon and extends the principles provided by Laurie and Kasper [17]. The proposal is generic in the sense that an arbitrary data structure supporting insertion, removal, look-up, and splitting can be used, and different caching strategies (\(\text {B}{}\), \(\text {B}{}^{-}_{}\), and \(\text {B}{}^{\text {+}}{}\)) provide fine-grained control over consumed space versus run time. In other words, rather than having an explicit tree structure, the resulting SMT is simulated. While this comes at the cost of additional computation when compared to explicit tree-based data structures, our performance benchmarks and worst case analysis show that our definitions are efficient regardless of how an adversary selects the keys. In addition, we showed that these definitions are secure in the multi-instance setting.

Nothing prevents further space-time trade-offs as an SMT evolves. In principle, the relation \(\text {B}{}^{-}_{} \subset \text {B}{} \subset \text {B}{}^{\text {+}}{}\) holds, so it is simple to go from one strategy to another, e.g., depending on how much memory is currently available. This is a major difference with respect to explicit tree structures, for which we are aware of no similar constructions. Furthermore, the succinct recursions used to simulate an SMT yield limited implementation complexity, and history independence is a valuable property if parallelized and distributed solutions are considered for large-scale applications.