1 Introduction

Secure communication, namely allowing Alice and Bob to exchange messages securely, over an untrusted communication channel, without having to trust any intermediate component or party, is perhaps the quintessential cryptographic problem. Indeed, constructing and breaking secure communication protocols, as well as modeling security concerns and guarantees, providing a security analysis, and then breaking the modeling and analysis, has been a mainstay of cryptography since its early days.

Successful secure communication protocols have naturally been built to secure existing communication patterns. Indeed, IPSec has been designed to provide IP-layer end-to-end security for general peer-to-peer communication without the need to trust routers and other intermediaries, while SSL (which evolved into TLS) has been designed to secure client-server interactions, especially in the context of web browsing, and PGP has been designed to secure email communication.

Securing the communication over messaging applications poses a very different set of challenges, even for the case of pairwise communication (which is the focus of this work). First, the communicating parties do not typically have any direct communication connection and may not ever be online at the same time. Instead, they can communicate only via an untrusted server. Next, the communication may be intermittent and have large variability in volumes and level of interactivity. At the same time, a received message should be processed immediately and locally. Furthermore, connections may span very long periods of time, during which it is reasonable to assume that the endpoint devices would be periodically hacked or otherwise compromised – and hopefully later regain security.

The Signal protocol has been designed to give a response to these specific challenges of secure messaging, and in doing so it has revolutionized the concept of secure communication over the Internet in many ways. Built on top of predecessors like Off-The-Record [14], the Signal protocol is currently used to transmit hundreds of billions of messages per day [49].

Modeling the requirements of secure messaging in general, and analyzing the security properties of the Signal protocol in particular, has proved to be challenging and has inspired multiple analytical works [1,2,3, 7, 10, 11, 13, 15, 17, 25,26,27,28,29,30,31,32,33, 35,36,37, 46,47,48, 52,53,55, 57]. Some of these works directly address the Signal architecture and realization, whereas others propose new cryptographic primitives that are inspired by Signal’s various modules.

The Need for Composable Security Analysis. Standalone security analyses of the Signal protocol are not always sufficient to capture the security of an entire messaging ecosystem that includes (components of) the Signal protocol. People typically participate concurrently in several conversations spanning several multi-platform chat services (e.g., smartphone and web), and the subtleties between a chat service and the underlying messaging protocol have led to network and systems security issues (e.g., [31, 32, 40]). For example, the Signal protocol is combined with other cryptographic protocols in WhatsApp [56] to perform abuse reporting or Status [50] and Slyo [51] to perform cryptocurrency transactions and Tor-style onion routing.

Moreover, Signal isn’t always employed as a single monolithic protocol. Rather, variations and subcomponents of the Signal protocol are used within the Noise protocol family [45], file sharing services like Keybase [38] (which performs less frequent ratcheting), and videoconferencing services like Zoom [39] (which isn’t concerned with asynchrony).

This state of affairs seems to call for a security analysis within a framework that allows for modular analysis and composable security guarantees. First steps in this direction were taken by the work of Jost, Maurer, and Mularczyk [37] that defines an abstract ratcheting service within the Constructive Cryptography framework [41, 42], and concurrent work by Bienstock et al. [12] that formulates an ideal functionality of the Signal protocol within the UC framework (see Sect. 1.5 for details). However, neither of these works give a modular decomposition of Signal into its basic components (as described in [44].)

The Apparent Non-modularity of Signal. One of the main sticking points when modeling and analyzing Signal in a composable fashion is that the protocol purposefully breaks away from the traditional structure of a short-lived “key exchange” module followed by a longer-lived module that primarily encrypts and decrypts messages using symmetric authenticated encryption. Instead, it features an intricate “continuous key exchange” module where shared keys are continually being updated, in an effort to provide forward security (i.e., preventing an attacker from learning past messages), as well as enabling the parties to quickly regain security as soon as the attacker loses access. Furthermore, Signal’s process of updating the shared keys crucially depends on feedback from the “downstream” authenticated encryption module. This creates a seemingly inherent circularity between the key exchange and the authenticated encryption modules, and gets in the way of basing the security of Signal on traditional components such as authenticated symmetric encryption, authenticated key exchange, and key-derivation functions.

Security of Signal in Face of Adaptive Corruptions. Another potentially thorny aspect of the security of secure messaging protocols (Signal included) is the need to protect against an adversary that decides whom and when to corrupt, adaptively, based on all the communication seen so far. Indeed, not only is standard semantic security not known to imply security in this setting: there exist encryption schemes that are semantically secure (under reasonable intractability assumptions) but completely break in such a setting [34].

1.1 This Work

This work proposes a modular analysis of the Signal protocol and its components using the language of universally composable (UC) security [18, 19]. We focus on modeling Signal at the level specified in their documentation [44] (i.e., not limited to any single choice, of cipher suite), taking care to adhere to the abstractions within the specification.

We provide an ideal functionality, \(\mathcal {F} _{\textsf{SM}} \), for secure messaging along with individual ideal functionalities that capture each module within Signal’s architecture. We then compose the modules to realize the top-level secure messaging functionality and demonstrate how to realize the modules in a manner consistent with the Signal specification [44]. Our instantiation achieves adaptive security against transient corruptions while making minimal use of the random oracle model. This combination of composability and modularity makes Signal and its components conveniently plug-and-play: future analyses can easily re-purpose or swap out instantiations of the modules in this work without needing to redo most of the security analysis.

In the process, we propose a new abstraction for Signal’s continuous key derivation module, which we call a Cascaded PRF-PRG (CPRFG), and we show that it suffices for Signal’s continuous key exchange module to achieve adaptive security. We also show how to construct CPRFGs from PRGs and puncturable PRFs. This may be of independent interest.

The rest of the Introduction is organized as follows. Section 1.2 presents and motivates our formulation of \(\mathcal {F} _{\textsf{SM}} \). Section 1.3 presents and motivates the formulation of the individual modules, and describes how these modules can be realized. Section 1.5 discusses related work.

1.2 On the Ideal Secure Messaging Functionality, \(\mathcal {F} _{\textsf{SM}} \)

We provide an ideal functionality \(\mathcal {F} _{\textsf{SM}} \) that captures end-to-end secure messaging, with some Signal-specific caveats. The goal here is to provide idealized security guarantees that will allow the analysis of existing protocols that use Signal, as well as enable Signal (or any protocol that realizes \(\mathcal {F} _{\textsf{SM}} \)) to be readily usable as a component within other protocols in security-preserving manner.

When a party asks to encrypt a message, \(\mathcal {F} _{\textsf{SM}} \) returns a string to the party that represents the encapsulated message. When a party asks to decrypt (and provides the representative string), the functionality checks whether the provided string matches a prior encapsulation, and returns the original message in case of a match. The encapsulation string is generated via adversarially provided code that doesn’t get any information about the encapsulated message, thereby guaranteeing secrecy.

Simple User Interface. The above encapsulation and decapsulation requests are the only ways that a parent protocol interacts with \(\mathcal {F} _{\textsf{SM}} \). In particular, the parent protocol is not required to keep state related to the session, such as epoch-ids or sequence numbers. In addition to simplicity, this imparts the additional guarantee that a badly designed parent protocol cannot harm the security of a protocol realising \(\mathcal {F} _{\textsf{SM}} \).

Abstracting Away Network Delivery. The fact that \(\mathcal {F} _{\textsf{SM}} \) models a secure messaging scheme as a set of local algorithms (an encapsulation algorithm and a decapsulation one) substantially simplifies traditional UC modeling of secure communication, where the communication medium is modeled as part of the service provided by the protocol and the actual communication is abstracted away.

Furthermore, the fact that \(\mathcal {F} _{\textsf{SM}} \) returns to the parent protocol an actual string (that represents an idealized encapsulated message) allows the parent protocol to further process the string as needed, similarly to what’s done in existing systems.

Immediate Decryption. \(\mathcal {F} _{\textsf{SM}} \) guarantees that message decapsulation requests are fulfilled locally on the receiver’s machine, and are not susceptible to potential network delays. Furthermore, this holds even if only a subset of the messages arrive, and arrival is out of order (as formalized in [1]). To provide this guarantee within the UC framework, we introduce a mechanism that enables \(\mathcal {F} _{\textsf{SM}} \) to execute adversarially provided code, without enabling the adversary to prevent immediate fulfillment of a decapsulation request. See more details in Sect. 2.

Modeling of PKI and Long Term Keys. We directly model Signal’s specific design for the public keys and associated secret keys that are used to identify parties across multiple sessions. Specifically, we formulate a “PKI” functionality \(\mathcal {F} _{\textsf{DIR}} \) that models a public “bulletin board,” which stores the long-term, ephemeral, and one-time public keys associated with identities of parties. In addition, we model “long term private key” module \(\mathcal {F} _{\textsf{LTM}} \) for each identity. This module stores the private keys associated with the public keys of the corresponding party. Both functionalities are modeled as global, namely they are used as subroutines by multiple instances of \(\mathcal {F} _{\textsf{SM}} \). This modeling is what allows to tie the two participants of a session to long-term identities. Similarly to [16, 24], we treat these modules as incorruptible. It is stressed, however, that, following the Signal architecture, our realization of \(\mathcal {F} _{\textsf{SM}} \) calls the \(\mathcal {F} _{\textsf{LTM}} \) module of each party exactly once, at the beginning of the session.

Modelling Corruptions. Resilience to recurring but transient break-ins is one of the main design goals of Signal. We facilitate the exposition of these properties as follows. First, we model corruption as an instantaneous event where the adversary learns the entire state of the corrupted party.

The security guarantees for corruption and recovery are then specified as follows. When the adversary instructs \(\mathcal {F} _{\textsf{SM}} \) to corrupt a party, it is provided all the messages that have been sent to that party and were not yet received. In addition, the party is marked as compromised until a certain future point in the execution. While compromised, all the messages sent and received by the party are disclosed to the adversary, who can also instruct \(\mathcal {F} _{\textsf{SM}} \) to decapsulate ciphertexts to any plaintext of its choice. This captures the fact that as long as any one of the parties is compromised, neither party can securely authenticate incoming messages.

Forward secrecy guarantees that the adversary learns nothing about any messages that have been sent and received by the party until the point of corruption. Furthermore, the adversary obtains no information on the history of the session such as its duration or the long term identity of the peer. In \(\mathcal {F} _{\textsf{SM}} \), this is guaranteed because corruption does not provide the adversary with any messages that were previously sent and successfully received.

On the other hand, the specific point by which a compromised party regains its security is Signal-specific and described in more detail within. After this point, the adversary no longer obtains the messages the messages sent and received by the parties; furthermore, the adversary can no longer instruct \(\mathcal {F} _{\textsf{SM}} \) to decapsulate forged ciphertexts.

Resilience to Adaptive Corruptions. All the security guarantees provided by \(\mathcal {F} _{\textsf{SM}} \) hold in the presence of an adversary that has access to the entire communication among the parties and adaptively decides when and whom to corrupt based on all the communication seen so far. In particular, we do not impose any restrictions on when a party can be corrupted.

Signal-Specific Limitations. The properties discussed so far relate to the general task of secure messaging. In addition, \(\mathcal {F} _{\textsf{SM}} \) incorporates the following two relaxations that represent known weaknesses that are specific to the Signal design.

First, Signal does not give parties a way to detect whether their peers have received forged messages in their name during corruption. (Such situations may occur when either party was corrupted in the past and then recovered.) This represents a known weakness of Signal [15, 31]. Consequently, \(\mathcal {F} _{\textsf{SM}} \) exhibits a similar behavior.

Second, as remarked in the Signal documentation [44], when one of the parties is compromised, an adversary can “fork” the messaging session. That is, the adversary can create a person-in-the-middle situation where both parties believe they are talking with each other in a joint session, and yet they are actually both talking with the adversary. Furthermore, this can remain the case indefinitely, even when no party is compromised anymore. (In fact, we know this situation is inherent in an unauthenticated network with transient attacks, at least without repeated use of a long-term uncompromised public key [20].) While such a situation is mentioned in the Signal design documents, pinpointing and analyzing the conditions under which forking occurs has not been formally done before our work and the concurrent work by Bienstock et al. [12]. In our modeling, \(\mathcal {F} _{\textsf{SM}} \) forks when one of the parties is compromised, and at the same time the other party successfully decapsulates a forged incoming message with an “epoch ID” that is different than the one used by the sender. In that case, \(\mathcal {F} _{\textsf{SM}} \) remains forked indefinitely, without any additional corruptions.

1.3 Realizing \(\mathcal {F} _{\textsf{SM}} \), Modularly

Signal’s strong forward secrecy and recovery from compromise guarantees are obtained via an intricate mechanism where shared keys are continually being updated, and each key is used to encapsulate at most a single message.

To help keep the parties in sync regarding which key to use for a given message, the conversation is logically partitioned into sending epochs, where each sending epoch is associated with one of the two parties, and consists of all the messages sent by that party from the end of its previous sending epoch until the first time this party successfully decapsulates an incoming message that belongs to the peer’s latest sending epoch.

Within each sending epoch, the keys are pseudorandomly generated one after the other in a chain. The initial chaining key for each epoch is generated from a ‘root chain’ that ratchets forward every time a new sending epoch starts. Each ratcheting of the root chain involves a Diffie-Hellman key exchange; the resulting Diffe-Hellman secret is then used as input to the root ratchet (along with an existing chaining value). The public values of each such Diffie-Hellman exchange are piggybacked on the messages within the epoch and therefore authenticated using the same AEAD used for the data. Furthermore, these public values are used as unique identifiers of the sending epoch that each message is a part of. This mechanism allows the parties to keep in sync without storing any long-term information about the history of the session.

The Signal architecture document [44] de-composes the above mechanism into 3 main cryptographic modules, plus non-cryptographic code used to put these modules together. The modules are: (1) a symmetric authenticated encryption with associated data (AEAD) scheme that is applied to individual messages; (2) a symmetric key ratcheting mechanism to evolve the key between messages within an epoch; (3) an asymmetric key ratcheting (or “continuous key exchange”) mechanism to evolve the “root chain.” Since these modules are useful for applications beyond this particular protocol, we follow this partitioning and decompose Signal’s protocol into similar components. (Our partitioning into components is also inspired by that of Alwen et al. [1].)

We model the security of each component as an ideal functionality within the UC framework. (These are \(\mathcal {F} _{\textsf{aead}},\mathcal {F} _{\textsf{mKE}},\mathcal {F} _{\textsf{eKE}} \), respectively.) This allows us to distill the properties provided by each module and demonstrate how they can be composed, along with the appropriate management code to obtain the desired functionality—namely to realize \(\mathcal {F} _{\textsf{SM}} \). The management code (specifically, protocols \(\varPi _{\mathsf {fs\_aead}} \) and \(\varPi _{\textsf{SGNL}} \)), does not directly access any keying material. Indeed, these protocols realise their respective specifications, namely \(\mathcal {F} _{\mathsf {fs\_aead}} \) and \(\mathcal {F} _{\textsf{SM}} \), perfectly—see Theorems 1 and 2.

Before proceeding to describe the modules in more detail, we highlight the following apparent circularity in the security dependence between these modules: the messages in each sending epoch need to be authenticated (by the AEAD in use) using a key k that’s derived from the message itself. Thus, modular security analysis along the above partitioning to modules might initially appear to be impossible.

The critical observation that allows us to proceed with modular decomposition is that the continuous key exchange module (which in our modeling corresponds to \(\mathcal {F} _{\textsf{eKE}} \)) need not determine the authenticity of new epoch identifiers. Rather, this module is only tasked to assign a fresh pseudorandom secret key with each new epoch identifier, be it authentic or not. The determination of whether a new purported epoch identifier is authentic (or a forgery caused by an adversarially generated incoming message) is done elsewhere – specifically at the management level.

We proceed to provide a more detailed overview of our partitioning and the general protocol logic. See also Fig. 1.

Fig. 1.
figure 1

Modeling and realizing secure messaging: The general subroutine structure. Ideal functionalities are denoted by \(F_{}\) and protocols by \(\varPi \). Thin vertical arrows denote subroutine calls, whereas thick horizontal arrows denote realization. Functionalities \(\mathcal {F} _{\textsf{DIR}},\mathcal {F} _{\textsf{LTM}} ,\mathcal {F} _{\textsf{pRO}} \) are global with respect to \(\mathcal {F} _{\textsf{SM}} \), whereas \(\mathcal {F} _{\textsf{eKE}} \) (and \(\varPi _{\textsf{eKE}})\) are global for \(\mathcal {F} _{\textsf{mKE}} \), and each instance of \(\mathcal {F} _{\textsf{mKE}} \) (and the corresponding instance of \(\varPi _{\textsf{mKE}})\) are global for \(\mathcal {F} _{\textsf{aead}} \). (Color figure online)

\(\mathcal {F} _{\textsf{eKE}} \). The core component of the protocol is the epoch key exchange functionality \(\mathcal {F} _{\textsf{eKE}} \), which captures the generation of the initial shared secret key from the public information, as well as the continuous Diffie-Hellman protocol that generates the unique epoch identifiers and the “root chain” of secret keys. Whenever a party wishes to start a new epoch as a sender, it asks \(\mathcal {F} _{\textsf{eKE}} \) for a new epoch identifier, as well as an associated secret key. The receiving party of an epoch must present an epoch identifier, and is then given the associated secret key.

As mentioned, we allow the receiving party of a new epoch to present multiple potential epoch identifiers, and obtain a secret epoch key associated with each one of these identifiers. Furthermore, while only one of these keys is the one used by the sender for this epoch, all the keys provided by \(\mathcal {F} _{\textsf{eKE}} \) are guaranteed to appear random and independent to the adversary. In other words, \(\mathcal {F} _{\textsf{eKE}} \) leaves it to the receiver to determine which of the candidate identifiers for the new epoch is the correct one. (If \(\mathcal {F} _{\textsf{eKE}} \) recognizes, from observing the corruption activity and the generated epoch IDs, that the session has forked, then it exposes the secret keys to the adversary.) We postpone the discussion of realizing \(\mathcal {F} _{\textsf{eKE}} \) to the end of this section.

\(\mathcal {F} _{\textsf{mKE}} \). The per-epoch key chain is captured by an ideal functionality \(\mathcal {F} _{\textsf{mKE}} \) that is identified by an epoch-id, and generates, one at a time, a sequence of random symmetric keys associated with this epoch-id. The length of the chain is not a priori bounded; however, once \(\mathcal {F} _{\textsf{mKE}} \) receives an instruction to end the chain for a party, it complies. \(\mathcal {F} _{\textsf{mKE}} \) guarantees forward secrecy by making each key retrievable at most once by each party; that is, the key becomes inaccessible upon first retrieval, even for a corrupted party. However, it does not post-compromise security: once corrupted, all the future keys in the sequence are exposed to the adversary.

\(\mathcal {F} _{\textsf{mKE}} \) is realized by a protocol, \(\varPi _{\textsf{mKE}} \), that first calls \(\mathcal {F} _{\textsf{eKE}} \) with its current epoch-id, to obtain the initial chaining key associated with that epoch-id. The rest of the keys in this epoch are derived using a generic length-doubling PRG (of which Signal’s typical instantiation using HKDF is a special case).

Demonstrating that \(\varPi _{\textsf{mKE}} \) realizes \(\mathcal {F} _{\textsf{mKE}} \) is relatively straightforward, except for the need to address the fact that the same instance of \(\mathcal {F} _{\textsf{eKE}} \) is used by multiple instances of \(\varPi _{\textsf{mKE}} \). Using the formalism of [5], we thus show that \(\varPi _{\textsf{mKE}} \) UC-realizes \(\mathcal {F} _{\textsf{mKE}} \) in the presence of a global \(\mathcal {F} _{\textsf{eKE}} \).

\(\mathcal {F} _{\textsf{aead}} \). Authenticated encryption with associated data is captured by ideal functionality \(\mathcal {F} _{\textsf{aead}} \), which provides a one-time ideal authenticated encryption service: the encrypting party calls \(\mathcal {F} _{\textsf{aead}} \) with a plaintext and a recipient identity, and obtains an opaque ciphertext. Once the recipient presents the ciphertext, \(\mathcal {F} _{\textsf{aead}} \) returns the plaintext. (The recipient is given the plaintext only once, even when corrupted.) The “associated data,” namely the public part of the authenticated message, is captured via the session identifier of \(\mathcal {F} _{\textsf{aead}} \).

\(\mathcal {F} _{\textsf{aead}} \) is realized via protocol \(\varPi _{\textsf{aead}} \), which employs an authenticated encryption algorithm using a key obtained from \(\mathcal {F} _{\textsf{mKE}} \). If we had opted to assert security against non-adaptive corruptions, any standard AEAD scheme would do. However, we strive to provide simulation-based security in the presence of fully adaptive corruptions, which is provably impossible in the plain model whenever the key is shorter than the plaintext [43]. We get around this issue by realizing \(\mathcal {F} _{\textsf{aead}} \) in the programmable random oracle model. While we provide a very simple AEAD protocol in this model, many common block cipher-based AEADs can also realize \(\mathcal {F} _{\textsf{aead}} \) provided we model the block cipher as a programmable random oracle. It is stressed however that the random oracle is used only in the case of short keys and adaptive corruptions. In particular, when corruptions are non-adaptive or the plaintext is sufficiently short, our protocol continues to UC-realize \(\mathcal {F} _{\textsf{aead}} \) even when the random oracle is replaced by the identity function.

Since each instance of \(\mathcal {F} _{\textsf{mKE}} \) is used by multiple instances of \(\varPi _{\textsf{aead}} \), we treat \(\mathcal {F} _{\textsf{mKE}} \) as a global functionality with respect to \(\varPi _{\textsf{aead}} \). That is, we show that \(\varPi _{\textsf{aead}} \) UC-realizes \(\mathcal {F} _{\textsf{aead}} \) in the presence of (a global) \(\mathcal {F} _{\textsf{mKE}} \).

\(\mathcal {F} _{\mathsf {fs\_aead}} \). Functionality \(\mathcal {F} _{\mathsf {fs\_aead}} \) is an abstraction of the management module that handles the encapsulation and decapsulation of all the messages within a single epoch. An instance of \(\mathcal {F} _{\mathsf {fs\_aead}} \) is created by the main module of Signal whenever a new epoch is created, with session ID that contains the identifier of this epoch. \(\mathcal {F} _{\mathsf {fs\_aead}} \) then provides encapsulation and decapsulation services, akin to those of \(\mathcal {F} _{\textsf{aead}} \), for all the messages in its epoch. In addition, once instructed by the main module that its epoch has ended, \(\mathcal {F} _{\textsf{aead}} \) no longer allows encapsulation of new messages—even when the party is corrupted.

\(\mathcal {F} _{\mathsf {fs\_aead}} \) is realized (perfectly, and in a straightforward way) by protocol \(\varPi _{\mathsf {fs\_aead}} \) that calls multiple instances of \(\mathcal {F} _{\textsf{aead}} \), plus an instance of \(\mathcal {F} _{\textsf{mKE}} \) for this epoch - where, again, the session ID of \(\mathcal {F} _{\textsf{mKE}} \) contains the current epoch ID.

\(\varPi _{\textsf{SGNL}} \). At the highest level of abstraction, we have each of the two parties run protocol \(\varPi _{\textsf{SGNL}} \). When initiating a session, or starting a new epoch within a session, (i.e., when encapsulating the first message in an epoch), \(\varPi _{\textsf{SGNL}} \) first calls \(\mathcal {F} _{\textsf{eKE}} \) to obtain the identifier of that epoch, then creates an instance of \(\mathcal {F} _{\mathsf {fs\_aead}} \) for that epoch ID and asks this instance to encapsulate the first message of the epoch. All subsequent messages of this epoch are encapsulated via the same instance of \(\mathcal {F} _{\mathsf {fs\_aead}} \).

On the receiver side, once \(\varPi _{\textsf{SGNL}} \) obtains an encapsulated message in a new epoch ID, it creates an instance of \(\mathcal {F} _{\mathsf {fs\_aead}} \) for that epoch ID and asks this instance to decapsulate the message. It is stressed that the epoch ID on the incoming message may well be a forgery; however in this case it is guaranteed that decapsulation will fail, since the peer has encapsulated this message with respect to a different epoch ID, namely a different instance of \(\mathcal {F} _{\mathsf {fs\_aead}} \). (This is where the circular dependence breaks: even though the environment may invoke \(\varPi _{\textsf{SGNL}} \) on arbitrary incoming encapsulated message, along with related epoch IDs, \(\mathcal {F} _{\mathsf {fs\_aead}} \) is guaranteed to reject unless the encapsulated message uses the same epoch ID as the as actual sender. Getting under the hood, this happens since the instances of \(\mathcal {F} _{\textsf{mKE}} \) that correspond to different epoch IDs generate keys that are mutually pseudorandom.) IT is stressed that \(\varPi _{\textsf{SGNL}} \) is purely “management code” in the sense that it only handles idealized primitives and does not directly access cryptographic keying material. Commensurately, it UC-realizes \(\mathcal {F} _{\textsf{SM}} \) perfectly.

Realizing \(\mathcal {F} _{\textsf{eKE}} \). Recall that \(\mathcal {F} _{\textsf{eKE}} \) is tasked to generate, at the beginning of each new epoch, multiple alternative keys for that epoch – a key for each potential epoch-id for that epoch. This should be done while preserving simulatability in the presence of adaptive corruptions.

Following the Signal architecture, the main component of the protocol that realizes \(\mathcal {F} _{\textsf{eKE}} \) is a key derivation function (KDF) that combines existing secret state, with new public information (namely the public Diffie-Hellman exponents, which also double-up as an epoch-id), and a new shared key (the corresponding Diffie-Hellman secret), to obtains a new secret key associated with the given epoch-id, along with potential new local secret state for the KDF.

If the KDF is modeled as a random oracle then it is relatively straightforward to show that the resulting protocol UC-realizes \(\mathcal {F} _{\textsf{eKE}} \).

On the other extreme, it can be seen that no plain-model instantiation of the KDF module, with bounded-size local state, can possibly realize \(\mathcal {F} _{\textsf{eKE}} \) in our setting. Indeed, since the adversary can obtain unboundedly many alternative keys for a given epoch, where all keys are generated using the same bounded-size secret state, the Nielsen bound [43] applies.

We propose a middle-ground solution: we show how to instantiate the KDF via a plain-model primitive where the local state grows linearly with the number of keys requested from \(\mathcal {F} _{\textsf{eKE}} \) at the beginning of a given epoch. Once the epoch advances, the state shrinks back to its original size. Our instantiation uses standard primitives: pseudorandom generators and puncturable pseudorandom functions. We also abstract the properties of our construction into a primitive which we call cascaded pseudorandom function and generator (CPRFG), following a primitive of [1] that is used for a similar purpose. We stress however that technically the primitives are quite different; we elaborate in the related work section.

1.4 Streamlining UC Analysis

We highlight two additional modeling and analytical techniques that we used to simplify the overall analysis. We hope that these would be useful elsewhere.

Multiple Levels of Global State. Our analysis makes extensive use of universal composition with global state (UCGS) within the plain UC model, as formulated and proven in [5]. Specifically, we use UCGS to model a global directory that holds the public keys of parties, as well as the long-term storage, within each party, of the secret keys associated with said public keys. Similarly, we use UCGS to model the fact that a single instance of \(\mathcal {F} _{\textsf{eKE}} \) is used by multiple instances of \(\varPi _{\textsf{mKE}} \), and that a single instance of \(\mathcal {F} _{\textsf{mKE}} \) is used by multiple instances of \(\varPi _{\textsf{aead}} \). The random oracle is also modeled as a global functionality.

To facilitate our multi-layer use of the UCGS theorem we also prove a simple-but-useful lemma that allows us to get around the following difficulty. Recall that the UCGS theorem allows demonstrating that protocol \(\pi \) UC-realizes functionality \(\mathcal F\) in the presence of some other ‘global’ functionality \(\mathcal G\) that takes inputs from \(\pi \), \(\mathcal F\), and also potentially directly from the environment. Furthermore, we would like to use multiple levels of UCGS: after showing that \(\pi \) UC-realizes \(\mathcal F\) in the presence of \(\mathcal G\), we wish to argue that \(\pi \) UC-realizes \(\mathcal F\) in the presence of protocol \(\gamma \), where \(\gamma \) is some protocol that UC-realizes \(\mathcal G\). However, such implication is not true in general [6, 24].

Lemma 1 in Sect. 2 asserts that, if \(\gamma \) UC-realizes \(\mathcal G\) via some simulator \(\mathcal S\), then for any \(\pi \) that UC-realizes \(\mathcal F\) in the presence of \(\mathcal G\), it also holds that \(\pi \) UC-realizes \(\mathcal F\) in the presence of \(\mathcal G_\mathcal{S}\), where \(\mathcal G_\mathcal{S}\) be the functionality that combines \(\mathcal G\) and \(\mathcal S\) in the natural way. We then show that, for the protocols in this work, having access to \(\mathcal G_\mathcal{S}\) suffices.

Multiple Levels of Corruptions. The UC framework allows the adversary to adaptively and individually corrupt each party in each module within a composite protocol. While this is very general, it makes the handling of corruption events (where typically the internal states of multiple modules are exposed together) rather complex. We thus adopt a somewhat simpler modeling of party corruption: We let the environment directly corrupt parties and obtain their local states. Furthermore, a corrupted module forwards the corruption notice to all its subroutines and collects the local states of all to report to the environment. Ideal functionalities operate similarly, except that they ask their respective simulators for the appropriate simulated local states. In addition to being simpler, this modeling provide tighter correspondence between the real and ideal executions and is thus preferable whenever realizable (which is the case in this work).

1.5 Related Work

This section briefly surveys the state of the art for security analyses of the Signal architecture in particular and end-to-end secure messaging in general, highlighting the differences from and similarities to the present work.

There is a long line of research into the design and analysis of two-party Signal messaging, its subcomponents, and variants of the Signal architecture; this research builds upon decades of study into key exchange protocols (e.g., [8, 9, 22, 23]) and self-healing after corruption (e.g., [20, 28, 30]). Some of these secure messaging analyses purposely consider a limited notion of adaptive security in order to analyze instantiations of Signal based on standardized crypto primitives (e.g., [1, 10, 32, 36, 57]). Other works consider a strong threat model in which the adversary is malicious, fully adaptive, and can tamper with local state [4, 7, 35, 37, 46], which then intrinsically requires strong HIBE-like primitives that depart from the Signal specification. By contrast, we follow a middle ground in this work: our adversary is fully adaptive and has no restrictions on when it can corrupt a party, yet its corruptions are instantaneous and passive.

We stress that, while this work is inspired by the clear game-based modeling and analyses of Signal in works like Alwen et al. [1], our modeling differs in a number of significant ways. For one, our analysis provides a composable security guarantee. Furthermore, we directly model secrecy against a fully adaptive adversary that decides who and when to corrupt based on all the information seen so far. In contrast, Alwen et al. [1] guarantee secrecy only against a selective adversary that determines ahead of time who and when it will corrupt.

There are two prior works that perform composable analyses of Signal. In concurrent work to our own, Bienstock et al. [12] provide an alternative modeling of an ideal secure messaging within the UC framework and demonstrate how the Signal protocol can be modeled in a way that is shown to realize their formulation of ideal secure messaging. Like this work, they demonstrate several shortcomings of previous formulations, such as overlooking the effect of choosing keys too early or keeping them around for too long. They also propose and analyze an enhancement of the double ratchet structure that helps parties regain security faster following a compromise event. Additionally, Jost, Maurer, and Mularczyk [37] conduct an analysis in the constructive cryptography framework. Their work provides a model for message transmission as well as one for ratcheting protocols.

That said, the ideal functionalities in [12] and [37] differ from our \(\mathcal {F} _{\textsf{SM}} \) in a number of ways. First, their modeling does not account for the session initiation process, nor the PKI and long-term key modules that are an integral part of any secure messaging application. Second, they include the communication medium as part of the protocol, which (a) makes it harder to argue about immediate decryption and (b) means that an instantiation of Signal would have to include an entire TCP/IP stack, which weakens modularity and inhibits the use of Signal as a sub-routine within larger functionalities.

Additionally, Bienstock et al. [12] models all key derivation modules as random oracles rather than formalizing the partition of continuous key exchange components within the UC framework as done in this work, and their modeling forces the “calling protocol” to keep track of—and ensure uniqueness for—the message IDs for the Secure Messaging functionality/protocol, which might create a security risk. On the other hand, [12] accounts for adversarial choice of randomness, which our modeling does not account for. Also, Jost et al. [37] requires explicit modeling of a global event history (a list of events having happened at each module), restricts the real-world adversary’s events based on this global event history, and employs a HIBE-based implementation that is quite different than that of Signal (and ours) and requires heavier cryptographic primitives.

2 Universally Composable Security: New Capabilities

This work makes extensive use of UC with global subroutines, which allow analysis that a protocol \(\varPi _{\mathsf {}} \) UC-realizes functionality \(\mathcal {F} _{\mathsf {}} \) in the presence of a global subroutine G that is not subroutine-respecting. Due to space constraints, we defer a primer of UC security (with global subroutines) to the full version of this work [21]. In this space, we describe two new modeling techniques that simplify our analysis, and may be of more general interest.

The first technique relates to applying the UC theorem to global functionalities. As stated in Sect. 1.4, the analysis in this work requires the ability to apply composition with global state across multiple layers of Fig. 1. We prove the following lemma in the full version of this work [21].

Fig. 2.
figure 2

The code library functionality, \(\mathcal {F} _{\textsf{lib}} \)

Lemma 1

Let \(\varPi _{\mathsf {}} \) be a protocol that UC-realizes an ideal functionality \(\mathcal {F} _{\mathsf {}} \), and let \(\mathcal {S} \) be a simulator that demonstrates this fact, i.e. \(\text{ exec}_{{\mathcal E},\varPi _{\mathsf {}}} \approx \text{ exec}_{{\mathcal E},\mathcal {F} _{\mathsf {}},\mathcal {S} }\). Then protocols \(\varPi _{\mathsf {}} \) and \({\mathcal {F} _{\mathsf {}}}_{\mathcal {S} }\) UC-emulate each other. Consequently, for any protocol \(\rho \) and ideal functionality \(\Gamma \) we have that \(\rho \) UC-realizes \(\Gamma \) in the presence of \(\varPi _{\mathsf {}} \) if and only if \(\rho \) UC-realizes \(\Gamma \) in the presence of \({\mathcal {F} _{\mathsf {}}}_{\mathcal {S} }\).

The second technique is that, in order to model the immediate encryption and immediate decryption properties of secure messaging, we require the adversary to upload static code (which we call \(\mathcal {I} \)) to the global \(\mathcal {F} _{\textsf{lib}} \)—shown in Fig. 2—that the relevant functionality will run in honest cases of the execution. This static code is specific to the protocol that realizes the functionality, and essentially it acts as the ideal-world simulator during an honest execution. This ensures that the functionality does not need to wait for the adversary to encrypt or decrypt messages that are not corrupted. In cases where the message or ciphertext is corrupted, the fully adaptive adversary is called for input (for example, asking \(\mathcal {A}\) to encrypt a message or decrypt a ciphertext). The state of the static code \(\mathcal {I} \) is maintained across calls in a variable \(\textsf{state} _\mathcal {I} \), and it is sent to the adversary upon corruption. \(\mathcal {F} _{\textsf{lib}} \) is global because the static code must be defined at the time that the functionality is instantiated; however, the use of \(\mathcal {F} _{\textsf{lib}} \) specifically is mainly a matter of plumbing rather than a topic of conceptual importance.

3 Formal Modeling and Analysis

In this section, we showcase our modular, iterative process for decomposing the ideal secure messaging functionality \(\mathcal {F} _{\textsf{SM}} \) into a collection of functionalities and protocols that each address one specific purpose. After fully specifying \(\mathcal {F} _{\textsf{SM}} \) itself, we present its realization at the “second level” of Fig. 1 by \(\varPi _{\textsf{SGNL}} \), \(\mathcal {F} _{\mathsf {fs\_aead}} \), and \(\mathcal {F} _{\textsf{eKE}} \), and at the “third level” by functionalities \(\mathcal {F} _{\textsf{aead}} \) and \(\mathcal {F} _{\textsf{mKE}} \).

Due to space constraints, we only give brief descriptions of these functionalities here and defer more exposition to the full version of this work [21]. The full version also contains rigorous specifications of the remaining protocols (on the far right of Fig. 1) and all underlying global functionalities (in blue at the bottom of Fig. 1), along with proofs of all theorems in this section.

Secure Messaging Functionality. Our top-level functionality \(\mathcal {F} _{\textsf{SM}} \) can be found in Figs. 3 to 4. It takes two types of inputs: \(\texttt {SendMessage}\) is used to encapsulate a message for sending to the peer, whereas \(\texttt {ReceiveMessage}\) is used to decapsulate a received message. We also have a \(\texttt {Corrupt}\) input; this is a ‘modeling input’ that is used to capture party corruption. In addition, \(\mathcal {F} _{\textsf{SM}} \) takes a number of ‘side channel’ messages from the adversary which are used to fine-tune the security guarantees. It relies on three global functionalities whose specifications are provided in the full version of this work [21]: \(\mathcal {F} _{\textsf{DIR}} \) representing the directory of public keys, \(\mathcal {F} _{\textsf{LTM}} \) representing the long-term key storage within a party, and a programmable random oracle \(\mathcal {F} _{\textsf{pRO}} \).

Fig. 3.
figure 3

The Secure Messaging Functionality \(\mathcal {F} _{\textsf{SM}} \)

Fig. 4.
figure 4

The Secure Messaging Functionality \(\mathcal {F} _{\textsf{SM}} \)

Fig. 5.
figure 5

The Signal Protocol, \(\varPi _{\textsf{SGNL}} \)

Fig. 6.
figure 6

The Epoch Key Exchange Functionality, \(\mathcal {F} _{\textsf{eKE}}\)

Fig. 7.
figure 7

The Epoch Key Exchange Functionality, \(\mathcal {F} _{\textsf{eKE}}\) (continued)

Fig. 8.
figure 8

The Forward-Secure Encryption Functionality \(\mathcal {F} _{\mathsf {fs\_aead}} \)

Fig. 9.
figure 9

The Forward-Secure Encryption Functionality \(\mathcal {F} _{\mathsf {fs\_aead}} \) (continued)

The Double Ratchet. In our first layer, we decompose \(\mathcal {F} _{\textsf{SM}} \) into two components that model the interconnected pieces of the double ratchet: a public key exchange component \(\mathcal {F} _{\textsf{eKE}} \) and a symmetric key authenticated encryption component \(\mathcal {F} _{\mathsf {fs\_aead}} \). These components are ‘glued’ together with a manager protocol \(\varPi _{\textsf{SGNL}} \).

There are three primary takeaways from the design of \(\varPi _{\textsf{SGNL}} \) (Fig. 5): it has the same input-output API as our ideal functionality \(\mathcal {F} _{\textsf{SM}} \), it displays a idealized version of the double ratchet with clearly distinct roles for the two ratcheting subroutines, and finally it moves closer toward realism. Added features at this level of abstraction include key material stored within party states, explicit accounting for out-of-order messages by holding onto missed message keys, and epochs being identified directly by their \(\mathsf {epoch\_id} \) rather than an idealized \(\mathsf {epoch\_num} \) ordering.

The epoch key exchange functionality \(\mathcal {F} _{\textsf{eKE}}\) (Figs. 6 to 7) comprises the public key “backbone” of the secure messaging continuous key agreement. The functionality is persistent during the entire session, mapping (\(\mathsf {epoch\_id} _0, \mathsf {epoch\_id} _1)\) pairs to sending and receiving chain keys for the symmetric ratchet. It also provides recovery from a state compromise (aka, post-compromise security).

The forward secure authenticated encryption functionality \(\mathcal {F} _{\mathsf {fs\_aead}}\) (Figs. 8 to 9) models the symmetric key ratchet for secure messaging. Each \(\mathcal {F} _{\mathsf {fs\_aead}}\) instance handles the encryption and decryption of messages for a single epoch. The protocol \(\varPi _{\mathsf {fs\_aead}}\) realizes \(\mathcal {F} _{\mathsf {fs\_aead}}\) by outsourcing authenticated encryption and decryption of each message to separate \(\mathcal {F} _{\textsf{aead}}\) instances, described below.

Theorem 1

Protocol \(\varPi _{\textsf{SGNL}} \) (perfectly) UC-realizes the ideal functionality \(\mathcal {F} _{\textsf{SM}} \) in the presence of \(\mathcal {F} _{\textsf{DIR}} \) and \(\mathcal {F} _{\textsf{LTM}} \).

The Symmetric Ratchet: Realizing \(\boldsymbol{\mathcal {F} _{\mathsf {fs\_aead}}}\). Next, we decompose the symmetric key component of Signal into two smaller pieces: a one-time-use authenticated encryption routine \(\mathcal {F} _{\textsf{aead}} \), and a message key exchange functionality \(\mathcal {F} _{\textsf{mKE}} \) that interfaces with the epoch key exchange to produce the symmetric chain keys.

Each authenticated encryption functionality instance \(\mathcal {F} _{\textsf{aead}}\) (Fig. 10) handles the encryption, decryption, and authentication of a particular message for a particular epoch. It hands the ciphertext or message back to \(\varPi _{\mathsf {fs\_aead}}\).

Theorem 2

Protocol \(\varPi _{\mathsf {fs\_aead}} \) (perfectly) UC-realizes \(\mathcal {F} _{\mathsf {fs\_aead}} \), in the presence of \(\mathcal {F} _{\textsf{DIR}},\mathcal {F} _{\textsf{LTM}} ,\mathcal {F} _{\textsf{pRO}} ,\) and \(\mathcal {F} _{\textsf{eKE}} ^{\varPi _{\textsf{eKE}}}=(\mathcal {S} _{\textsf{eKE}},\mathcal {F} _{\textsf{eKE}})\).

Note that the simulator \(\mathcal {S} _{\textsf{eKE}} \), along with a proof of this theorem, are deferred to the full version of this work [21].

Fig. 10.
figure 10

The Authenticated Encryption with Associated Data Functionality, \(\mathcal {F} _{\textsf{aead}} \)

Each instance of the message key exchange functionality \(\mathcal {F} _{\textsf{mKE}}\) (Fig. 11) handles the key derivation for the symmetric ratchet for a particular epoch. Specifically, it provides \(\mathsf {key\_seed}\) ’s to \(\varPi _{\textsf{aead}}\) instances that are then expanded to any length using the global random oracle \(\mathcal {F} _{\textsf{pRO}} \). When instructed, it also closes epochs at a certain message number N by generating all \(\mathsf {key\_seed}\) ’s up to N and later disallowing the generation of any further key seeds for its epoch.

Fig. 11.
figure 11

The Message Key Exchange Functionality \(\mathcal {F} _{\textsf{mKE}}\)

Fig. 12.
figure 12

The Epoch Key Exchange Protocol \(\varPi _{\textsf{eKE}} \)

Fig. 13.
figure 13

The Epoch Key Exchange Protocol \(\varPi _{\textsf{eKE}} \) (continued)

Theorem 3

Assume that \(\textsf{PRG}\) is a secure length-doubling pseudorandom generator. Then protocol \(\varPi _{\textsf{mKE}}\) UC-realizes \(\mathcal {F} _{\textsf{mKE}}\) in the presence of \(\mathcal {F} _{\textsf{DIR}},\mathcal {F} _{\textsf{LTM}} ,\) and \(\mathcal {F} _{\textsf{eKE}} ^{\varPi _{\textsf{eKE}}}=(\mathcal {S} _{\textsf{eKE}},\mathcal {F} _{\textsf{eKE}})\).

The Public Ratchet: Realizing \(\boldsymbol{\mathcal {F} _{\textsf{eKE}}}\). By this point we have already described all of the functionalities in our model. As shown in Fig. 1, it only remains to construct real-world protocols that realize each of them. We defer a description of \(\varPi _{\textsf{aead}} \) to the full version [21] and show only the result here.

Theorem 4

Assuming the unforgeability of \((\textsf{MAC}, \textsf{Verify})\), protocol \(\varPi _{\textsf{aead}}\) UC-realizes the ideal functionality \(\mathcal {F} _{\textsf{aead}}\) in the presence of \(\mathcal {F} _{\textsf{pRO}} \), as well as \(F^{\varPi }_{mKE} = (\mathcal {S} _{\textsf{mKE}},\mathcal {F} _{\textsf{mKE}}), \mathcal {F} _{\textsf{eKE}} ^{\varPi _{\textsf{eKE}}}=(\mathcal {S} _{\textsf{eKE}},\mathcal {F} _{\textsf{eKE}}),\mathcal {F} _{\textsf{DIR}},\mathcal {F} _{\textsf{LTM}} ,\mathcal {F} _{\textsf{lib}} \).

Instantiating \(\mathcal {F} _{\textsf{eKE}} \) via \(\varPi _{\textsf{eKE}} \) (Figs. 12 to 13) is subtle. The main challenge, as observed by Alwen et al. [1] and others, is that the key derivation module within the public ratchet must maintain security if either of the previous root key or the newly generated ephemeral keys are uncompromised. Alwen et al. formalized this guarantee by way of constructing a new primitive: a PRF-PRG. In this work, we make two important improvements upon this construction. First, we contribute a new construction called a Cascaded PRF-PRG that allows for equivocation, in order to maintain security against adaptive adversaries. Second, we provide an instantiation in the plain model based on punctured PRFs.

Theorem 5

Assume that \(KDF:\{0,1\}^n \rightarrow \{0,1\}^{2n}\) is a CPRFG, that the DDH assumption holds in the group G. Then protocol \(\varPi _{eKE}\) UC-realizes the ideal functionality \(\mathcal {F} _{\textsf{eKE}} \) in the presence of global functionalities \(\mathcal {F} _{\textsf{DIR}} \) and \(\mathcal {F} _{\textsf{LTM}} \).

Cascaded PRF-PRG. The goal of this primitive is to formalize the requirements required of a key derivation function (KDF) to adhere to the Signal specification [44] in the adaptive setting. We consider a stateful key derivation function (KDF) with two algorithms. Algorithm \(\texttt {Compute} (\mathsf {root\_key},\mathsf {root\_input})=(\mathsf {chain\_key},\mathsf {root\_key} ')\) is given a state \(\mathsf {root\_key} \) and randomizer \(\mathsf {root\_input} \), and computes a chaining key \(\mathsf {chain\_key} \) and an updated state \(\mathsf {root\_key} '\) (discarding the old state). Algorithm \(\texttt {Advance} (\mathsf {root\_key},\mathsf {root\_input})=\mathsf {root\_key} '\) is given a state \(\mathsf {root\_key} \) and randomizer \(\mathsf {root\_input} \), and updates the state for a new epoch.

Our cascaded PRF-PRG definition requires that the KDF be secure against an adversary who can repeatedly execute methods \((\textsf{Compute})\) and \((\textsf{Advance})\) at will, and who can also obtain the module’s local state at any time. More specifically, we require existence of a simulator such that no adversary can distinguish an interaction with the scheme from an ideal interaction (Fig. 14) where the keys are truly random and the exposed state is generated by the simulator.

Fig. 14.
figure 14

Cascaded PRF-PRG Security Game

Definition 1 (Cascaded PRF-PRG (CPRFG))

A KDF \((\texttt {Compute},\texttt {Advance})\) is a cascaded PRF-PRG (CPRFG) if there exist polytime algorithm \(\mathcal {S} \) such that any polytime oracle machine \(\mathcal {A} \) can distinguish between the real and ideal interactions described in Fig. 14 only with advantage that is negligible in n.

In the full version of this work [21], we show two constructions of a Cascaded PRF-PRG: a straightforward one based on a programmable random oracle, and a non-trivial construction in the plain model based on a puncturable PRF and a PRF-PRG (in the style of Alwen et al. [1]).