1 Introduction

Pseudorandom number generators (PRGs) expand a short, uniform bit string s (the “seed”) to a larger sequence of pseudorandom bits X. Beyond their status as a fundamental primitive in cryptography, they are used widely in practical random number generators, including those in all major operating systems. Unsurprisingly, PRGs have been the target of many attacks over the years. In this work we focus on a specific, yet prominent, type of PRG attack which arises by planting a backdoor inside the PRG. This type of attack goes back to 1983, when Vazirani and Vazirani [42, 43] introduced the notion of “trapdoored PRGs” and showed that the Blum-Blum-Shub PRG is one such example [13]. Their purpose was not to sabotage systems, however; instead, they used the property constructively in a higher-level protocol.

NIST Dual EC PRG. Perhaps the most infamous demonstration of the potential for sabotage is the backdoored NIST Dual EC PRG [1]. Oversimplifying this example for the sake of presentation (see [17, 22, 39] for the “real-world” description), the attack works as follows. The (simplified) PRG is parameterized by two elliptic curve points; call them P and Q. These points are supposed to be selected at random and independently of each other, forming the PRG public parameter \(pk=(P,Q)\) which can be reused by multiple PRG instances. Each new PRG instance then selects a random initial seed s, and can expand it into random-looking elliptic curve points \(X=sP\) and \(Y=sQ\). Ignoring the details of mapping elliptic curve points into bit-strings,Footnote 1 as well as subsequent iterations of this process, one can conclude that the points (X, Y) are pseudorandom conditioned on \(pk=(P,Q)\). In fact, this is provably so under the widely believed Decisional Diffie-Hellman (DDH) assumption.

Yet, imagine that the entity selecting points P and Q chooses the second point Q as \(Q=d P\) for a random multiple (“discrete log”) d, and secretly keeps this multiple as its backdoor \(sk=d\). Notice, the resulting public parameter distribution \(pk=(P,Q)\) is identical to the supposed “honest” distribution, when Q was selected independently from P. Thus, the outside world cannot detect any cheating in this step, and could be swayed to use the PRG due to its provable security under the DDH assumption. Yet, the knowledge of d can easily allow the attacker to distinguish the output (X, Y) from random; or, worse, predict Y from X, by noticing that

$$Y = sQ = s (dP) = d(sP) = dX$$
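The identity above can be checked numerically with a small sketch. As a loud assumption, the elliptic curve group is swapped here for the multiplicative group modulo a prime (so \(sP\) becomes \(P^s \bmod p\)); the modulus, the base, and the function names `keygen`, `prg_step`, and `predict_second_output` are hypothetical choices made only for illustration, and the parameters are not meant to be secure.

```python
import secrets

# Toy group (assumption): integers modulo the Mersenne prime 2^127 - 1,
# written multiplicatively. This stands in for the elliptic curve group.
P_MOD = 2**127 - 1
BASE = 3

def keygen():
    """Saboteur's key generation: pk = (P, Q) with Q = P^d, backdoor sk = d."""
    d = secrets.randbelow(P_MOD - 2) + 1
    P = BASE
    Q = pow(P, d, P_MOD)
    return (P, Q), d

def prg_step(pk, s):
    """One simplified PRG step: outputs X = P^s and Y = Q^s."""
    P, Q = pk
    return pow(P, s, P_MOD), pow(Q, s, P_MOD)

def predict_second_output(d, X):
    """Backdoor attack: Y = Q^s = (P^d)^s = (P^s)^d = X^d."""
    return pow(X, d, P_MOD)

pk, sk = keygen()
seed = secrets.randbelow(P_MOD - 2) + 1   # honest, random initial seed
X, Y = prg_step(pk, seed)
assert predict_second_output(sk, X) == Y  # the saboteur predicts Y from X alone
```

The attacker never sees the seed: knowledge of d and the first output X suffices, which mirrors the multiplicative form of \(Y = dX\).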

While we considerably simplified various low level details of the Dual EC PRG, the works of [17, 39] showed that the above attack idea can be extended to attacking the actual NIST PRG. Moreover, the famous “Juniper Dual EC incident” (see [16] and references therein) showed that this vulnerability was likely used for years in a real setting of Juniper Networks VPN system!

Backdoored PRGs. Motivated by these real-world considerations, the work of Dodis et al. [22] initiated a systematic study of so-called backdoored PRGs, abstracting and generalizing the Dual EC PRG example from above. A backdoored PRG (K, G) is specified by a key generation algorithm K (unknown to the public) which outputs public parameters pk and a hidden backdoor sk. The “actual PRG” G takes pk and the current PRG state s as input, and generates the next block of output bits R and the updated (internal) state \(s'\). The initial seed/state \(s=s_0\) is assumed to be chosen at random and not controlled/sabotaged by the attacker. We call this modeling honest initialization, emphasizing that the Dual EC PRG attack was possible even under such an assumption. The PRG can then be iterated any number of times q, producing successive outputs (\(R_i\)) and corresponding internal states (\(s_i\)). The basic constraint on the saboteur is that the joint output \(X=(R_1,\ldots ,R_q)\) should be indistinguishable from uniform given only the public parameters pk (but not the secret backdoor sk). We call this constraint public security.

Unfortunately, the Dual EC PRG example shows that public security, even when accompanied by a “security proof”, does not make the backdoor PRG secure against the saboteur, who also knows sk. In fact, [22] showed that the necessary and sufficient assumption for building effective backdoor PRGs (secure to the public but broken using sk) is the existence of any public-key encryption scheme with pseudorandom ciphertexts.

1.1 Our Questions: Immunization Countermeasures

While the question of designing backdoored PRGs is fascinating, in this work we are interested in various countermeasures against backdoor PRGs, a topic of interest given the reduced trust in PRGs engendered by the possibility of backdooring. Obviously, the best countermeasure would be to use only trusted PRGs, if this is feasible. Alternatively, one could still agree to use a given backdoor PRG, but attempt to overwrite its public parameters pk. For example, this latter approach is advocated (and formally proven secure) in [5, 35]. Unfortunately, these techniques cannot be applied in many situations. For example, existing proprietary software or hardware modules may not be easily changed, or PRG choices may be mandated by standards, as in the case of FIPS. Additionally, the user might not have direct control over the implementation itself (for example, if it is implemented in hardware or the kernel), or might not have the capability or expertise to properly overwrite the (potentially hidden or hardwired) value of pk. Fortunately, there is another approach which is much less intrusive, and seems to be applicable to virtually any setting: to efficiently post-process the output of a PRG in an online manner in order to prevent exploitation of the backdoor. We call such a post-processing strategy an immunizer.Footnote 2

The question of building such immunizers was formally introduced and studied by Dodis et al. [22]. For example, the most natural such immunizer would simply apply a cryptographic hash function C, such as SHA-256 (or SHA-3), to the current output \(R_i\) of the PRG, only providing the saboteur with the value \(Z_i= C(R_i)\) instead of \(R_i\) itself. The hope is that hashing the output of a PRG will provide security even against the suspected backdoor sk.Footnote 3 Unfortunately, [22] showed that this natural immunizer does not work in general, even if C is modeled as a Random Oracle (RO)! Moreover, this result easily extends to any deterministic immunizer C (e.g., bit truncation, etc.).

Instead, the solution proposed by [22] considers a weaker model of probabilistic/seeded immunizers, where it is assumed that some additional, random-but-public parameter can be chosen after the attacker has finalized the design of the backdoor PRG (K, G) and published the public parameters pk. While [22] provide some positive results for such seeded immunizers, these results were either in the random oracle model, or based on the existence of so-called universal computational extractors (UCEs) [9]. Thus, we ask the question:

Question 1

Can one build a seeded backdoor PRG immunizer in the standard model, under an efficiently falsifiableFootnote 4 assumption?

It turns out that we can use the elegant black-box separation technique of Wichs [44] to give a negative answer to this question (proof included in the full version [6]).

Theorem 1

If there is a black-box reduction showing security of a seeded immunizer C from the security of some cryptographic game \(\mathcal {G}\), then \(\mathcal {G}\) is not secure.

Moreover, the availability and trust issues in generating and agreeing on the public seed required for the immunization make this solution undesirable or inapplicable for many settings. Thus, we ask the question if deterministic immunizers could exist in another meaningful model, despite the impossibility result of [22] mentioned above. And, as a secondary question, if they can be based on efficiently falsifiable assumptions.

2-Immunizers to the Rescue? We notice that the impossibility result of [22] implicitly (but critically) assumes that only a single honestly-initialized backdoor PRG is being immunized. Namely, the immunizer C is applied to the output(s) \(R_i\) of a single backdoor PRG (K, G). Instead, we notice that many PRGs allow one to explicitly initialize multiple independent copies. For example, a natural idea would be to choose two (random and independent) initial states s and \(s'\) of the PRG and run the two copies in parallel; but instead of directly releasing their outputs \(R_i\) and \(R_i'\), respectively, the (“seedless”) immunizer C will output the value \(Z_i = C(R_i,R_i')\) to the attacker.Footnote 5 We call such post-processing procedures 2-immunizers.Footnote 6 More generally, one can consider k-immunizers for \(k\ge 2\), but setting \(k=2\) is obviously the most preferable in practice. As before, our hope would be that the final outputs \((Z_1,\ldots ,Z_q)\) will be pseudorandom even conditioned on the (unknown) backdoor sk, and even if the key generation algorithm K could depend on the choice of our 2-immunizer C. This is the main question we study in this work:

Question 2

(Main Question). Can one construct a provably secure 2-immunizer C against all efficient backdoored PRGs (K, G)?

We note that several natural candidates for such 2-immunizers include XOR, inner product, or a cryptographic hash function C.

A note on immunizers from computational assumptions. One may wonder whether it is worth considering immunizers whose security depends on a computational assumption. After all, if the computational assumption is sufficiently strong to imply that pseudorandom generators exist (as most assumptions are), then why would we not just use the corresponding PRG? However, we think that building an immunizer in this setting is still interesting for two reasons. First, if we can show that an immunizer exists in this regime, then this gives evidence that an information-theoretic style immunizer also exists. Second, there are some scenarios where one has access to PRG outputs but no access to true randomness (for example, if the kernel does not give direct access to its random number generator). In this setting, we can use a computational immunizer to recover full security.

1.2 Related Immunization Settings

Before describing our results, it might be helpful to look at the two conceptually similar settings considered by Bauer et al. [8, 21] and Russell et al. [37].

Detour 1: Backdoored Random Oracles. In this model [8], one assumes the existence of a truly random oracle G. However, the fact that G might have been “backdoored” is modeled by providing the attacker with the following leakage oracle any polynomial number of times: given any (potentially inefficient) function g, the attacker can learn the output of g applied to the entire truth-table of G. For example, one can trivially break the PRG security of a length-expanding random oracle \(R=G(s)\), by simply asking the leakage oracle \(g_R(G)\) whether there is a shorter-than-R seed s s.t. \(G(s)=R\).

With this modeling, [8] asked (among other things) whether one can build 2-immunizers for two independent BROs F and G. For example, in case of pseudorandomness, they explicitly asked if \(H(s) = F(s)\oplus G(s)\) is pseudo-random (for random seed s), even if the distinguisher can have polynomial number of leakage oracle calls to F and G separately (but not jointly). Somewhat surprisingly, they reduce this question to a plausible conjecture regarding communication complexity of the classical set-intersection problem (see [15] for a survey of this problem). Thus, despite not settling this question unconditionally, the results of [8] suggest that XOR might actually work for the case of PRGs.

In addition, [38] studies the question of k-immunizers in the related setting of “subverted” random oracles (where the subverted oracle differs from the true one on a small number of inputs). There, a simple yet slightly more complicated “xor-then-hash” framework is shown to provide a good immunizer.

Detour 2: Kleptographic Setting. While the study of kleptography goes back to the seminal works of Young and Yung [45,46,47] (and many others), let us consider a more recent variant of [37]. This model is quite general, attempting to formalize the ability of the public to test if a given black-box implementation is done according to some ideal specification. As a special case, this could in particular cover the problem of public parameter subversion of PRGs, where the PRG designer kept some secret information sk, instead of simply choosing pk at random.

We will comment on the subtleties of “kleptographic PRGs” vs “backdoored PRGs” a bit later, but remark that [37] claimed very simple k-immunizers in their setting. Specifically, they showed that for one-shot PRGs (where there is no internal state for deriving arbitrarily many pseudorandom bits) in the kleptographic setting, a random oracle C is a good 2-immunizer, while for \(k\gg 2\), one can even have very simple k-immunizers in the standard model. For example, have each of the k PRGs shrink its output to a single bit, and then concatenate these bits together. Again, this suggests that something might work for the more general case of (stateful) PRGs.

1.3 Our Results for 2-Immunizers

As we see, in both of these related settings it turns out that simple k-immunizers exist, including XOR and random oracle for \(k=2\). Can these positive results be extended to the backdoored PRG setting?

XOR is Insecure. First we start with the simple XOR 2-immunizer \(C(x,y)= x\oplus y\), which is probably the simplest and most natural scheme to consider. Moreover, as we mentioned, the PRG results of [8] for BROs give some supporting evidence that this 2-immunizer might be secure in the setting of backdoor PRGs. Unfortunately, we show that this is not the case.Footnote 7 Intuitively, the BRO modeling assumes that both generators F and G are modeled as true random oracles with bounded leakage, which means that both of them have a lot of entropy hidden from the attacker. In contrast, the backdoor PRG model of [22] (and this work) allows the attacker to build F and G which are extremely far from having any non-trivial amount of entropy to the attacker who knows the backdoor sk.

Indeed, our counter-example for the XOR immunizer comes from a more general observation, which rules out all 2-immunizers C for which one can build a public key encryption scheme \(({{\,\textrm{Enc}\,}},{{\,\textrm{Dec}\,}})\) which has pseudorandom ciphertexts, and is what we call C-homomorphic. Oversimplifying for the sake of presentation (see Definition 13), we need an encryption scheme where the message m, independently encrypted twice under the same public key pk with corresponding ciphertexts x and y, can still be recovered using the secret key sk and the “C-combined” ciphertext \(z=C(x,y)\). If such a scheme exists, the backdoor PRG can simply output independent encryptions of a fixed message (say, 0) as its pseudorandom bits. The C-homomorphic property then ensures that the attacker can still figure out that 0 was encrypted after seeing the combined ciphertext \(z= C(x,y)\), where x and y are now (individually pseudorandom, and hence secure to the public) encryptions of 0. Moreover, we build a simple “XOR-homomorphic” public key encryption scheme under a variant of the LPN assumption due to Alekhnovich [3]. Thus, under this assumption we conclude that XOR is not a secure 2-immunizer.

Theorem 2

Assuming the Alekhnovich assumption (listed in Proposition 1) holds, XOR is not a secure 2-immunizer.

Inadequacy of Kleptographic Setting for PRGs. Our second observation is that the kleptographic setting considered by [37], which is extremely elegant and useful for many other cryptographic primitives (and additionally considers the dimension of corrupted implementations, which we do not consider), does not adequately model the practical problem of backdoored PRGs. In essence, the subverted PRG modeling of [36, 37] yields meaningful results in the stateless (one-time output production) setting, but does not extend to the practically relevant stateful setting. It is worth noting that while [37] informally claim (see Remark 3.2 in [36]) a trivial composition theorem to move from the stateless to the (practically relevant) stateful setting, that result happens to be vacuous.Footnote 8 In particular, the “ideal specification” of stateful PRGs (implicitly assumed by the authors in their proofs) requires that a stateful PRG produce fresh and unrelated outputs, even after rewinding the PRG state to some prior state. However, PRGs are deterministic after the initial seed is chosen. As such, even the most secure and “stego-free” implementation will never pass such a rewinding test, as future outputs are predetermined once and for all. Stated differently, the “ideal specification” of stateful PRGs implicitly assumed by [36, 37] in Remark 3.2 is too strong, and no construction can meet it.Footnote 9

To see this modeling inadequacy directly, recall that one of the standard model k-immunizers from [36, 37] simply concatenates the first bit of each PRG’s output. For the stateless (one-time) PRG case, this is secure for trivial (and practically useless) reasons: each PRG bit should be statistically random, or the “public” (called the “watchdog” by the authors) will easily catch it. But now let us look at the stateful extension, which could be potentially useful if it were secure, and apply it to the following Dual-EC variant. On a given initial state s, in round i the variant will output the ith bit of Dual-EC initialized with s. Syntactically, this is the same (very dangerous) backdoor PRG we would like to defend against, although made artificially less efficient. Yet, when the “concatenation” k-immunizer above is applied to this (stateful) variant, the attacker still learns the full outputs of each of the k PRG copies, and can just do the standard attack on Dual-EC separately on each copy. This means that this k-immunizer is blatantly insecure in our setting, for any value of k.
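The leakage described above can be made concrete with a small sketch. As a loud assumption, the backdoored generator is replaced by a stand-in built from SHA-256 (`underlying_output`), since only the bit-by-bit structure matters here; all function names are hypothetical. The point illustrated is purely syntactic: after enough rounds, the concatenation of first bits hands the attacker each copy's full underlying output.

```python
import hashlib

N_BITS = 256  # block length of the stand-in generator (assumption)

def underlying_output(seed: bytes) -> list:
    """Stand-in for one full output block of a backdoored PRG (e.g. Dual EC)."""
    digest = hashlib.sha256(seed).digest()
    return [(byte >> (7 - j)) & 1 for byte in digest for j in range(8)]

def bitwise_variant(seed: bytes, i: int) -> list:
    """Round-i output of the variant: the i-th bit of the underlying block."""
    return [underlying_output(seed)[i]]

def concat_first_bits(outputs: list) -> list:
    """The concatenation k-immunizer: first bit of each PRG copy's output."""
    return [out[0] for out in outputs]

def run_immunized(seeds: list, rounds: int) -> list:
    """Immunized transcript Z_0, ..., Z_{rounds-1} for k = len(seeds) copies."""
    return [concat_first_bits([bitwise_variant(s, i) for s in seeds])
            for i in range(rounds)]

def recover_blocks(transcript: list, k: int) -> list:
    """Attacker's view: column j of the transcript is copy j's full block."""
    return [[z[j] for z in transcript] for j in range(k)]

seeds = [b"seed-0", b"seed-1", b"seed-2"]
transcript = run_immunized(seeds, N_BITS)
recovered = recover_blocks(transcript, len(seeds))
assert recovered == [underlying_output(s) for s in seeds]
```

Once each block is recovered, the saboteur can run whatever backdoor attack applies to the underlying generator, copy by copy.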

Random Oracle is Secure. Despite the inability to generically import the positive results of [36, 37] to our setting, we can still ask if the random oracle 2-immunizer result claimed by [36, 37] is actually true for backdoored PRGs. Fortunately, we show that this is indeed the case, by giving a direct security proof.Footnote 10 In fact, it works even in the so-called auxiliary-input ROM (AI-ROM) defined by Unruh [41] and recently studied by [19, 23]. In this model we allow the saboteur to prepare the backdoor sk and public parameters pk after unbounded preprocessing of the Random Oracle C. The only constraint on the resulting backdoored PRG G is that it has to be secure to the public in the standard ROM (since the public might not have enough resources to run the expensive preprocessing stage). Still, when being fed with outputs \(z_i=C(x_i,y_i)\), the saboteur cannot distinguish them from random even given its polynomial-sized backdoor sk (which also models whatever auxiliary information about RO C the attacker computed), and an additional polynomial number of queries to C.

Despite appearing rather expected, the proof of this result is quite subtle. It uses the fact that the independently initialized PRG instances F and G are unlikely to ever query the random oracle on any of the outputs produced by the other instance (i.e., F on \(C(\cdot , y_i)\) and G on \(C(x_i,\cdot )\)), because we show that this would contradict the assumed PRG security of F and G from the public.

Theorem 3

\(C(X, Y) = RO(X||Y)\) is a secure 2-immunizer in the AI-ROM.

Black-box Separation From Efficiently Falsifiable Assumptions. Finally, we consider the question of building a secure 2-immunizer in the standard model. In this setting, we again use the black-box separation technique of Wichs [44] to show the following negative result: no function C(x, y) which is highly dependent on both inputs x and y can be proven to be a secure 2-immunizer for backdoor PRGs via a black-box reduction to any efficiently falsifiable assumption.

The formal definition of “highly dependent” is given in Definition 18, but intuitively states that there are few “influential” inputs \(x^*\) (resp., \(y^*\)) which fix the output of C to a constant, irrespective of the other input. We notice that most natural functions are clearly highly dependent on both inputs. This includes XOR, the inner product function, and any cryptographic hash function heuristically replacing a random oracle, such as SHA-256 or SHA-3.

The latter category is unfortunate, though. While our main positive result gave plausible evidence that cryptographic hash functions are likely secure as 2-immunizers, our negative result shows that there is no efficiently falsifiable assumption in the standard model under which we can formally show security of any such 2-immunizer C.

Theorem 4

Let C be a 2-immunizer which is highly dependent on both inputs. If there is a black-box reduction showing that C is secure from the security of some cryptographic game \(\mathcal {G}\), then \(\mathcal {G}\) is not secure.

Weak 2-Immunizers. Given that our main positive result is proven in the random oracle model, we also consider another meaningful type of immunizer which we call a weak 2-immunizer, in the hope that it might be easier to instantiate in the standard model. (For contrast, we will call the stronger immunizer concept considered so far a strong 2-immunizer.) Recall that in the strong setting the immunizer C was applied to two independently initialized copies of the same backdoor PRG (K, G). In particular, both copies shared the same public parameters pk. In contrast, in the weak setting, in addition to the independent seed initialization above, we assume the backdoor PRGs were designed by two independent key generation processes K and \(K'\), producing independent key pairs (pk, sk) and \((pk',sk')\). For example, this could model the fact that competing PRGs were designed by two different standards bodies (say, US and China). Of course, at the end we will allow the two saboteurs to “join forces” and try to use both sk and \(sk'\) when breaking the combined outputs \(Z_i= C(R_i,R_i')\). Curiously, it is not immediately obvious that a strong 2-immunizer is also a weak one, but we show that this is indeed the case, modulo a small security loss. In particular, this implies that our positive result in the random oracle model also gives a weak 2-immunizer.

Of course, the interesting question is whether the relaxation to the weak setting makes it easier to have standard model instantiations. Unfortunately, we show that this does not appear to be the case, by extending most of our impossibility/separation results to the weak setting (as can be seen in their formal statements). The only exception is the insecurity of XOR as a weak 2-immunizer, for which we lack an explicit counter-example; we leave this open (but conjecture that XOR is insecure here as well). As partial evidence, we show that the pairing operation (which looks similar to XOR) is not a weak 2-immunizer under the widely believed SXDH assumption in pairing-based groups [4, 7].

Theorem 5

Assuming the SXDH assumption (listed in Conjecture 1) holds for groups \(G_X, G_Y, G_T\), a bilinear map \(e:G_X \times G_Y \rightarrow G_T\) is not a secure weak 2-immunizer.

Open Question. Summarizing, our results largely settle the feasibility of designing secure 2-immunizers for backdoor PRGs, but leave the following fascinating question open: Is there a 2-immunizer C in the standard model whose security can be black-box reduced to an efficiently falsifiable assumption?

While we know such C cannot be “highly dependent on both inputs”, which rules out most natural choices one would consider (including cryptographic hash functions), we do not know if other “unnatural” functions C might actually work.

In the absence of such a function/reduction, there are two alternatives:

First, it may be possible to give a non-black-box reduction from a non-highly input-dependent function (such as a very good two-source extractor).

Or alternatively, one might try to base the security of C on a non-falsifiable assumption likely satisfied by a real-world cryptographic hash function. For example, [22] built seeded 1-immunizers based on the existence of so called universal computational extractors (UCEs) [9]. Unfortunately, the UCE definition seems to be inherently fitted for 1-immunizers, and it is unclear (and perhaps unlikely) that something similar can be done in the 2-immunizer setting, at least with a security definition that is noticeably simpler than that of 2-immunizers.

1.4 Further Related Work

We briefly mention several related works not mentioned so far.

Extractors. Randomness extractors convert a weak random source into an output which is statistically close to uniform. Similar to our setting, while deterministic extraction is impossible in this generality [18], this impossibility can be overcome using either seeded extractors [31] or two-source extractors [18].

A special class of seeded extractors considers sources which could partially depend on the prior outputs of the extractor (and, hence, indirectly on the random seed). Such sources are called extractor-dependent [25, 33], and generalize the corresponding notion of oracle-dependent extractors considered by [20] in the ROM. Conceptually similar to our results, [25] showed a black-box separation for constructing such extractors from cryptographic hash functions in the standard model, despite the fact that cryptographic hash functions provably worked in the ROM [20].

Kleptography. Young and Yung studied what they called kleptography: subversion of cryptosystems by modifying encryption algorithms in order to leak information subliminally [45,46,47]. Juels and Guajardo [29] propose an immunization scheme for kleptographic key-generation protocols that involves publicly-verifiable injection of private randomness by a trusted entity. More recent work by Bellare, Paterson, and Rogaway [10] treats a special case of Young and Yung’s setting for symmetric encryption.

As described in detail above, the works [36, 37] consider the idea of using a random oracle as a 2-immunizer, however their results do not extend to the stateful setting considered here.

The works [5, 34] also consider immunizing corrupted PRGs; however, these results succeed by modifying the public parameters, as opposed to operating on the PRG output. In other words, the immunizers are not simple and stateless, and thus not relevant in situations where a user cannot control the implementation itself (e.g., if it is implemented in hardware or the kernel).

Steganography and Related Notions. Steganography (see [27, 40]) is the problem of sending a hidden message in communications over a public channel so that an adversary eavesdropping on the channel cannot even detect the presence of the hidden message. In this sense backdoor PRG could be viewed as a steganographic channel where the PRG is trying to communicate information back to the malicious PRG designer, without the “public” being able to detect such communication (thinking instead that a random stream is transmitted).

More recently, the works of [28, 32] looked at certain types of encryption schemes which can always be turned into steganographic channels, even if the dictator demands that the users reveal their purported secret keys.

Finally, the works of [24, 30] looked at constructing so-called reverse firewalls, which provably remove steganographic communication by carefully re-randomizing messages supposedly exchanged by the parties for some other cryptographic task.

Backdoored Random Oracles. The works of [8] and [12] consider the task of immunizing random oracles with XOR. However, these consider information-theoretic models of PRG security. An intriguing observation about the findings of our work is that information-theoretic models (such as the backdoored random oracle model) do not capture the computational advantage that backdoors can achieve, as is shown by our counterexamples in Sect. 3.

2 Definitions

Definition 1

Two distributions X and Y are called \((t, \epsilon )\)-indistinguishable (denoted by \({\textbf {CD}}_t(X, Y) \le \epsilon \)) if for any algorithm D running in time t,

$$\left| \Pr [D(X) = 1] - \Pr [D(Y) = 1]\right| \le \epsilon .$$
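For intuition, the quantity bounded in this definition can be estimated empirically for one fixed distinguisher. The sketch below is only illustrative: the two sample distributions, the distinguisher `low_bit`, and the helper `estimate_advantage` are hypothetical, and a sampled estimate for a single D gives no upper bound over all algorithms running in time t.

```python
import random

def estimate_advantage(distinguisher, sample_x, sample_y, trials, rng):
    """Monte Carlo estimate of |Pr[D(X) = 1] - Pr[D(Y) = 1]| for one fixed D."""
    hits_x = sum(distinguisher(sample_x(rng)) for _ in range(trials))
    hits_y = sum(distinguisher(sample_y(rng)) for _ in range(trials))
    return abs(hits_x - hits_y) / trials

def uniform_byte(rng):
    """Y: a uniform 8-bit value."""
    return rng.randrange(256)

def biased_byte(rng):
    """X: an 8-bit value whose lowest bit is always 1."""
    return rng.randrange(256) | 1

def low_bit(value):
    """Distinguisher D: outputs the lowest bit of its input."""
    return value & 1

rng = random.Random(0)  # fixed seed so the run is reproducible
advantage = estimate_advantage(low_bit, biased_byte, uniform_byte, 20000, rng)
baseline = estimate_advantage(low_bit, uniform_byte, uniform_byte, 20000, rng)
print(f"biased vs uniform: {advantage:.3f}, uniform vs uniform: {baseline:.3f}")
```

Here the true advantage of `low_bit` is 1/2 for the biased source and 0 when both inputs are uniform; the printed estimates should be close to these values up to sampling error.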

Definition 2

Let \(X_{\lambda }\) and \(Y_{\lambda }\) be two families of distributions indexed by \(\lambda \). If for all polynomial \(t(\lambda )\) and some negligible \(\epsilon (\lambda )\), \(X_{\lambda }\) and \(Y_{\lambda }\) are \((t(\lambda ), \epsilon (\lambda ))\)-indistinguishable, then we say X and Y are computationally indistinguishable (denoted by \({\textbf {CD}}(X, Y)\le negl(\lambda )\)).

2.1 Pseudorandom Generators

A pseudorandom generator is a pair of algorithms (K, G). Traditionally, K takes in randomness and outputs a public parameter. We additionally allow K to output a secret key to be used for defining trapdoors. To go with our notation of secret keys, we will denote the public parameter as the public key. For non-trapdoored PRGs, the secret key is set to null. G is a function that takes in a public key and a state, and outputs an n-bit output as well as a new state. More formally, we give the following definitions, adapted from [22]:

Definition 3

Let \(\mathcal{P}\mathcal{K}, \mathcal{S}\mathcal{K}\) be sets of public and secret keys respectively. Let \(\mathcal {S}\) be a set we call the state space. A pseudorandom generator (PRG) is a pair of algorithms (K, G) where

- \(K:\{0, 1\}^\ell \rightarrow \mathcal{P}\mathcal{K}\times \mathcal{S}\mathcal{K}\) takes in randomness and outputs a public key \(pk\) and secret key \(sk\). We will denote running K on uniform input as \((pk, sk) \xleftarrow {\$}K\).

- \(G:\mathcal{P}\mathcal{K}\times \mathcal {S}\rightarrow \{0, 1\}^n \times \mathcal {S}\) takes in the public key and a state and outputs n bits as well as the new state.

For ease of notation, we may write G instead of \(G_{pk}\) when the public key is clear from context.

Definition 4

Let (K, G) be a PRG, \(pk\in \mathcal{P}\mathcal{K}, s \in \mathcal {S}\). Let \(s_0 = s\) and let \((r_i, s_i) \leftarrow G_{pk}(s_{i-1})\) for \(i \ge 1\). We call the sequence \((r_1, \dots , r_q)\) the output of the PRG, and denote it by \({\textbf {out}}^q( G_{pk}, s )\) (or \({\textbf {out}}^q(G, s)\)).
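The iteration in this definition can be sketched as a short loop in which each call to G consumes the previous state and yields one output block together with the next state. The generator `toy_G` below is a hypothetical placeholder built from SHA-256 purely to make the sketch runnable; it is an assumption and not a construction from this work, and `out_q` is an illustrative name for \({\textbf {out}}^q\).

```python
import hashlib

def toy_G(pk: bytes, state: bytes):
    """Hypothetical G_pk: maps a state to (n-bit output r, next state)."""
    r = hashlib.sha256(pk + b"|out|" + state).digest()
    next_state = hashlib.sha256(pk + b"|next|" + state).digest()
    return r, next_state

def out_q(G, pk, s, q):
    """Computes (r_1, ..., r_q), where (r_i, s_i) = G_pk(s_{i-1}) and s_0 = s."""
    outputs = []
    state = s
    for _ in range(q):
        r, state = G(pk, state)  # one round: emit r_i, advance to s_i
        outputs.append(r)
    return outputs

blocks = out_q(toy_G, b"public-key", b"initial-seed", 4)
assert len(blocks) == 4
# Determinism: the first rounds do not depend on how many rounds follow.
assert out_q(toy_G, b"public-key", b"initial-seed", 2) == blocks[:2]
```

The second assertion reflects the fact that a PRG is deterministic once the seed is fixed, so shorter runs are prefixes of longer ones.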

For n an integer we will denote by \(\mathcal {U}_n\) the uniform distribution over \(\{0,1\}^n\).

Definition 5

A PRG (K, G) is a \((t, q, \delta )\) publicly secure PRG if K, G both run in time t and, for \(pk\xleftarrow {\$}K\),

$${\textbf {CD}}_t((pk, {\textbf {out}}^q(G_{pk}, \mathcal {S})), (pk, \mathcal {U}_{qn})) \le \delta .$$

Note that here there is some implied initial distribution over \(\mathcal {S}\). This will depend on the construction, but when unstated we will assume that this distribution is uniform.

Definition 6

A PRG (K, G) is a \((t, q, \delta )\) backdoor secure PRG if K, G both run in time t and, for \((pk, sk) \xleftarrow {\$}K\),

$${\textbf {CD}}_{t}((pk, sk, {\textbf {out}}^q(G_{pk}, \mathcal {U}_{\mathcal {S}})), (pk, sk, \mathcal {U}_{qn})) \le \delta .$$

Note that there are PRGs that are \((t, q, \delta )\) publicly secure, but not \((t', q, \delta ')\) backdoor secure even for some \(t' \ll t\) and \(\delta ' \gg \delta \) [22]. The goal of an immunizer is to take as input some (K, G) which is publicly secure but not backdoor secure, and transform it generically into a new PRG which is backdoor secure.

2.2 2-Immunizers

Our definition of 2-immunizers will also be based on the definition of immunizers given in [22]. Note in particular that while the [22] definition of immunizers takes in the output of one PRG and a random seed, we define 2-immunizers to be deterministic functions of the output of two PRGs.

We first define notation to express what it means to apply an immunizer to two PRGs.

Definition 7

Let \((K^X, G^X), (K^Y, G^Y)\) be two PRGs and let \(C:\{0,1\}^{n} \times \{0,1\}^{n}\rightarrow \{0,1\}^m\) be a function on the output spaces of the PRGs. We define a new PRG as follows:

-The key generation algorithm (denoted \((K^X, K^Y)\)) will be the concatenation of the original two key generation algorithms. More formally, it will run \(K^X \rightarrow pk^X, sk^X\), \(K^Y \rightarrow pk^Y, sk^Y\) and will return \(pk= (pk^X, pk^Y)\) and \(sk= (sk^X, sk^Y)\).

-The pseudorandom generation algorithm, denoted \(C(G^X, G^Y)\) will run both PRGs independently and apply C to the output. Formally, let us denote \(s = (s^X, s^Y)\). If \(G^X(s^X) = (r^X, s'^X)\) and \(G^Y(s^Y) = (r^Y, s'^Y)\), then

$$C(G^X, G^Y)(s) := (C(r^X, r^Y), (s'^X, s'^Y)).$$

Note that the output of the PRG will be C applied to the outputs of the original PRGs. Formally, if \({\textbf {out}}^q(G^X, s^X) = x_1,\dots , x_q\) and \({\textbf {out}}^q(G^Y, s^Y) = y_1, \dots , y_q\), then

$${\textbf {out}}^q(C(G^X, G^Y), (s^X, s^Y)) = C(x_1,y_1), \dots , C(x_q, y_q).$$
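The composition in Definition 7 can be sketched in code as follows. This is only an illustration under stated assumptions: `make_toy_prg` is a hypothetical hash-based stand-in for \(G^X\) and \(G^Y\) with their public keys already fixed, and XOR is used as the example C purely for concreteness (Sect. 3 argues that XOR is not a secure strong 2-immunizer).

```python
import hashlib
from typing import Callable, Tuple

Step = Callable[[bytes], Tuple[bytes, bytes]]  # state -> (output, next state)

def make_toy_prg(label: bytes) -> Step:
    """Hypothetical stand-in for G^X or G^Y with its public key fixed."""
    def step(state: bytes) -> Tuple[bytes, bytes]:
        r = hashlib.sha256(label + b"|out|" + state).digest()
        nxt = hashlib.sha256(label + b"|next|" + state).digest()
        return r, nxt
    return step

def combine(C, GX: Step, GY: Step):
    """Build C(G^X, G^Y): run both PRGs independently, apply C to the outputs."""
    def step(state):
        sX, sY = state
        rX, sX_next = GX(sX)
        rY, sY_next = GY(sY)
        return C(rX, rY), (sX_next, sY_next)
    return step

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

def run(step, state, q: int) -> list:
    """Collect q outputs of a step function started from the given state."""
    outs = []
    for _ in range(q):
        r, state = step(state)
        outs.append(r)
    return outs
```

Running `combine(xor, GX, GY)` for q steps from `(sX, sY)` yields exactly the blockwise XOR of the two individual output sequences, matching the displayed identity.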

Definition 8

A two-input function C is a \((t, q, \delta , \delta ')\)-secure weak 2-immunizer if, for any \((t, q, \delta )\) publicly secure PRGs \((K^X, G^X), (K^Y, G^Y)\), the PRG \(((K^X, K^Y), C(G^X, G^Y))\) is a \((t, q, \delta ')\) backdoor secure PRG.

A weak 2-immunizer is effective at immunizing two PRGs as long as the public parameters are independently sampled. We can also consider the case where the designers of the two PRGs collude and share public parameters. Equivalently, we can consider the case where we run one backdoored PRG on multiple honest initializations. If a 2-immunizer effectively immunizes in this setting, we call it a strong 2-immunizer.

Let us first define the syntax.

Definition 9

Let \((K, G)\) be a PRG and let \(C:\{0,1\}^n\times \{0,1\}^n \rightarrow \{0,1\}^m\) be a function on the output space of G. We define a new PRG (denoted \((K, C(G, G))\)) as follows:

-The key generation algorithm will be K.

-The pseudorandom generation algorithm, denoted \(C(G_{pk}, G_{pk})\) will run G twice (with the same public key) on two initial seeds, and apply C to the output. Formally, let us denote \(s = (s^X, s^Y)\). If \(G_{pk}(s^X) = (r^X, s'^X)\) and \(G_{pk}(s^Y) = (r^Y, s'^Y)\), then

$$C(G, G)(s) := (C(r^X, r^Y), (s'^X, s'^Y))$$

If \(x_1,\dots ,x_q = {\textbf {out}}^q(G_{pk}, s^X)\) and \(y_1,\dots ,y_q = {\textbf {out}}^q(G_{pk}, s^Y)\) are two outputs of G on the same public key and freshly sampled initial states, then

$${\textbf {out}}^q(C(G, G), (s^X, s^Y)) = C(x_1,y_1),\dots ,C(x_q,y_q).$$

Definition 10

A two-input function C is called a \((t, q, \delta , \delta ')\)-secure strong 2-immunizer if, for any \((t, q, \delta )\) publicly secure PRG \((K, G)\), the PRG \((K, C(G, G))\) is a \((t, q, \delta ')\) backdoor secure PRG.

Lemma 1

If C is a \((t, q, \delta , \delta ')\)-secure strong 2-immunizer, then C is a \((t, q, \delta , 4\delta ')\)-secure weak 2-immunizer.

For a proof of this lemma, see a full version of this paper [6].

Remark 1

Some traditional definitions of PRGs [11] consider the notion of forward-secrecy. That is, PRG security for the first q outputs should be maintained even if the \(q+1\)st state is leaked. However, it is impossible for a 2-immunizer in our model to preserve public forward secrecy. Informally, given any PRG satisfying forward-secrecy, we can append an encryption of the initial state to the \(q+1\)st state. This would result in a PRG satisfying public forward-secrecy but not backdoor forward-secrecy. Since we do not allow the 2-immunizer to view or modify the internal state of the corresponding PRGs in any way, it is impossible for any 2-immunizer to remove this vulnerability.

3 Counterexamples for Simple 2-Immunizers

In this section we will outline a framework for arguing that simple functions (for example XOR) do not work as 2-immunizers. To argue that some C is not a strong 2-immunizer, we will construct a public key encryption scheme suitably homomorphic under C. We will then note that the PRG which simply encrypts 0 using the randomness of its honest initialization will have a backdoor after immunization, where the backdoor will be given by the homomorphic property of the underlying encryption scheme.

To argue that C is not a weak 2-immunizer, we will instead need to construct two public key encryption schemes which are jointly homomorphic in a suitable manner. In this case, the PRGs defined by encrypting 0 under these two public key encryption schemes will allow us to perform an analogous attack on C.

In particular, we will generically define what it means for public key encryption schemes to be suitably homomorphic under C, and argue that this property is enough to show that C is not a 2-immunizer. Note that the definition of suitably homomorphic will depend on whether we are attacking the weak or strong security of C.

We will then instantiate our generic result with specific public key encryption schemes, leading to the following theorems.

Theorem 6

(Theorem 2 restated). Assuming the Alekhnovich assumption (listed in Proposition 1) holds, XOR is not a \((poly(\lambda ), 1, negl(\lambda ), negl(\lambda ))\)-secure strong 2-immunizer.

Note that there is no simple way to adapt the public key encryption scheme used to prove this theorem to be sufficiently homomorphic to prove that XOR is not a weak 2-immunizer. We leave open the question of whether XOR is a weak 2-immunizer.

Definition 11

Let \(G_X, G_Y, G_T\) be groups of prime order exponential in \(\lambda \) with generators \(g_X, g_Y, g_T\). A bilinear map \(e:G_X\times G_Y \rightarrow G_T\) is a function satisfying

$$e(g_X^a, g_Y^b) = e(g_X, g_Y)^{ab} = g_T^{ab}$$

Note that requiring \(e(g_X, g_Y) = g_T\) is a non-standard requirement for bilinear maps, but will always occur when we restrict the codomain of the bilinear group to the subgroup defined by its image.
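To make the identity in Definition 11 concrete, the following toy sketch represents elements of \(G_X\) and \(G_Y\) directly by their exponents modulo a small prime and takes \(G_T\) to be a subgroup of \(\mathbb{Z}_P^*\). The constants `Q`, `P`, `G_T` and this representation are assumptions chosen for illustration. Because discrete logarithms are stored in the clear, the example has no cryptographic hardness at all; it only demonstrates the bilinearity relation.

```python
Q = 1019            # prime order shared by the three toy groups
P = 2 * Q + 1       # 2039 is also prime, so Z_P^* has a subgroup of order Q
G_T = 4             # a square mod P other than 1, hence of order exactly Q

def e(u: int, v: int) -> int:
    """Toy pairing. The element g_X^a is stored as a mod Q (likewise for G_Y),
    so e(g_X^a, g_Y^b) is computed as g_T^(a*b)."""
    return pow(G_T, (u * v) % Q, P)
```

In this representation the generators \(g_X, g_Y\) are both stored as 1, so `e(1, 1)` equals `G_T`, which is the normalization \(e(g_X, g_Y) = g_T\) required above.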

Theorem 7

(Theorem 5 restated). Assuming the SXDH assumption (listed in Conjecture 1) holds for groups \(G_X, G_Y, G_T\), a bilinear map \(e:G_X \times G_Y \rightarrow G_T\) is not a \((poly(\lambda ), 2, negl(\lambda ), negl(\lambda ))\)-secure weak 2-immunizer.

Note that although [8] does not directly argue that a bilinear map is a 2-immunizer in their model, it is clear that the argument for XOR can be generalized to apply for bilinear maps.

3.1 Public Key Encryption

A public key encryption scheme (PKE) is a triple \(({{\,\textrm{Gen}\,}}, {{\,\textrm{Enc}\,}}, {{\,\textrm{Dec}\,}})\) where

  • \({{\,\textrm{Gen}\,}}\) outputs a public key, secret key pair \((pk, sk)\),

  • \({{\,\textrm{Enc}\,}}\) takes in the public key \(pk\) and a message m, and outputs a ciphertext c,

  • \({{\,\textrm{Dec}\,}}\) takes in the secret key \(sk\) and a ciphertext c, and outputs the original message m.

For security, as we are working with pseudorandom generators, it is useful for us to require that the encryption schemes themselves be pseudorandom. More formally,

Definition 12

We say that a public key encryption scheme \(({{\,\textrm{Gen}\,}}, {{\,\textrm{Enc}\,}}, {{\,\textrm{Dec}\,}})\) is pseudorandom if for all m,

$$pk\xleftarrow {\$}{{\,\textrm{Gen}\,}}$$
$${\textbf {CD}}_{poly(\lambda )}((pk, {{\,\textrm{Enc}\,}}(m)), (pk, \mathcal {U})) \le negl(\lambda )$$

Note that for our purposes we will require all public key encryption schemes to be pseudorandom. We remark that this assumption is strictly stronger than traditional PKE security.

3.2 Strong 2-Immunizers

Definition 13

Let \(C^{e}:\{0,1\}^n \times \{0,1\}^n \rightarrow \{0,1\}^m\) be some operation. We say that a public key encryption scheme \(({{\,\textrm{Gen}\,}}, {{\,\textrm{Enc}\,}}, {{\,\textrm{Dec}\,}})\) is \(C^{e}\)-homomorphic if there exists some function \({{\,\textrm{Dec}\,}}_{sk}^{C^{e}}\) such that for all m,

figure h

Theorem 8

Let \(({{\,\textrm{Gen}\,}}, {{\,\textrm{Enc}\,}}, {{\,\textrm{Dec}\,}})\) be a public key encryption scheme and let \(C^{e}\) be some operation. Then, if \(({{\,\textrm{Gen}\,}}, {{\,\textrm{Enc}\,}}, {{\,\textrm{Dec}\,}})\) is pseudorandom and \(C^{e}\)-homomorphic (with homomorphic decryption algorithm \({{\,\textrm{Dec}\,}}^{C^{e}}\)), then \(C^{e}\) is not a \((poly(\lambda ), 1, negl(\lambda ), negl(\lambda ))\)-secure strong 2-immunizer.

Proof

We will first construct a PRG (KG) using \(({{\,\textrm{Gen}\,}}, {{\,\textrm{Enc}\,}}, {{\,\textrm{Dec}\,}})\), and then we will show that \(C^{e}(G, G)\) has a backdoor.

Let us first observe that \(\Pr [{{\,\textrm{Dec}\,}}^{C^{e}}(\mathcal {U}) \rightarrow 0] + \Pr [{{\,\textrm{Dec}\,}}^{C^{e}}(\mathcal {U}) \rightarrow 1] \le 1\), and so one of these probabilities will be at most \(\frac{1}{2}\). Without loss of generality, assume \(\Pr [{{\,\textrm{Dec}\,}}^{C^{e}}(\mathcal {U}) \rightarrow 0] \le \frac{1}{2}\).

Define \((K, G)\) by \(K := {{\,\textrm{Gen}\,}}\), \(G_{pk}(s) := {{\,\textrm{Enc}\,}}_{pk}(0; s)\). It is clear that \((K, G)\) is a \((poly(\lambda ), 1, negl(\lambda ))\) publicly secure PRG by the definition of a pseudorandom PKE. Thus, it remains to exhibit an adversary D that can distinguish

$$(pk, sk, C^{e}({{\,\textrm{Enc}\,}}_{pk}(0; \mathcal {U}), {{\,\textrm{Enc}\,}}_{pk}(0; \mathcal {U})))$$

from

$$(pk, sk, \mathcal {U})$$

with advantage \(\ge \frac{1}{poly(\lambda )}\).

On input \((pk, sk, r)\), D will run \({{\,\textrm{Dec}\,}}_{sk}^{C^{e}}(r) \rightarrow m\) and output 1 if and only if \(m = 0\). It is clear that

$$\Pr [D(pk, sk, C^{e}({{\,\textrm{Enc}\,}}_{pk}(0; \mathcal {U}), {{\,\textrm{Enc}\,}}_{pk}(0; \mathcal {U}))) \rightarrow 1] \ge \frac{2}{3}$$

by the definition of \({{\,\textrm{Dec}\,}}^{C^{e}}\). But note that we assumed \(\Pr [{{\,\textrm{Dec}\,}}^{C^{e}}(\mathcal {U}) \rightarrow 0] \le \frac{1}{2}\), and so

$$\Pr [D(pk, sk, \mathcal {U}) \rightarrow 1] \le \frac{1}{2}$$

Thus, the advantage of D is \(\ge \frac{2}{3} - \frac{1}{2} = \frac{1}{6} \ge \frac{1}{poly(\lambda )}\).

We remark that while this theorem is stated for \(q = 1\), it is fairly easy to extend this to arbitrary q by simply appending the corrupted PRGs with a genuine one.
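The logic of the distinguisher D can be illustrated with a deliberately insecure toy scheme. In the sketch below, which is an assumption made for illustration and not the scheme used in this paper, encryptions of 0 are vectors orthogonal to a secret mask d over GF(2), so the XOR of two such ciphertexts is again orthogonal to d. The toy scheme is not pseudorandom (the public basis reveals d), so it only demonstrates how the homomorphic decryption yields a backdoor; all function names are hypothetical.

```python
import random

N = 32  # ciphertext length in bits (toy parameter)

def parity(x: int) -> int:
    return bin(x).count("1") & 1

def gen(rng: random.Random):
    """Toy Gen: sk is a nonzero mask d; pk is a basis of {c : <c, d> = 0 mod 2}."""
    d = rng.randrange(1, 1 << N)
    j = (d & -d).bit_length() - 1          # index of the lowest set bit of d
    pk = [(1 << i) | (((d >> i) & 1) << j) for i in range(N) if i != j]
    return pk, d

def enc_zero(pk, rng: random.Random) -> int:
    """Toy Enc(0; s): XOR of a random subset of the basis vectors."""
    c = 0
    for b in pk:
        if rng.getrandbits(1):
            c ^= b
    return c

def dec_xor(sk: int, c: int) -> int:
    """Homomorphic decryption for XOR: ciphertexts of 0 are orthogonal to sk."""
    return parity(c & sk)

def distinguisher(sk: int, r: int) -> int:
    """D from the proof: output 1 iff the homomorphic decryption returns 0."""
    return 1 if dec_xor(sk, r) == 0 else 0

def acceptance_rates(trials: int = 2000, seed: int = 0):
    """Empirical Pr[D -> 1] on immunized outputs and on uniform strings."""
    rng = random.Random(seed)
    pk, sk = gen(rng)
    real = sum(distinguisher(sk, enc_zero(pk, rng) ^ enc_zero(pk, rng))
               for _ in range(trials)) / trials
    unif = sum(distinguisher(sk, rng.getrandbits(N))
               for _ in range(trials)) / trials
    return real, unif
```

In this toy setting D accepts every XOR-immunized output and accepts a uniform string with probability exactly one half, so its advantage is about \(\frac{1}{2}\), compared with the \(\frac{1}{6}\) guaranteed in the proof above.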

[3] gives a construction of a public key encryption scheme based on a variant of the learning parity with noise problem (which we will call the Alekhnovich assumption; it is Conjecture 4.7 in his paper). Instead of presenting his underlying assumption directly, we will refer to the following proposition:

Proposition 1

[3]: Suppose that the Alekhnovich assumption holds. Then for every \(m = O(n)\), \(k = \varTheta (\sqrt{n})\), and \(\ell , t \le poly(n)\),

$$A_i \xleftarrow {\$}\mathcal {U}_{m \times n}, x_i \xleftarrow {\$}\mathcal {U}_n, e_i \xleftarrow {\$}\left( {\begin{array}{c}\{0, 1\}^m\\ k\end{array}}\right) $$
$${\textbf {CD}}_{t}((A_i, A_ix_i + e_i)_{i = 1}^\ell , (A_i, \mathcal {U}_{m})_{i = 1}^\ell ) \le negl(n)$$

That is, given a uniformly random \(m \times n\) binary matrix A, a vector which differs from an element in the image of the matrix in exactly k places is computationally indistinguishable from random.
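For intuition, the distribution on the left-hand side of Proposition 1 can be sampled as in the following sketch. The function name `sample_instance` and the concrete parameters are assumptions for illustration; the sketch produces a single pair \((A, Ax + e)\) and returns x and e as well so that the relationship can be checked.

```python
import random
from typing import List, Tuple

def sample_instance(m: int, n: int, k: int, rng: random.Random
                    ) -> Tuple[List[List[int]], List[int], List[int], List[int]]:
    """Return (A, x, e, b) with A uniform in {0,1}^{m x n}, x uniform in {0,1}^n,
    e of Hamming weight exactly k, and b = A x + e over GF(2)."""
    A = [[rng.getrandbits(1) for _ in range(n)] for _ in range(m)]
    x = [rng.getrandbits(1) for _ in range(n)]
    e = [0] * m
    for i in rng.sample(range(m), k):   # k distinct error positions
        e[i] = 1
    Ax = [sum(a * xi for a, xi in zip(row, x)) % 2 for row in A]
    b = [(u + v) % 2 for u, v in zip(Ax, e)]
    return A, x, e, b
```

The proposition asserts that, for suitable parameters, an efficient algorithm given only `A` and `b` cannot tell `b` apart from a uniform m-bit vector.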

Let us proceed now to the proof of Theorem 6.

We will prove this by showing a pseudorandom \(\oplus \)-homomorphic public key encryption scheme based off of the Alekhnovich assumption.

We claim that if the Alekhnovich assumption holds, the public key encryption scheme presented in [2] (along with a minor variation) is both pseudorandom and \(\oplus \)-homomorphic. Therefore, by Theorem 8, XOR is not a strong 2-immunizer.

First, we present Alekhnovich’s public key encryption scheme in Fig. 1. We make one minor change to the original scheme, namely we change the value of the parameter k from \(\sqrt{\frac{n}{2}}\) to \(\sqrt{\frac{n}{4}}\). Note that since the underlying proposition only requires that \(k = \varTheta (\sqrt{n})\), this does not affect the proof of security.

Fig. 1. Alekhnovich’s PKE scheme (from Sect. 4.4.3).

Proposition 2

[3]: Assuming the Alekhnovich assumption holds,

$${\textbf {CD}}((pk, {{\,\mathrm{Enc-A}\,}}(0)), (pk, {{\,\mathrm{Enc-A}\,}}(1))) \le negl(\lambda )$$

Corollary 1

Assuming the Alekhnovich assumption holds, \(({{\,\mathrm{Gen-A}\,}}, {{\,\mathrm{Enc-A}\,}},\) \({{\,\mathrm{Dec-A}\,}})\) is pseudorandom.

Proposition 3

Assuming the Alekhnovich assumption holds, \(({{\,\mathrm{Gen-A}\,}}, {{\,\mathrm{Enc-A}\,}},\) \({{\,\mathrm{Dec-A}\,}})\) as presented above is \(\oplus \)-homomorphic.

The proof of Proposition 3 is in the full version of this paper [6].

3.3 Weak 2-Immunizers

Definition 14

Let \(C^{e}:\{0,1\}^n \times \{0,1\}^n \rightarrow \{0,1\}^m\) be some operation. We say a pair of public key encryption schemes \(({{\,\textrm{Gen}\,}}, {{\,\textrm{Enc}\,}}, {{\,\textrm{Dec}\,}})\) and \(({{\,\textrm{Gen}\,}}', {{\,\textrm{Enc}\,}}', {{\,\textrm{Dec}\,}}')\) are jointly \(C^{e}\)-homomorphic if there exists some function \({{\,\textrm{Dec}\,}}_{sk,sk'}^{C^{e}}\) such that for all m,

figure j

Theorem 9

Let \(({{\,\textrm{Gen}\,}}, {{\,\textrm{Enc}\,}}, {{\,\textrm{Dec}\,}})\), \(({{\,\textrm{Gen}\,}}', {{\,\textrm{Enc}\,}}', {{\,\textrm{Dec}\,}}')\) be two public key encryption schemes and let \(C^{e}\) be some operation. Then, if \(({{\,\textrm{Gen}\,}}, {{\,\textrm{Enc}\,}}, {{\,\textrm{Dec}\,}})\), \(({{\,\textrm{Gen}\,}}', {{\,\textrm{Enc}\,}}', {{\,\textrm{Dec}\,}}')\) are pseudorandom and jointly \(C^{e}\)-homomorphic (with homomorphic decryption algorithm \({{\,\textrm{Dec}\,}}^{C^{e}}\)), then \(C^{e}\) is not a \((poly(\lambda ), 1, negl(\lambda ), negl(\lambda ))\)-secure weak 2-immunizer.

Proof

This proof is analogous to the proof of Theorem 8. The corresponding PRGs are \((K^X, G^X) = ({{\,\textrm{Gen}\,}}, {{\,\textrm{Enc}\,}}(0; s))\) and \((K^Y, G^Y) = ({{\,\textrm{Gen}\,}}', {{\,\textrm{Enc}\,}}'(0; s))\). The distinguisher again runs \({{\,\textrm{Dec}\,}}^{C^{e}} \rightarrow m\) and returns 1 if and only if \(m = 0\).

Corollary 2

If there exist \(({{\,\textrm{Gen}\,}}, {{\,\textrm{Enc}\,}}, {{\,\textrm{Dec}\,}})\), \(({{\,\textrm{Gen}\,}}', {{\,\textrm{Enc}\,}}', {{\,\textrm{Dec}\,}}')\) that are pseudorandom and jointly \(\oplus \)-homomorphic, then \(\oplus \) is not a \((poly(\lambda ), 1, negl(\lambda ),negl(\lambda ))\)-secure weak 2-immunizer.

We remark that the Alekhnovich PKE is not jointly \(\oplus \)-homomorphic with itself. We leave open the question of whether such a pair of encryption schemes exists for XOR, but we suspect that it does.

Instead, we show that another simple 2-immunizer (namely a bilinear pairing) is not secure assuming a suitable computational assumption. In particular, we will rely on the SXDH assumption, defined in [4, 7].

Conjecture 1

The Symmetric External Diffie Hellman Assumption (SXDH) states that there exist groups \(G_X, G_Y, G_T\) with generators \(g_X, g_Y, g_T\) such that

-there exists an efficiently computable bilinear map \(e:G_X \times G_Y \rightarrow G_T\),

-for uniformly random \(a,b,c\xleftarrow {\$}\mathbb {Z}_{|G_X|}\), \({\textbf {CD}}((g_X^a, g_X^b, g_X^{ab}), (g_X^a, g_X^b, g_X^c)) \le negl(\lambda )\) (the Diffie Hellman assumption holds for \(G_X\)),

-for uniformly random \(a,b,c\xleftarrow {\$}\mathbb {Z}_{|G_Y|}\), \({\textbf {CD}}((g_Y^a, g_Y^b, g_Y^{ab}), (g_Y^a, g_Y^b, g_Y^c)) \le negl(\lambda )\) (the Diffie Hellman assumption holds for \(G_Y\)).

Note that, as stated in Definition 11 we will require that \(e(g_X, g_Y) = g_T\) and that \(G_X, G_Y, G_T\) are of prime order exponential in \(\lambda \).

Note that rather than constructing public key encryption schemes that are jointly homomorphic under e, we will construct public key encryption schemes that are jointly homomorphic under a related operation. We will then use the fact that this related operation is not a weak 2-immunizer to show that e is not a weak 2-immunizer.

Let \(G_X, G_Y, G_T\) be cyclic groups of size exponential in \(\lambda \) with an efficiently computable bilinear map \(e:G_X \times G_Y \rightarrow G_T\). Define the 2-immunizer \(C^{e}:(G_X \times G_X) \times (G_Y \times G_Y) \rightarrow G_T \times G_T\) by

$$C^{e}((a_X, b_X), (a_Y, b_Y)) = (e(a_X, a_Y), e(b_X, b_Y)).$$

Lemma 2

Assuming the SXDH assumption holds, \(C^{e}\) is not a \((poly(\lambda ), 1, negl(\lambda ), negl(\lambda ))\)-secure weak 2-immunizer.

We defer the proof of this lemma and the proof of Theorem 7 using Lemma 2 to the full version [6].

4 Positive Result in Random Oracle Model

Although it seems that simple functions do not work well as 2-immunizers, we show that a random oracle is a strong 2-immunizer. Heuristically, this means that a good hash function can be used in practice as a 2-immunizer. Furthermore, it gives some hope that 2-immunizers may exist in the standard model.

In fact, a random oracle is a strong 2-immunizer even if we allow the adversary to perform arbitrary preprocessing on the random oracle. This model, introduced in [41], is known as the Auxiliary Input Random Oracle Model (AI-ROM).

Theorem 10

Let \(RO:\{0,1\}^{2n} \rightarrow \{0,1\}^{m}\) be a random oracle. For t sufficiently large to allow for simple computations, \(f(X, Y) = RO(X || Y)\) is a \((t, q, \delta , \delta ')\)-secure strong 2-immunizer with

$$\delta ' = \left(\delta + \frac{q^2}{2^n}\right) + 2(t+t^2)q\sqrt{\delta + \frac{q}{2^n}}.$$
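To get a feeling for the quantitative loss in Theorem 10, the bound can be evaluated numerically. The helper below is a small sketch; the function name and the sample parameters in the note that follows are assumptions chosen for illustration and do not come from the paper.

```python
import math

def delta_prime(delta: float, q: int, t: int, n: int) -> float:
    """Evaluate the bound of Theorem 10 for concrete parameters."""
    collision = delta + q**2 / 2**n          # first term of the bound
    guess = math.sqrt(delta + q / 2**n)      # square-root term
    return collision + 2 * (t + t**2) * q * guess
```

For instance, with \(\delta = 2^{-128}\), \(q = 2^{10}\), \(t = 2^{20}\) and \(n = 256\) the bound evaluates to roughly \(2^{-13}\), whereas \(\delta = 2^{-64}\) with the same q, t, n gives a value above 1, that is, no guarantee. The square root means that \(\delta \) must be considerably smaller than \(1/(t^2 q)^2\) for the bound to be meaningful.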

Corollary 3

\(f(X, Y) = RO(X||Y)\) is a \((poly(\lambda ), poly(\lambda ), negl(\lambda ), negl(\lambda ))\)-secure strong 2-immunizer in the ROM.

Theorem 11

(Theorem 3 restated).\(f(X, Y) = RO(X||Y)\) is a

\((poly(\lambda ), poly(\lambda ), negl(\lambda ), negl(\lambda ))\)-secure strong 2-immunizer in the AI-ROM.

The intuition behind Theorem 10 is as follows. Even given the secret and public keys for a PRG, public security guarantees that the output of each PRG is unpredictable. Let \(x_1,\dots ,x_q\) and \(y_1,\dots ,y_q\) be two outputs of a PRG, and let us consider the perspective of the compromised PRG generating x. Since this algorithm does not know the seed generating y, each \(y_i\) is unpredictable to it. Thus, it has no way of seeing any of the outputs of the functions \(RO(\cdot || y_i)\). But as long as neither call to the PRG queries the random oracle on \(x_i || y_i\), there will be no detectable relationship between the \(x_i\)’s and \(RO(x_i || y_i)\), and so the immunizer output will seem truly random.
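Heuristically instantiating the random oracle with a hash function gives an immunizer of the following shape. This is a minimal sketch under the assumption that SHA-256 stands in for RO; the theorems in this section are proved for a random oracle only, and Sect. 5 discusses why proving security for a concrete instantiation may be hard. The function names are hypothetical.

```python
import hashlib
from typing import List, Sequence

def immunize(x: bytes, y: bytes) -> bytes:
    """f(X, Y) = H(X || Y), with SHA-256 heuristically standing in for RO."""
    if len(x) != len(y):
        # Both PRG outputs are n bits, so the concatenation parses uniquely.
        raise ValueError("both PRG outputs are expected to have the same length")
    return hashlib.sha256(x + y).digest()

def immunize_stream(xs: Sequence[bytes], ys: Sequence[bytes]) -> List[bytes]:
    """Apply f blockwise to two output sequences (x_1..x_q) and (y_1..y_q)."""
    return [immunize(x, y) for x, y in zip(xs, ys)]
```

Requiring equal-length inputs keeps the encoding of \(x || y\) unambiguous, mirroring the domain \(\{0,1\}^{2n}\) of RO in Theorem 10.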

The extension to the AI-ROM in Theorem 11 comes from standard presampling techniques [23, 41], with a full proof included in the full version [6].

4.1 Random Oracle Model Definitions

In the random oracle model (ROM), we treat some function RO as a function chosen uniformly at random. This provides a good heuristic for security when the random oracle is instantiated with some suitable hash function. To argue that some cryptographic primitive is secure in the random oracle model, the randomness of the random oracle must be baked into the underlying game.

Definition 15

We will denote the random oracle by \(\mathcal {O}:A\rightarrow B\). Two distributions X and Y are \((q, t, \epsilon )\)-indistinguishable in the random oracle model if for any oracle algorithm \(D^{\mathcal {O}}\) running in time t making at most q random oracle calls,

figure k

For simplicity, we will typically set \(q = t\). We define PRG security in the random oracle model identically to standard PRG security, except that computational indistinguishability is also taken in the random oracle model.

Definition 16

Two distributions X and Y are \((s, t, \epsilon )\)-indistinguishable in the AI-ROM if for any oracle function \(z^{\mathcal {O}}\) into strings of length s and for any oracle algorithm \(D^{\mathcal {O}}\) running in time t,

figure l

We similarly define PRG security in the AI-ROM.

Definition 17

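A two-input function C is a \((t, q, \delta , \delta ')\)-secure strong 2-immunizer in the ROM (respectively AI-ROM) if, for any PRG \((K, G)\) which is \((t, q, \delta )\) publicly secure in the ROM, the PRG \((K, C(G, G))\) is a \((t, q, \delta ')\) backdoor secure PRG in the ROM (respectively AI-ROM).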

The definition of a \((t, q, \delta , \delta ')\)-secure weak 2-immunizer in the ROM/AI-ROM will be analogous.

Note that in particular our definition for 2-immunizer security in the AI-ROM only requires that the underlying PRG be secure in the ROM. This is a stronger definition, and we do this to model the situation where the auxiliary input represents a backdoor for the underlying PRGs.

4.2 Random Oracle is a 2-Immunizer

To show that a random oracle is a strong 2-immunizer, we adapt the proof structure from [22]. That is, we prove a key information theoretic property about publicly secure PRGs, and then use this property to bound the probability that some adversary queries the random oracle on key values.

In particular, let \(G^X, G^Y\) be two PRGs with outputs \(x_1,\dots ,x_q\) and \(y_1,\dots ,y_q\), and let RO be a random oracle. We will argue that the only part of the PRG game for \(RO(G^X, G^Y)\) which queries \(RO(x_i, y_i)\) is when the 2-immunizer is directly called by the game. This is because all parts of the game will only have access to at most one of \(x_i\) or \(y_i\), and so therefore as the other is information theoretically unpredictable, they will be unable to query \(x_i\) and \(y_i\) to the oracle at the same time.

Afterwards, we will show that RO is still a strong 2-immunizer even in the presence of auxiliary input. We will show this by using the presampling lemma (Theorem ??). The trick we will use is that since our key property is information theoretic, we can set p for the presampling lemma to be exponential in \(\lambda \), and so the security loss we suffer will be negligible.

We begin by stating the following information theoretic lemma. The proof is in the full version of this paper [6].

Lemma 3

(KEY LEMMA) Let \(K:\{0, 1\}^\ell \rightarrow \mathcal{P}\mathcal{K}\times \mathcal{S}\mathcal{K}\), \(G:\mathcal{P}\mathcal{K}\times \mathcal {S}\rightarrow \{0, 1\}^n \times \mathcal {S}\) be a \((t, q, \delta )\) publicly secure PRG. Let \(r \in \{0,1\}^{\ell }\) be some initial randomness. For \(p\in (0,1)\), we say that r is p-weak if for \((pk, sk)\leftarrow K(r)\),

$$\max _{\widetilde{x} \in \{0,1\}^n}\Pr _{x_1,\dots ,x_q \xleftarrow {\$}{\textbf {out}}^q(G_{pk}, \mathcal {U}_{\mathcal {S}})}[x_i = \widetilde{x}\text { for some }i\in [q]] \ge p.$$

Denote

$$p' = \Pr _{r \xleftarrow {\$}\mathcal {U}_\ell }[r\text { is }p\text {-weak}].$$

Then,

$$p' \cdot p^2 \le q^2 \left(\delta + \frac{q}{2^n}\right).$$

Intuitively, we call a public key \(pk\) (described using its initial randomness r) weak if the output of \(G_{pk}\) is predictable. The above lemma gives an upper bound on the probability of a public key being weak. That is, we show (through an averaging argument) that every publicly secure PRG has unpredictable output for most choices of its public parameters.

We now proceed to the proof of Theorem 10.

Proof

Let \(K:\{0, 1\}^\ell \rightarrow \mathcal{P}\mathcal{K}\times \mathcal{S}\mathcal{K}\), \(G:\mathcal{P}\mathcal{K}\times \mathcal {S}\rightarrow \{0, 1\}^n \times \mathcal {S}\) be a \((t, q, \delta )\) publicly secure PRG. Let D be a distinguisher against \(f(G, G)\) running in time t. Let HONEST be the distribution

$$(pk, sk) \xleftarrow {\$}K, s^X, s^Y \xleftarrow {\$}\mathcal {S}$$
$$(pk, sk, {\textbf {out}}^q(f(G_{pk}, G_{pk}), (s^X, s^Y)))$$

and let RANDOM be the distribution

$$(pk, sk) \xleftarrow {\$}K, (r_1, \dots , r_q) \xleftarrow {\$}\mathcal {U}_{qm}$$
$$(pk, sk, r_1, \dots , r_q)$$

We want to bound

$$\delta ' = \left| \Pr [D(HONEST) = 1] - \Pr [D(RANDOM) = 1]\right| $$

Let \(q_K, q_G, q_D\) be bounds on the number of times K, G, and D, respectively, query the random oracle. Note that these are all bounded by t.

Let us consider the case where the distinguisher is given the output of the honest 2-immunizer. We will denote \({\textbf {out}}^q(G, s^X) = x_1, \dots , x_q\) and \({\textbf {out}}^q(G, s^Y) = y_1, \dots , y_q\). Let BAD be the event that there is some i such that \((x_i, y_i)\) is queried to the random oracle more than once. Note that conditioned on \(\overline{BAD}\), the two distributions in the distinguishing game are identical. Thus, \(\delta ' \le \Pr [BAD]\).

We will break BAD up into five cases, and bound each case separately.

  • We define \(BAD_1\) to be the event where there exist \(i \ne j\) such that \(x_i = x_j\) and \(y_i = y_j\). This corresponds to \((x_i, y_i)\) being queried to the random oracle more than once by the game itself.

  • We define \(BAD_2\) to be the event that K queries \(x_i, y_i\) for some i.

  • We define \(BAD_3\) to be the event that G queries \(x_i, y_i\) in the process of calculating \({\textbf {out}}^q(G_{pk}, s^X)\).

  • We define \(BAD_4\) to be the event that G queries \(x_i, y_i\) in the process of calculating \({\textbf {out}}^q(G_{pk}, s^Y)\).

  • We define \(BAD_5\) to be the event that D queries \(x_i, y_i\).
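The case split above can be phrased as a simple bookkeeping routine over a transcript of one run of the game. The sketch below is illustrative only: the labels `"K"`, `"G_X"`, `"G_Y"`, `"D"` and the representation of oracle queries as pairs are assumptions, not notation from the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[bytes, bytes]

def bad_events(xs: List[bytes], ys: List[bytes],
               queries: Dict[str, Iterable[Pair]]) -> Set[str]:
    """Return the names of the BAD_i events that occurred in one run.

    `queries` maps the (hypothetical) labels "K", "G_X", "G_Y", "D" to the
    random-oracle queries made by that part of the game."""
    pairs = list(zip(xs, ys))
    targets = set(pairs)
    events = set()
    if len(targets) < len(pairs):            # some (x_i, y_i) repeats
        events.add("BAD_1")
    names = {"K": "BAD_2", "G_X": "BAD_3", "G_Y": "BAD_4", "D": "BAD_5"}
    for who, event in names.items():
        if any(query in targets for query in queries.get(who, ())):
            events.add(event)
    return events
```

A run is "good" exactly when the returned set is empty, in which case every \(RO(x_i || y_i)\) is a fresh uniform value from the point of view of K, G, and D.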

Lemma 4

\(\Pr [BAD_1] \le \delta + \frac{q^2}{2^n}\)

First, we will bound \(\Pr [BAD_1]\). Let \(\mathcal {A}\) be an attacker for the underlying PRG game on \((K, G)\) which on input \(r_1,\dots ,r_q\) outputs 1 if \(r_i = r_j\) for some \(i \ne j\). It is clear that \(\Pr [\mathcal {A}(pk,{\textbf {out}}^q(G_{pk}, \mathcal {U}_{\mathcal {S}}))\rightarrow 1] \ge \Pr [BAD_1]\), and \(\Pr [\mathcal {A}(pk,\mathcal {U}_{qn}) \rightarrow 1] \le \frac{q^2}{2^n}\). But by public security of the PRG, \(\Pr [\mathcal {A}(pk,{\textbf {out}}^q(G_{pk}, \mathcal {U}_{\mathcal {S}}))\rightarrow 1] - \Pr [\mathcal {A}(pk,\mathcal {U}_{qn}) \rightarrow 1] \le \delta \). Thus, we have

$$\Pr [BAD_1] \le \delta + \frac{q^2}{2^n}$$
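The term \(\frac{q^2}{2^n}\) is the usual birthday-style bound on collisions among q uniform n-bit strings. As a sanity check, the following sketch computes the exact collision probability with rational arithmetic so that it can be compared against the bound for small parameters; the function name is an assumption made for this example.

```python
from fractions import Fraction

def collision_probability(q: int, n: int) -> Fraction:
    """Exact probability that q uniform n-bit strings are not all distinct."""
    N = 2**n
    no_collision = Fraction(1)
    for i in range(q):
        no_collision *= Fraction(N - i, N)   # i-th string avoids the first i
    return 1 - no_collision
```

For small q and n one can confirm directly that `collision_probability(q, n)` never exceeds `Fraction(q * q, 2**n)`.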

Lemma 5

\(\Pr [BAD_2] \le q q_K\sqrt{{\delta + \frac{q}{2^n}}}\)

We will bound \(\Pr [BAD_2]\) using the key lemma. We claim that

$$\Pr _{r \xleftarrow {\$}\mathcal {U}_\ell }[r\text { is }p\text {-weak}] \ge \sqrt{\Pr [BAD_2]}$$

for some suitable value of p. We will then use the key lemma to get an upper bound on \(\Pr [BAD_2]\).

Let r be such that

$$\Pr [BAD_2|(pk, sk) \leftarrow K(r)] \ge \sqrt{\Pr [BAD_2]}$$

We claim then that r is p-weak for some p to be specified later. Let \(F_r\) be the set of random oracle queries made by K(r). We can more precisely state

$$\Pr [BAD_2|(pk, sk) \leftarrow K(r)] = \Pr [(x_i, y_i) \in F_r\text { for some index }i|(pk, sk) \leftarrow K(r)]$$

In particular, we can ignore one output and see that this means

figure p

But since \(|F_r|\le q_K\), this means there must be some element \(\widetilde{x} \in F_r\) such that

figure q

But this precisely means that r is p-weak, for \(p = \frac{\sqrt{\Pr [BAD_2]}}{q_K}\). Thus,

$$\sqrt{\Pr [BAD_2]} \Pr [BAD_2] \le q_K^2 q^2\left(\delta + \frac{q}{2^n}\right)$$

and so as

$$\Pr [BAD_2]^2 \le \sqrt{\Pr [BAD_2]} \Pr [BAD_2],$$

we have

$$\Pr [BAD_2] \le q q_K\sqrt{{\delta + \frac{q}{2^n}}}.$$

Lemma 6

\(\Pr [BAD_3] \le q^2 q_G \sqrt{{\delta + \frac{q}{2^n}}}\)

To bound \(\Pr [BAD_3]\), we will again use the key lemma and show that

$$\Pr _{r \xleftarrow {\$}\mathcal {U}_\ell }[r\text { is }p\text {-weak}] \ge \sqrt{\Pr [BAD_3]}$$

for some suitable value of p.

Let r be such that

$$\Pr [BAD_3|(pk, sk) \leftarrow K(r)] \ge \sqrt{\Pr [BAD_3]}.$$

We claim then that r is p-weak for some p to be specified later. Note that since this probability is the average over s of \(\Pr [BAD_3|(pk, sk) \leftarrow K(r), s^X = s]\), there must be some \(\widetilde{s}\) such that

$$\Pr [BAD_3|(pk, sk) \leftarrow K(r), s^X = \widetilde{s}] \ge \sqrt{\Pr [BAD_3]}.$$

Let \(F_{r,\widetilde{s}}\) be the queries made by G when calculating \({\textbf {out}}^q(G_{pk}, \widetilde{s})\). Using a similar argument as in the previous paragraph, we see that there must be some pair \((\widetilde{x}, \widetilde{y}) \in F_{r,\widetilde{s}}\) such that

figure s

But note that \(|F_{r, \widetilde{s}}| \le q\cdot q_G\) as it is generated by running G q times. Thus, r is p-weak for \(p = \frac{\sqrt{\Pr [BAD_3]}}{q \cdot q_G}\). The same algebra as the previous lemma gives us

$$\Pr [BAD_3] \le q^2 q_G \sqrt{{\delta + \frac{q}{2^n}}}$$

Lemma 7

\(\Pr [BAD_4] \le q^2 q_G \sqrt{{\delta + \frac{q}{2^n}}}\)

The proof of this lemma is analogous to the proof for \(\Pr [BAD_3]\).

Lemma 8

\(\Pr [BAD_5]\le q q_D \sqrt{\delta + \frac{q}{2^n}}\)

To bound \(\Pr [BAD_5]\), we first notice that at the point when D first queries \(x_i, y_i\), the only information available to D is the secret key and the output of \(i-1\) random oracle calls. As at this point D has never queried any of its inputs, the probability that D succeeds at querying any input is the same as if D were given only the secret key.

Let us fix any initial randomness \(r \in \{0,1\}^\ell \) such that

$$\Pr [BAD_5|(pk, sk) \leftarrow K(r)] \ge \sqrt{\Pr [BAD_5]}.$$

We can clearly see that

figure u

But by union bound we then have

$$\Pr [BAD_5|(pk, sk) \leftarrow K(r)] \le q_D\max _{\widetilde{x} \in \{0,1\}^n}\Pr [x_i = \widetilde{x}\text { for some }i\in [q]].$$

The same reasoning as the previous arguments shows us that r is p-weak for \(p = \frac{\sqrt{\Pr [BAD_5]}}{q_D}\). Applying the key lemma gives us

$$\Pr [BAD_5] \le q q_D \sqrt{\delta + \frac{q}{2^n}}$$

Putting all the lemmas together, we have

$$\delta ' \le \Pr [BAD] \le \left(\delta + \frac{q^2}{2^n}\right) + \left(qq_K + 2q^2q_G + qq_D\right)\sqrt{\delta + \frac{q}{2^n}}$$

Noting that \(q_K,q_G,q_D\le t\) gives us our theorem.
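For completeness, this last step can be spelled out as follows, under the additional assumption (not stated explicitly above) that \(q \le t\), which is natural since D must at least read its q input blocks:

$$qq_K + 2q^2q_G + qq_D \le qt + 2q^2t + qt = 2(t + qt)q \le 2(t + t^2)q.$$

Substituting this into the previous inequality gives exactly the expression for \(\delta '\) in Theorem 10.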

5 Black Box Separation (with Limitations)

Definition 18

Let \(C:\{0,1\}^n \times \{0,1\}^n\rightarrow \{0,1\}^m\) be a function. We call an input \(x \in \{0,1\}^n\) “left-bad” if \(\max _{z \in \{0,1\}^m}\Pr _{y\in \{0,1\}^n}[C(x,y) = z] > \frac{1}{2}\). We define what it means for an input to be “right-bad” analogously.

We say that C is highly dependent on both inputs if

$$\Pr _{(x,y) \xleftarrow {\$}\{0,1\}^{2n}}[\text {{ x} is ``left-bad'' OR { y} is ``right-bad''}] \le negl(\lambda ).$$

Informally, a two-input function C is highly dependent on both inputs if it ignores one of its inputs at most a negligible proportion of the time. This is a rather broad category of functions. In particular, XOR, pairings, inner product, and random oracles are all highly dependent on both inputs. Furthermore, any collision resistant hash function must also be highly dependent on both inputs, otherwise it would be trivial to find a collision.
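For very small n, Definition 18 can be checked by brute force. The sketch below is an illustration with hypothetical helper names; it counts the left-bad inputs of a function given as a Python callable on n-bit integers.

```python
from collections import Counter
from fractions import Fraction
from typing import Callable

def is_left_bad(C: Callable[[int, int], int], x: int, n: int) -> bool:
    """x is left-bad if some output z satisfies Pr_y[C(x, y) = z] > 1/2."""
    counts = Counter(C(x, y) for y in range(2**n))
    return Fraction(max(counts.values()), 2**n) > Fraction(1, 2)

def left_bad_fraction(C: Callable[[int, int], int], n: int) -> Fraction:
    """Fraction of left inputs x in {0,1}^n that are left-bad."""
    return Fraction(sum(is_left_bad(C, x, n) for x in range(2**n)), 2**n)
```

With n = 4, XOR has no left-bad inputs, the projection \(C(x, y) = x\) makes every input left-bad, and bitwise AND has exactly one left-bad input (x = 0), so its left-bad fraction is \(\frac{1}{16}\).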

We show that it is hard to prove security (either weak or strong) for any 2-immunizer C which is highly dependent on both inputs. Note that one of the most common and useful techniques for proving security of cryptographic primitives is to create a black box reduction to some cryptographic assumption. Informally, a black box reduction transforms an attacker for some cryptographic primitive into an attacker for a cryptographic assumption. Thus, if the cryptographic assumption is immune to attack, the cryptographic primitive will be secure.

We show that if a 2-immunizer is highly dependent on both inputs, then there cannot be any black-box reduction of its security to any falsifiable cryptographic assumption.

Theorem 12

(Theorem 4 restated). Let C be a weak 2-immunizer which is highly dependent on both inputs. If there is a black-box reduction showing that C is \((poly(\lambda ), \lambda , negl(\lambda ), negl(\lambda ))\)-secure from the security of some cryptographic game \(\mathcal {G}\), then \(\mathcal {G}\) is not secure.

As a random oracle is highly dependent on both inputs, any reasonable hash function should also be highly dependent on both inputs. This implies that despite the fact that a random oracle is a strong 2-immunizer, it may be hard to argue security for any particular instantiation of the random oracle. We sketch the proof of this theorem in the next subsection. For the full argument, see the full version [6].

5.1 Proof Sketch for Theorem 12

The simulatable attacker paradigm. The simulatable attacker paradigm, first introduced by [14] and formalized by [44], is a method for transforming a black-box reduction into an attack against the underlying assumption. This paradigm was first used to prove black-box separations from all falsifiable assumptions in [26].

In particular, let \(\mathcal {C}\) be a cryptographic protocol with a black-box reduction to a cryptographic assumption \(\mathcal {G}\). Formally, we will describe the black-box reduction as an oracle algorithm \(\mathcal {B}^{\cdot }\) which breaks the security of \(\mathcal {G}\) whenever its oracle is a (possibly inefficient) adversary breaking the security of \(\mathcal {C}\).

A simulatable attack against \(\mathcal {C}\) is an (inefficient) attack \(\mathcal {A}\) which breaks the security game of \(\mathcal {C}\), but which can be simulated by an efficient algorithm \({\textbf {Sim}}\). In particular, oracle access to \(\mathcal {A}\) and \({\textbf {Sim}}\) should be indistinguishable to the black-box reduction \(\mathcal {B}^{\cdot }\). If this occurs, then since \(\mathcal {B}^{{\textbf {Sim}}}\) is indistinguishable from \(\mathcal {B}^{\mathcal {A}}\), \(\mathcal {B}^{{\textbf {Sim}}}\) is an efficient attack breaking the security game of \(\mathcal {G}\).

Note that for this paradigm to make sense, the simulator must have more capabilities than the inefficient adversary; otherwise the simulator itself would be an attack against \(\mathcal {C}\). In practice, this is done either by restricting the oracle queries made by the black-box reduction \(\mathcal {B}^{\cdot }\) or by restricting the power of the attacker \(\mathcal {A}\).

Black-box separations for 2-stage games. In 2013, Wichs gave a general framework for proving that two-stage games cannot be reduced to any falsifiable assumption [44]. In a two-stage security game the adversary consists of two algorithms which each have individual state, but are not allowed to communicate. Thus, a simulatable attack consists of the inefficient attack as well as two simulators, where the simulators do have shared state. This means that it is conceivable to have a simulator \({\textbf {Sim}}\) for which oracle access is indistinguishable from oracle access to \(\mathcal {A}\).

Note that if we have a simulatable attack of this form, then this simulator will fool every (efficient) black-box reduction. Thus, if we can prove that for every construction there exists a simulatable attack, this gives a black-box separation of the security definition from any falsifiable assumption.

Our simulatable attack. Note that an adversary against a 2-immunizer consists of both a set of PRGs and a distinguisher. Here, the PRGs and the distinguisher are not allowed to share state, and so we can hope to construct a simulatable attack in the style of [44].

Given any candidate 2-immunizer C, let \(G^X,G^Y\) be random functions and let D(y) be the algorithm which outputs 1 if there exists an \((s^X,s^Y)\) such that \(y = {\textbf {out}}^q(C(G^X, G^Y), (s^X,s^Y))\). It is clear that \(G^X,G^Y,D\) is an inefficient attack breaking the security of C.
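The inefficient attack can be sketched in Python at toy scale. This is only an illustration under stated assumptions: the parameters `SEED_BITS` and `OUT_BYTES`, the helper names, and the XOR combiner `toy_immunizer` are hypothetical stand-ins that do not come from the construction above, and a single output block is used in place of \({\textbf {out}}^q\).

```python
import secrets

SEED_BITS = 6   # toy seed length (assumption); real seeds have lambda bits
OUT_BYTES = 4   # toy output length (assumption), longer than both seeds combined


def sample_random_function():
    """Eagerly sample a random function from SEED_BITS-bit seeds to OUT_BYTES-byte outputs."""
    return [secrets.token_bytes(OUT_BYTES) for _ in range(2 ** SEED_BITS)]


def toy_immunizer(x: bytes, y: bytes) -> bytes:
    """Hypothetical stand-in for C: XOR of the two PRG outputs."""
    return bytes(a ^ b for a, b in zip(x, y))


def make_distinguisher(gx, gy, combine=toy_immunizer):
    """Build D by enumerating every seed pair; this brute force is what makes the attack inefficient."""
    image = {combine(x, y) for x in gx for y in gy}

    def D(y: bytes) -> int:
        return 1 if y in image else 0

    return D, image


gx, gy = sample_random_function(), sample_random_function()
D, image = make_distinguisher(gx, gy)
```

With these toy parameters the image contains at most \(2^{12}\) of the \(2^{32}\) possible strings, so D accepts every genuine output but accepts a uniform string with probability at most \(2^{-20}\).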

To simulate this, we simply replace \(G^X,G^Y\) with a lazy sampling oracle. That is, the first time \(G^X\) sees s, it will respond with a random value, and it will use the same response for future queries of s. To simulate D, the simulator will check if there exists an already queried \((s^X,s^Y)\) such that \(y = {\textbf {out}}^q(C(G^X, G^Y), (s^X,s^Y))\). Since the reduction querying the simulator is polynomially bounded, there will only be a polynomial number of already queried points, and so this simulator is efficient.
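A minimal sketch of such a simulator follows. The class names, the 16-byte output length, and the XOR combiner `toy_immunizer` are assumptions made for illustration, and a single output block again stands in for \({\textbf {out}}^q\).

```python
import os


class LazyRandomFunction:
    """Lazily sampled random function: fresh randomness on first query, cached afterwards."""

    def __init__(self, out_len: int = 16):
        self.out_len = out_len
        self.table = {}  # seed -> previously returned output

    def query(self, seed: bytes) -> bytes:
        if seed not in self.table:
            self.table[seed] = os.urandom(self.out_len)
        return self.table[seed]


def toy_immunizer(x: bytes, y: bytes) -> bytes:
    """Hypothetical stand-in for C: XOR of the two PRG outputs."""
    return bytes(a ^ b for a, b in zip(x, y))


class Simulator:
    """Simulates G^X, G^Y and D with shared state (the two query tables)."""

    def __init__(self, combine=toy_immunizer, out_len: int = 16):
        self.gx = LazyRandomFunction(out_len)
        self.gy = LazyRandomFunction(out_len)
        self.combine = combine

    def distinguish(self, y: bytes) -> int:
        # Only already queried seed pairs are checked, so the running time
        # is polynomial in the number of queries made so far.
        for x_out in self.gx.table.values():
            for y_out in self.gy.table.values():
                if self.combine(x_out, y_out) == y:
                    return 1
        return 0
```

The simulated distinguisher loops over at most (number of \(G^X\) queries) times (number of \(G^Y\) queries) pairs, in contrast to the exhaustive search over all seed pairs performed by D.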

It turns out that the only way to distinguish this simulator from the inefficient adversary is to find some y such that \(y = {\textbf {out}}^q(C(G^X, G^Y), (s^X,s^Y))\) where at least one of \(s^X\), \(s^Y\) is unqueried. If neither \(s^X\) nor \(s^Y\) has been queried before, then by a counting argument it is impossible to guess such a y. But if \(s^X\) has been queried before and C ignores \(s^Y\), then it is possible to guess \({\textbf {out}}^q(C(G^X, G^Y), (s^X,s^Y))\) without querying \(s^Y\). To avoid this problem, we simply assume that the output of C is dependent on both of its inputs, as in Definition 18.
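The problematic case can be made concrete with a toy experiment. Everything named here is hypothetical: `ignoring_immunizer` is a deliberately bad C that drops its second input, the seed space and output length are toy values, and the two worlds are coupled through shared tables `GX`, `GY` (lazy sampling would give the same distribution).

```python
import secrets

SEEDS = range(16)   # toy seed space (assumption)
OUT_BYTES = 8       # toy output length (assumption)

# Shared tables couple the real world and the simulated world.
GX = {s: secrets.token_bytes(OUT_BYTES) for s in SEEDS}
GY = {s: secrets.token_bytes(OUT_BYTES) for s in SEEDS}
queried_x, queried_y = set(), set()


def query_gx(s):
    queried_x.add(s)
    return GX[s]


def query_gy(s):
    queried_y.add(s)
    return GY[s]


def ignoring_immunizer(x, y):
    """Hypothetical C that is NOT dependent on both inputs: it drops y."""
    return x


def real_D(y, combine):
    """Inefficient distinguisher: searches over all seed pairs."""
    return int(any(combine(GX[a], GY[b]) == y for a in SEEDS for b in SEEDS))


def sim_D(y, combine):
    """Simulated distinguisher: searches only over already queried seed pairs."""
    return int(any(combine(GX[a], GY[b]) == y for a in queried_x for b in queried_y))


# The reduction queries only G^X, then predicts the output of C without touching G^Y.
x = query_gx(3)
guess = ignoring_immunizer(x, None)
real_answer = real_D(guess, ignoring_immunizer)  # some s^Y always works in the real world
sim_answer = sim_D(guess, ignoring_immunizer)    # no s^Y has been queried yet
```

Here the real distinguisher answers 1 while the simulated one answers 0, so a single query separates the two worlds. Requiring C to depend on both inputs rules out this kind of prediction.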