Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Cryptographic Hash Functions from Expander Graphs

Cryptology is the science and art of guaranteeing secure communications in the presence of adversaries. To design and analyze complex protocols, cryptographers combine simpler cryptographic primitives such as encryption schemes, digital signatures, block ciphers and hash functions. These elementary primitives may themselves be constructed from notoriously “hard” number-theoretic mathematical problems like the integer factorization and the discrete logarithm problems. A cryptographer’s security “proof” will typically show that the protocol cannot be broken without solving one of these upposedly “hard” problems. The actual “hardness” of the problems can then be investigated separately by mathematicians and cryptanalysts.

1.1 Cryptographic Hash Functions

Hash functions are a very important primitive in many cryptographic protocols, including message authentication codes, digital signatures, password storage applications or pseudorandom number generation. A cryptographic hash function \(H:\{0,1\}^*\rightarrow \{0,1\}^n\) takes arbitrary-length message inputs and it returns short, fixed-length hash values. A hash function is usually required to at least satisfy the following three properties:

  • Collision resistance: it must be hard to compute \(m,m'\in \{0,1\}^*\), \(m'\ne m\), such that \(H(m)=H(m')\).

  • Second preimage resistance: given \(m\in \{0,1\}^*\), it must be hard to compute \(m'\in \{0,1\}^*\), \(m'\ne m\), such that \(H(m)=H(m')\).

  • Preimage resistance: given \(h\in \{0,1\}^n\), it must be hard to compute \(m\in \{0,1\}^*\) such that \(h=H(m)\).

Hash funtions may also be required to satisfy much stronger security properties such as “behaving like a random function”. To satisfy these properties, classical hash functions like MD5 or SHA follow a heuristic design consisting in cutting the messages into smaller pieces, mixing the pieces together with some non-linear transformations, and then iterating the whole process until the output is “random enough”. In particular, the security of these hash functions must be studied independently of any “neat” mathematical problem. Collision-resistant hash functions can also be built from the integer factorization or the discrete logarithm problems [8, 12], but the resulting constructions are by far too inefficient in practice.

1.2 Cayley Hash Functions

Let G be a finite (non Abelian) group, and let \(S=\{g_0,g_1,\ldots ,g_{k-1}\}\) be a generator set for this group. The Cayley graph associated to (GS) is the directed graph (VE) such that any vertex \(v\in V\) corresponds to one group element and there is an edge from \(v_1\) to \(v_2\) if and only if \(v_2=v_1g\) for some \(g\in S\). The Cayley hash function associated to (GS) is defined as follows: the message m is first mapped onto \(m_1m_2\ldots m_N\in \{0,\ldots ,k-1\}^*\) according to some deterministic procedure, then the group element

$$h:=\prod _{i=1}^Ng_{m_i}$$

is computed, and h is finally mapped onto some bitstring. We remark that computing this product step by step corresponds to making a walk in the corresponding Cayley graph. The initial and final mapping have usually no impact on security, so we simply ignore them in the remaining of this paper. In particular, we slightly change the definition of a hash function such that it can take messages from \(\{0,\ldots ,k-1\}^*\) and return a hash value in G.

The first hash function following this design was proposed by Zémor in 1991 [37]. His choice of parameters was quickly invalidated by Tillich and himself [34], so they suggested new parameters with the group \(SL(2,\mathbb {F}_{2^n})\). The design was rediscovered ten years later by Charles, Goren and Lauter [7], this time with parameters taken from Lubotzky-Philips-Sarnak’s Ramanujan graphs [22]. We point out that all the parameters used in these particular instances were very “special”: the first two sets of parameters were chosen for efficiency reasons, and the last one to ensure that the hash values were distributed as uniformly as possible.

Cayley hash functions have a number of interesting properties compared to other hash functions. Their computation can be easily parallelized for a better efficiency, and even the non-parallelized version is competitive with classical hash functions for some parameters [9]. The near-uniformity of the output distribution can be linked with the expansion properties of the corresponding Cayley graphs [7, 16]. Last but not least, the main security properties of Cayley hash functions trivially rely on some “neat” mathematical problems.

1.3 Rubik’s for Cryptographers

For any G and S as before, we define the three following problems:

  • Balance problem: find \(m_i,m_i'\in \{0,\ldots ,k-1\}\) such that \(\prod _{i=1}^\ell g_{m_i}=\prod _{i=1}^{\ell '} g_{m_i'}\) and \(\ell ,\ell '\) “small”.

  • Representation problem: find \(m_i\in \{0,\ldots ,k-1\}\), such that \(\prod _{i=1}^\ell g_{m_i}=1\) and \(\ell \) “small”.

  • Factorization problem: given an element \(h\in G\), find \(m_i\in \{0,\ldots ,k-1\}\) such that \(\prod _{i=1}^\ell g_{m_i}=h\) and \(\ell \) “small”.

Clearly, the collision resistance of a Cayley hash function is equivalent to the hardness of the corresponding balance problem; its preimage resistance is equivalent to the hardness of the factorization problem; and the function cannot be second preimage resistant if the representation problem is not hard. Among the three problems, the factorization problem is the hardest one. The balance problem can be trivially solved if some generators in S commute, hence the restriction to non-Abelian groups. Interestingly, the factorization problem in non-Abelian groups can also be seen as a generalization of the well-known Rubik’s cube [28].

Although not classical in cryptography, the above problems are related to long-standing open problems in graph and group theory. In particular in the late eighties, Babai conjectured that there exists a constant \(c>0\) such that the diameter of any Cayley graph of any finite simple non-Abelian group G is bounded by \(\log ^c|G|\) [2, 15]. For an appropriate definition of “small”, the factorization problem essentially requires a constructive proof of Babai’s conjecture. Replacing “small” by “minimal”, it becomes equivalent to the NP-hard well-known word problem [11, 17].

1.4 Cryptanalysis Results

Babai’s conjecture has recently attracted a lot of attention after a breakthrough result of Helfgott for the group SL(2, p) (p prime) [15]. It is now proved for any generator set of any group of Lie type with a bounded rank [6, 29]. For permutation groups, a slightly weaker quasipolynomial bound on the diameter has been recently obtained for all generator sets [14]. Most of these results are combinatoric and non constructive in nature. Constructive proofs are known for almost all generator sets of permutation groups [3], so these groups should be avoided in the Cayley hash construction. When the group order is particularly smooth, the factorization problem can often be solved with subgroup attacks [10, 28, 32] so groups that possess a rich subgroup structure should also be avoided in teh construction.

In all other groups, the factorization problem can only be solved efficiently for a few generator sets, particularly chosen to make it easier [1, 4, 18, 19, 21, 30]. On the other hand, all the particular parameters suggested for Cayley hash functions have now been broken [13, 25, 27, 33, 35]. We stress, however, that all these parameters were very “special”: in all cases the parameters were either very “small” to optimize the efficiency [33, 37], or very “symmetric” to optimize the output distribution of the function [7, 26]. The cryptanalysis attacks exploited these particularities and are currently defeated with slight modifications of the generators.

2 The End of the Story ?

At first sight, the cryptanalysis of the four Cayley hash instances may cast some doubts on the design, but these doubts are unjustified or at least premature. After all, even well-established cryptographic schemes such as RSA can be easily broken for some “small” parameters [36]. In fact, the actual hardness of the factorization problem in non-Abelian groups is still a widely open problem today, despite many years of research by both the mathematics and cryptography communities. Any progress in that direction would lead to interesting constructive or destructive applications in cryptography, as well as other applications outside cryptography.

2.1 Open Problem: Finding Good Parameters for the Cayley Hash Construction

Cryptographic hash functions should be both secure and efficient to compute. The parameters chosen by Zémor [37] and Tillich and Zémor [34] led to a computation cost of only a few field additions per message bit, but these parameters turned out to be insecure. On the other hand, generic generator sets in the same groups seem secure today, but require a few field multiplications per message bits, which is by far less efficient. Slight modifications of previously broken functions have been proposed in [25, 28, 33] to counter existing attacks. Their actual security is still an open problem today.

2.2 Open Problem: Break All Relevant Parameters

More generally, the security of Cayley hash functions is still a widely open problem for most groups and generator sets.

Due to the particularly efficient arithmetic over binary fields, it is interesting to start with the group \(SL(2,2^n)\) used in the Tillich-Zémor hash function. The study of the factorization problem for generic generator sets has been initiated in [24]. In that paper it is shown through various heuristic reductions that it is enough to focus on generator sets with two matrices \(A,B\in SL(2,2^n)\) such that

  • \(A:=\left( {\begin{matrix}\lambda &{}0\\ 0&{}\lambda ^{-1}\end{matrix}}\right) \) is diagonal and \(B:=\left( {\begin{matrix}w+1&{}w\\ w&{}w+1\end{matrix}}\right) \) is orthogonal, for \(\lambda \in \mathbb {F}_{2^n}^*\) and \(w\in \mathbb {F}_{2^n}\).

  • \(A:=\left( {\begin{matrix}t_A&{}1\\ 1&{}0\end{matrix}}\right) \) and \(B:=\left( {\begin{matrix}t_B&{}1\\ 1&{}0\end{matrix}}\right) \) where \(t_A,t_B\in \mathbb {F}_{2^n}\).

Indeed, any algorithm solving the factorization problem in \(SL(2,2^n)\) for a significant subset of these parameters would also lead to a heuristic algorithm solving the factorization problem for any generator set of \(SL(2,2^n)\). Note that we obtain parameters equivalent to the (already broken) Tillich-Zémor parameters if \(t_A=t_B+1\), but the security of the above parameters is still an open problem in general [24, 28].

After the factorization problem is solved for \(SL(2,2^n)\), the next most interesting groups for the construction (and the cryptanalysis) will be other special linear groups over finite fields SL(mK). The group SL(2, K) is naturally embedded into SL(mK). Independently of the work on \(SL(2,2^n)\), it will be interesting to relate the security of SL(mK) to the security of SL(2, K). Such a reduction is another interesting open problem.

2.3 Open Problem: (Constructive) Proof of Babai’s Conjecture

Solving the factorization problem may appear harder than proving Babai’s conjecture at first sight since non constructive proofs are useless for the cryptanalysis. However, a Cayley hash function will already be considered as broken with a heuristic attack working with a “good enough” probability, whereas a proof of Babai’s conjecture shall eliminate all heuristic assumptions and be valid for all parameters. Proving Babai’s conjecture in all generality is of course a big open problem. Replacing or adapting known partial non-constructive proofs is another interesting open problem.

2.4 Open Problem: Further Cryptographic Applications

As discussed in the introduction, many cryptographic protocols ultimately rely on the actual hardness of some mathematical problems. The factorization problem in non-Abelian groups may very well be “hard enough” for cryptography when using generic parameters. In that case, it would be interesting to look for further cryptographic applications of the problem.

Abelian finite groups are ubiquituous in cryptography, but non-Abelian groups have not had the same success, in particular for public key cryptography [31]. While some computational problems in non-Abelian groups are seemingy hard in general, embedding a trapdoor (necessary for public key cryptography) has always required to specialize these problems to new, much weaker instances. It is an interesting open problem today to build a secure public key encryption scheme from the factorization problem in a non-Abelian group, or more generally from any computational problem in a non-Abelian group.

2.5 Open Problem: New Applications of Factorization Algorithms in Non-Abelian Groups

While the factorization problem in non-Abelian groups has interesting cryptographic applications when it is “hard”, easy instances might also find interesting applications outside cryptography. Cayley graphs tend to be good expanders graphs [5, 20], and expander graphs have a tremendous number of applications in building optimal networks, error correcting codes or derandomized algorithms, as well as in number theory and group theory [16, 23]. We now have good algorithms solving the factorization problem for some particular instances. It will be interesting to find new applications of these algorithms, in particular to improve some of the various applications of expander graphs.

3 Conclusion

The appealing properties of Cayley hash functions justify an extensive study of their security. Although particular instances were broken in the past, the generic underlying problems can still be considered as potential cryptographic “hard” problems, in particular due to their connection to an old-standing conjecture of Babai. We described several open problems related to the cryptanalysis of these functions and to new applications of the underlying problems.