Keywords

1 Introduction

The proliferation of cloud computing has attracted many attentions from academic and industrial communities since it provides powerful computing capability and considerable storage space. It can reduce Internet users’ local data management and maintenance cost significantly. Users can access cloud services whenever and wherever once they are authorized by service providers. Due to its merits, companies and individuals are willing to store their data in a remote cloud. Since users will lose their control on data after outsourcing their data to the cloud, they concern that the data may be illegally accessed by the cloud server administrator and network attackers. Considering the confidentiality of the outsourced data, users often encrypt it first, and then store the ciphertext to cloud servers. However, it is difficult to search an “exact” file among encrypted data stored in cloud.

In 2000, Song et al. [45] first proposed the definition of searchable encryption (SE). In [45], a data owner is allowed to encrypt both files and the corresponding keywords, and store the ciphertexts to cloud. When searching for a file with keywords \(W \), the data user generates a trapdoor using his/her secret key and further sends the trapdoor to the server. After receiving the trapdoor, the server searches out the encrypted file where the keywords \(W \) matches, and returns the search result to the user. Finally, the user can use the secret key to decrypt the ciphertext and obtain the file. In 2004, Boneh et al. [8] introduced the concept of public-key encryption with keyword search (PEKS), and constructed a concrete PEKS scheme based on bilinear groups with prime order. In 2006, Khader [25] proposed an identity-based PEKS derived from identity-based encryption (IBE). In 2007, Abdalla et al. [1] presented a generic construction of PEKS by using anonymous IBE, and discussed the consistency in PEKS schemes.

Previous PEKS schemes can only support simple query and the size of ciphertexts and trapdoor (search token) is super-polynomial in the number of keywords. In practice, fine-grained access control is required. In 2013, Lai et al. [28] proposed an expressive searchable encryption scheme based on KP-ABE scheme. This scheme supports any monotonic formula, for example, (“sender : Bob AND priority : urgent OR subject : recruitment”). However, the trapdoor can leak the information of keywords, namely the test algorithm can detect whether the encrypted data contains some keywords in trapdoor. In 2014, Lv et al. [38] proposed an expressive and secure asymmetric searchable encryption (ESASE) scheme, which was based on an asymmetric bilinear group with composite order and supports non-monotonic query. Nevertheless, the scheme only disclosed whether the keywords in the trapdoor are primed or not. In 2016, Cui et al. [14] proposed an efficient and expressive keyword searchable encryption scheme constructed in a bilinear group with prime order. The scheme is selectively secure in the standard model. It supports keyword search policies in terms of conjunctive, disjunctive and any monotonic Boolean formula. However, it brings some critical issue to search efficiency. In most existing expressive searchable encryption schemes derived from ABE, both the size of ciphertext and the search cost are linear in the number of keywords. Specifically, in the test (search) algorithm, it usually requires one pairing operation for a single keyword (embedded in a given ciphertext). Hence, the existing expressive searchable encryption schemes built on top of ABE are not efficient and scalable.

Attrapadung et al. [2] and Hohenberger et al. [24] presented KP-ABE schemes with constant-size ciphertext and fast decryption, respectively. In 2014, Lai et al. [27] proposed a new KP-ABE with constant-size ciphertext and fast decryption, which is adaptability secure in the standard model. KP-ABE schemes do not consider the privacy issue of attributes associated with ciphertext. However, searchable encryption requests that ciphertext should not reveal any information about keywords except that a valid trapdoor is provided.In this paper, we propose a new efficient key-policy attribute-based searchable encryption (KP-ABSE) scheme which is derived from an asymmetric bilinear group with prime order. In this scheme, the privacy of keyword in both ciphertext and trapdoor are addressed. Moreover, both the size of ciphertext and the computation cost of the test algorithm are constant. Compared with expressive searchable encryption based on bilinear groups with prime order, our work is more efficient.

1.1 Technical Roadmap

Protecting Privacy of Keywords in Ciphertext. (1) We use anonymity from the asymmetric technique [15] to encrypt keywords in group \({\mathbb {G}}\); while trapdoors are generated in group \(\hat{\mathbb {G}}\) to prevent cloud servers, and adversaries from raising keyword guess attacks using pairing operations [5]. As claimed in [15], asymmetric bilinear groups provide good properties, including compact representation of group elements, a flexible choice of elliptic curve implementation [18] and strong security [20]. (2) We use the linear splitting technique [9] to split the random exponent used to hide keywords into two parts. As a result, adversaries cannot obtain any information about keywords even if they acquire the ciphertext and public parameters. Secret keys are randomized in the test algorithm.

Protecting Privacy of Keywords in Access Structure. We divide each keyword into two parts: the keyword name and the keyword value [26]. In practice, keyword values are more sensitive than keyword names. If the set of attributes associated with a users private key does not satisfy the access structure associated with a ciphertext, attribute values in the access structure are hidden, while other information, such as attribute names, about the access structure is public. Suppose that the access structure in personal health database is (illness = diabetes) OR (gender = male) OR (department = medical) OR (affiliation= city hospital) where illness, gender, department and affiliation are keyword names and diabetes, male, medical and city hospital are keyword values. The keyword names contains less sensitive information and can be released, while keyword values are very sensitive and should be kept secret. Hence, in our scheme, we mainly consider to protect the privacy of keyword values. PKES is subject to the offline keywords dictionary guessing attacks since anyone who knows the trapdoor and public parameters can conclude the value embedded in the trapdoor by executing exhaustive search. To prevent the above attacks, the designated technique [43] is used. The idea is that trapdoors are encrypted under the public key of the cloud server such that adversaries cannot acquire any information about keywords without knowing the secret key. Therefore, trapdoors can be transferred in public channels.

1.2 Contributions

We propose a new key-policy attribute-based search encryption scheme (KP-ABSE) which is derived from KP-ABE in asymmetric bilinear group with prime order. The proposed scheme has the following good properties: (1) It is expressive and supports any monotonic access structure; (2) It has constant-size ciphertext and supports fast decryption; (3) The number of pairing operations needed in the test algorithm is constant. Therefore, it reduces the computation cost on cloud server side as well as communication cost between the data users and cloud. One disadvantage of our scheme is that the size of trapdoors is \(\mathcal {O}(n\cdot \ell )\), where n is the number of attributes in the system and \(\ell \) is the number of leaf nodes in the access structure. Note that we will regard this as an open problem of our research work. However, depending on applications, one should take into consideration if the increase of trapdoor size is worthy.

1.3 Related Work

Attribute-Based Encryption. To implement fine-grained access control on sensitive data, Sahai and Water [44] introduced the definition of attribute-based encryption (ABE). ABE schemes can be classified into two types: key-policy ABE (KP-ABE) [21] and ciphertext-policy ABE (CP-ABE) [4]. In a KP-ABE scheme [21], secret keys are associated with access structures; while ciphertexts are labeled with sets of attributes. A user can decrypt a ciphertext if and only if the access structure associated with his secret key can be satisfied by the attributes labeled in ciphertexts. On the contrary, in a CP-ABE scheme [4], secret keys are labeled with sets of attributes; while ciphertexts are associated with access structures.

Goyal et al. [21] proposed a KP-ABE scheme which supports any monotonic access structure. Later, Ostrovsky et al. [41] presented a KP-ABE system which supports non-monotonic access structures. Lewko et al. [29] proposed the first fully secure KP-ABE scheme supporting any monotonic access structure. Chase et al. [10, 11] considered multi-authority KP-ABE schemes. The first CP-ABE was proposed by Bethencourt et al. [4] and was proven to be secure in the generic group model. Later, Cheung and Newport [12] presented a CP-ABE scheme which is secure in the standard model; while, it can only support restricted access structures, for example AND gate. Lewko et al. [30] considered multi-authority CP-ABE schemes to reduce the trust on central authority. Some ABE variants and applications can be seen in [32, 33, 39, 40, 46].

Attribute-Based Encryption with Fast Decryption. In KP-ABE schemes, both the size of the ciphertext and the decryption cost are linear with the number of required attributes. To reduce the size of ciphertext and decryption cost, some new KP-ABE were presented [2, 27, 44]. Meanwhile, in CP-ABE scenario, the size of ciphertext and decryption cost were also considered. Emura et al. [16] proposed a CP-ABE scheme with constant-size ciphertext which can only supports restricted access structures, such as AND gate. Herranz et al. [23] described a CP-ABE scheme with constant-size ciphertext which supports threshold access structures. Hohenberger [24] proposed a KP-ABE with fast decryption. In [24], the decryption cost is constant, instead of linear with the number of required attributes. In 2014, Lai et al. [27] proposed a KP-ABE with constant-size ciphertext and fast decryption.

Keyword search over Encrypted Data. Boneh et al. [8] initiated the research on PEKS and gave a specific construction which only supports equality queries. Abdalla et al. [1] addressed the consistency in PEKS schemes, and analyzed the relationship between PEKS and anonymous IBE. To guarantee the correctness of the searching results, verifiable keyword search schemes have been proposed [3, 17, 42]. In these schemes, each keyword is represented as the root of one polynomial. It is easy to check whether a keyword is included by evaluating the polynomial on the keyword and verify whether the output is zero or not. Zheng et al. [48] proposed a novel PEKS called verifiable attribute-based keyword search (VABKS). This allows legitimate data users to outsource the (often costly) search operations to cloud servers and verify whether cloud servers have faithfully executed the search operations. Some variants of ABE searchable encryption have been proposed in [34,35,36,37].

1.4 Organization

The rest of this paper is organized as follows. In Sect. 2, we briefly review definitions and models used in this paper. Section 3 describes the preliminaries used throughout this paper and notions of KP-ABSE. In Sect. 4, a concrete KP-ABSE scheme is presented. We compare our work with other related works in Sect. 5. Section 6 concludes the paper.

2 System Definitions

2.1 System Algorithms

A key-policy attribute-based search encryption (KP-ABSE) system includes four parties, namely, data owner, cloud server, Trusted Key Generator (TKG), and data user.

Definition 1

A KP-ABSE system consists of the following algorithms [14]:

  1. 1.

    \(Setup(1^{\lambda })\rightarrow (pars, msk)\): intaking a security parameter \(\lambda \), the TKG runs the setup algorithm to construct the public parameters pars, and the master secret key msk. The pars is published, while the msk is kept secret.

  2. 2.

    \(sKeyGen(pars)\rightarrow (pk_s, sk_s)\): intaking pars, the TKG runs the server key generation algorithm to construct the public key \(pk_s\) and the private key \(sk_{s}\) for the cloud server.

  3. 3.

    \(Encrypt(pars, \mathbf W )\rightarrow CT\): intaking pars, and a set of keywords W, a data owner runs the encryption algorithm to output a ciphertext CT.

  4. 4.

    \(Trapdoor(pars, msk, pk_s,\mathbb {A})\rightarrow T_\mathbb {M}\): intaking pars, msk, \(pk_s\) and an access structure \(\mathbb {A}\) (corresponding to some keyword set), the TKG runs the trapdoor generation algorithm to construct a trapdoor \(T_\mathbb {M}\), and further sends \(T_\mathbb {M}\) to the cloud server.

  5. 5.

    \(Test(pars, sk_s, CT, T_\mathbb {M}) \rightarrow 0/1\): Intaking pars, \(sk_s\), CT and \(T_\mathbb {M}\), the cloud server runs the test algorithm. It outputs 1 if the keyword set embedded in CT matches the access structure in \(T_\mathbb {M}\), and 0 otherwise.

Correctness: A key-policy attribute-based search encryption is correct if

$$\begin{aligned} \Pr \left[ \begin{array}{l|l} &{} Setup(1^{\lambda })\rightarrow (pars, msk);\\ Test(pars, sk_s, CT, T_\mathbb {M}) \rightarrow 1 &{} Encrypt(pars, \mathbf W )\rightarrow CT;\\ &{} sKeyGen(pars)\rightarrow (pk_s, sk_s);\\ &{} Trapdoor(pars, msk, pk_s,\mathbb {A})\rightarrow T_\mathbb {M} \end{array}\right] =1. \end{aligned}$$
Fig. 1.
figure 1

System workflow

2.2 System Workflow

The architecture of our system workflow is shown in Fig. 1, which is composed of four entities: a trusted key generator (TKG) who publishes the system parameter and holds a master private key and is responsible for trapdoor generation for the system. We may regard the TKG as trusted device(s), like TPM. A user may make use of this device in some untrusted computers (like those in library or public area) to generate a token for further search. But the device may not have sufficient knowledge about positive or negative cases (on access control rules). Because it may not be allowed to access, say the access control list. data owners who outsource encrypted data to a public cloud, data users who are privileged to search and access encrypted data, and a cloud server who executes the keyword search operations for data users. To enable the cloud server to search over ciphertexts, the data owners append every encrypted document with encrypted keywords. A data user issues a trapdoor request by sending a keyword access structure to the TKG which generates and returns a trapdoor corresponding to the access structure. After obtaining a trapdoor, the data user sends the trapdoor and the corresponding partial hidden access structure (i.e., the access structure without keyword values) to the designated cloud server. The latter performs the testing operations between each ciphertext and the trapdoor using its private key, and forwards the matching ciphertexts to the data user.

2.3 Adversary Models

In this paper, we assume that data owner, data user and the cloud server are semi-trusted, while the TKG is fully trusted. However, for a data user, he/she may choose to guess the keyword set embedded into a given ciphertext without the help of the server. For a “curious” server, it may curiously guess the keyword set in the ciphertext of which the corresponding search trapdoor is not given; it may also guess the keyword information from a given search trapdoor. Therefore, we define the following three security models.

Indistinguishability Against Chosen Keyword-Set Attacks (IND-CKA). This security model focuses on the privacy of the keyword set associated with a given ciphertext. There are two kinds of adversaries in this model, one is outside-attacker, and the other is the cloud server itself. Below, we define two security games by constructing interactions between a challenger \(\mathcal B\) and an adversary \(\mathcal A\).

IND-CKA Security for Outsider. This security game between \(\mathcal{A}_1\) and \(\mathcal B\) is used to show that a system outsider, without the help of the cloud server, cannot tell if a given ciphertext contains some specified keyword set (here the outsider is allowed to commit to two known keyword sets at the outset of the game).

Definition 2

A KP-ABSE scheme is IND-\(CKA^{\mathcal{A}_1}\) secure if no PPT adversary \(\mathcal{A}_1\) can win the game below with non-negligible advantage [14].

  1. 1.

    Init. \(\mathcal{A}_1\) commits to two equal length challenge keyword sets \(\mathbf W _0^*\), \(\mathbf W _1^*\).

  2. 2.

    Setup. \(\mathcal B\) runs \(Setup(1^{\lambda })\), and further sends pars to \(\mathcal{A}_1\). It runs sKeyGen(pars) and next returns \(pk_s\) to \(\mathcal{A}_1\).

  3. 3.

    Phase 1. \(\mathcal{A}_1\) issues search trapdoor queries to \(\mathcal B\) by submitting (\(\mathbb {M}_1\), \(\rho _1\), \(\{W_{\rho _1(i)}\}\)), ..., (\(\mathbb {M}_{q_1}\), \(\rho _{q_1}\), \(\{W_{\rho _{q_1}(i)}\}\)). \(\mathcal B\) returns the corresponding trapdoors to \(\mathcal{A}_1\) by running the algorithm Trapdoor.

  4. 4.

    Challenge. \(\mathcal B\) returns the challenge ciphertext \(CT^{*}=Encrypt(pars\), \(\mathbf W _{\beta }^*)\) to \(\mathcal{A}_1\), where \(\beta \in _R \{0,1\}\). Note that the challenge ciphertext cannot match any trapdoor constructed in Phase 1 (namely, both of the challenge keyword sets cannot match the given trapdoors).

  5. 5.

    Phase 2. \(\mathcal{A}_1\) continues making queries as in Phase 1, by issuing (\(\mathbb {M}_{q_1+1}\), \(\rho _{q_1+1}\), \(\{W_{\rho _{q_1+1}(i)}\}\)), ..., (\(\mathbb {M}_q\), \(\rho _q\), \(\{W_{\rho _q(i)}\}\)), with a restriction that the queries cannot match the given challenge keyword sets.

  6. 6.

    Guess. \(\mathcal{A}_1\) outputs a guess bit \(\beta ' \in \{0,1\}\). If \(\beta =\beta '\), \(\mathcal{A}_1\) wins.

The advantage of \(\mathcal{A}_1\) is defined as \(Adv_{{\mathcal{A}_1}}(1^{\lambda })=|Pr[\beta '=\beta ]-\frac{1}{2}|\).

IND-CKA Security for the Cloud Server. This security game between \(\mathcal{A}_2\) and \(\mathcal B\) is used to show that the cloud server, without a valid search trapdoor, cannot tell if a given ciphertext contains some specified keyword set (here the cloud server is allowed to commit to two “known” keyword sets in advance).

Definition 3

A KP-ABSE scheme is IND-\(CKA^{\mathcal{A}_2}\) secure if no PPT adversary \(\mathcal{A}_2\) can win the game below with non-negligible advantage [14].

  1. 1.

    Init. \(\mathcal{A}_2\) commits to two equal length challenge keyword sets \(\mathbf W _0^*\), \(\mathbf W _1^*\).

  2. 2.

    Setup. \(\mathcal B\) runs \(Setup(1^{\lambda })\) to send pars to \(\mathcal{A}_2\). It further runs sKeyGen(pars) to return \(pk_s, sk_s\) to \(\mathcal{A}_2\).

  3. 3.

    Phase 1. \(\mathcal{A}_2\) issues search trapdoor queries to \(\mathcal B\) by submitting (\(\mathbb {M}_1\), \(\rho _1\), \(\{W_{\rho _1(i)}\}\)), ..., (\(\mathbb {M}_{q_1}\), \(\rho _{q_1}\), \(\{W_{\rho _{q_1}(i)}\}\)). For each query (\(\mathbb {M}_j, \rho _j, \{W_{\rho _j(i)}\}\)), \(j\in [1,q_1]\), \(\mathcal B\) returns the corresponding trapdoor \(T_{\mathbb {M}_j}\) to \(\mathcal{A}_2\) by running the algorithm Trapdoor.

  4. 4.

    Challenge. \(\mathcal B\) randomly chooses \(\beta \in \{0,1\}\) and returns the challenge ciphertext \(CT^{*}=Encrypt(pars\), \(\mathbf W _{\beta }^*)\) to \(\mathcal{A}_2\) with a restriction that the challenge ciphertext cannot match any trapdoor given in Phase 1.

  5. 5.

    Phase 2. \(\mathcal{A}_2\) continues making queries by issuing (\(\mathbb {M}_{q_1+1}\), \(\rho _{q_1+1}\), \(\{W_{\rho _{q_1+1}(i)}\}\)), ..., (\(\mathbb {M}_q\), \(\rho _q\), \(\{W_{\rho _q(i)}\}\)), with a restriction that the queries cannot match the given challenge keyword sets.

  6. 6.

    Guess. \(\mathcal{A}_2\) outputs a guess bit \(\beta ' \in \{0,1\}\). If \(\beta =\beta '\), \(\mathcal{A}_2\) wins.

The advantage of \(\mathcal{A}_2\) is defined as \(Adv_{{\mathcal{A}_2}}(1^{\lambda })=|Pr[\beta '=\beta ]-\frac{1}{2}|\).

For \(\mathcal{A}\in \{\mathcal{A}_1,\mathcal{A}_2\}\), an KP-ABSE system is selectively IND-CKA secure if the advantage function referring to the security \(Game^{(IND)}_{\varPi ,\mathcal A}\), \(Adv^{(IND)}_{\varPi ,\mathcal A}(\lambda )=Pr[\beta \ne \beta ']-\frac{1}{2}\) is negligible in the security parameter \(\lambda \) for any probabilistic polynomial time adversary algorithm \(\mathcal A\).

3 Preliminaries

3.1 Bilinear Maps

Let \(\mathbb {G}\), \(\hat{\mathbb {G}}\) and \(\mathbb {G}_T\) be all multiplicative groups of prime order \(p \in \varTheta (2^{\lambda })\), respectively generated by g,\(\hat{g}\) and \(e: \mathbb {G}\times \hat{\mathbb {G}}\rightarrow \mathbb {G}_T\) is an efficient bilinear map with the following properties: (1) Bilinearity: for all \(a, b \in _R \mathbb {Z}_p\), \(e(g^a,\hat{g}^b)=e(g,\hat{g})^{ab}\); (2) Non-degeneracy: \(e(g,\hat{g})\ne 1_{\mathbb {G}_T}\), where \(1_{\mathbb {G}_T}\) is the unit of \(\mathbb {G}_T\); (3) Computability: for all \(g\in \mathbb {G}\) and \(\hat{g}\in \hat{\mathbb {G}}\), \(e(g,\hat{g})\) can be computed efficiently.

3.2 Complexity Assumptions

Definition 4

Asymmetric Decisional Bilinear Diffie-Hellman (DBDH) Assumption [47] is that all Probabilistic Polynomial Time (PPT) algorithm \(\mathcal A\) have an advantage negligible in \(\lambda \) of distinguishing \(e(g,\hat{g})^{abc}\in \mathbb {G}_T\) from a random element in \(\mathbb {G}_T\) by given the vector \(y=(g,g^a,g^c,\hat{g},\hat{g}^a,\hat{g}^b)\). The advantage of \(\mathcal A\) is defined as \(|Pr[\mathcal{A} (y, e(g,\hat{g})^{abc}) =1] - Pr[\mathcal{A}(y,Z)=1]|\), where the probability is over the randomly chosen \(g\leftarrow \mathbb {G}\),\(\hat{g}\leftarrow \hat{\mathbb {G}}\), abc, and the random bits consumed by \(\mathcal A\).

Definition 5

Asymmetric Decisional q-Bilinear Diffie-Hellman Exponent (q-BDHE) Assumption [6] is that all PPT algorithms \(\mathcal A\) have an advantage negligible in \(\lambda \) of distinguishing \(e(g,\hat{g})^{a^{q+1}b} \in \mathbb {G}_T\) from a random element in \(\mathbb {G}_T\) by given the vector

$$ {\begin{matrix} y={g, g^b, g^a,g^{a^2},...,g^{a^q},g^{a^{q+2}}, \hat{g},\hat{g}^a,\hat{g}^{a^2},...,\hat{g}^{a^q},\hat{g}^{a^{q+2}}...,\hat{g}^{a^{2q}},T} \end{matrix}} $$

The advantage of \(\mathcal A\) is defined as \(|Pr[\mathcal{A} (y, e(g,\hat{g})^{a^{q+1}b}) =1] - Pr[\mathcal{A}(y,T)=1]|\), where the probability is over the randomly chosen ab, and the generator g,\(\hat{g}\), and the random bits consumed by \(\mathcal A\).

Definition 6

Asymmetric Decisional Linear Assumption [7] is that all PPT algorithms \(\mathcal A\) have an advantage negligible in \(\lambda \) of distinguishing \(Z=g^{x_3+x_4} \in \mathbb {G}\) from a random element in \(\mathbb {G}\) by given the vector \(y=\{g\), \(g^{x_1}\), \(g^{x_2}\), \(g^{x_1x_3}\), \(g^{x_2x_4}\), \(\hat{g}\), \(\hat{g^{x_1}}\),\(\hat{g^{x_2}}\}\). The advantage of \(\mathcal A\) is defined as \(|Pr[\mathcal{A} (y, g^{x_3+x_4} =1] -\) \( Pr[\mathcal{A}(y, Z)=1]|\), where the probability is over the randomly chosen \(x_1,x_2,x_3,\) \(x_4\in \mathbb {Z}_p\), and the random bits consumed by \(\mathcal A\). We remark that the elements \(\hat{g},\hat{g^{x_1}},\hat{g^{x_2}}\) were not explicitly included in Boenh’s et al. original formulation.

3.3 Building Blocks

Definition 7

Access Structure [31]. Let \(\{P_1,...,P_n\}\) be a set of parties. A collection \(\mathbb {A} \subseteq 2^{\{P_1,...,P_n\}}\) is monotone if \(\forall B, C\): \(B \in \mathbb {A}\) and \(B \subseteq C\), then \(C \in \mathbb {A}\). An access structure (respectively, monotone access structure) is a collection (respectively, monotone collection) \(\mathbb {A}\) of non-empty subsets of \(\{P_1,...,P_n\}\), i.e., \(\mathbb {A} \subseteq 2^{\{P_1,...,P_n\}} \setminus \{\}\). The set in \(\mathbb {A}\) are called the authorized sets, and the sets not in \(\mathbb {A}\) are called the unauthorized sets.

Note in our setting keywords will play the role of parties and we only consider the monotone access structures, and the negation of a keyword is regarded as a separate keyword.

Definition 8

Linear Secret-Sharing Schemes (LSSS) [31]. A secret sharing scheme \(\varPi \) over a set of parties P is called linear (over \(\mathbb {Z}_p\)) if

  1. 1.

    The shares for each party form a vector over \(\mathbb {Z}_p\).

  2. 2.

    There exists a matrix \(\mathbb {M}\) called the share-generating matrix for \(\varPi \). The matrix \(\mathbb {M}\) has l rows and n columns. For all \(i=1, ...., l\), the ith row of \(\mathbb {M}\) is labeled by a party \(\rho (i)\) (\(\rho \) is a function from \(\{1,...,l\}\) to P). When we consider the column vector \(v=(\alpha ,r_2,...,r_n)\), where \(\alpha \in \mathbb {Z}_p\) is the secret to be shared and \(r_2,...,r_n \in \mathbb {Z}_p\) are randomly chosen, then \(\mathbb {M}{} \mathbf v \) is the vector of l shares of the secret \(\alpha \) according to \(\varPi \). The share \((\mathbb {M}{} \mathbf v )_i\) belongs to party \(\rho (i)\).

The linear reconstruction property: let \(\varPi \) be an LSSS for access structure \(\mathbb {A}\), \({{\varvec{W}}}\) denote an authorized set, and define \(I \subseteq \{1,...,l\}\) as \(I=\{i| \rho (i) \in {{\varvec{W}}}\}\). The vector (1, 0, ..., 0) is in the span of rows of \(\mathbb {M}\) indexed by I, and there exist constants \(\{\omega _i \in \mathbb {Z}_p\}_{i\in I}\) such that, for any valid shares \(\{\lambda _i\}\) of a secret \(\alpha \) according to \(\varPi \), we have \(\sum _{i\in I} \omega _i\lambda _i=\alpha \). These constants \(\{\omega _i\}\) can be found in time polynomial in the size of share-generating matrix \(\mathbb {M}\). But for unauthorized sets of rows I, the target vector is not in the span of the rows of the set I. Moreover, there will exists a vector \(\omega \), such that \(\omega \cdot (1,0,...,0)=-1\) and \(\omega \cdot \mathbb {M}_i=0\) for all \(i \in I\).

Definition 9

Target Collision Resistant Hash Function [13]. A TCR hash function H guarantees that given a random element x which is from the valid domain of H, a PPT adversary A cannot find \(y\ne x\) such that \(H(x)=H(y)\). We let \(Adv_{H,A}^{TCR}=Pr[(x,y)\leftarrow \mathcal{A}(1^{k}): H(x)=H(y), x\ne y, x, y \in DH]\) be the advantage of A in successfully finding collisions from a TCR hash function H, where DH is the valid input domain of H, k is the security parameter. If a hash function is chosen from a TCR hash function family, \(Adv_{H,A}^{TCR}\) is negligible.

4 A New KP-ABSE

4.1 Construction

  • \(\mathbf{Setup }(1^\lambda ) \rightarrow (pars,msk)\). The setup algorithm takes as input a security parameter \(1^\lambda \). It chooses bilinear groups \(\mathbb {G},\hat{\mathbb {G}}\) of prime order p with generators \(g,\hat{g}\), respectively. It symmetrically random chooses \(u,h,\delta \in \mathbb {G}, \hat{u},\hat{h},\hat{\delta }\in \hat{\mathbb {G}}\) and \(\alpha ,d_1,d_2,d_3,d_4 \in \mathbb {Z}^*_p\). It then sets \(g_1=g^{d_1},g_2=g^{d_2},g_3=g^{d_3},g_4=g^{d_4}\). It also chooses a collision-resistant hash function H that maps group elements in \(\mathbb {G}_T\) to group elements in \(\mathbb {G}\). The public parameters pars and the master secret key msk are given by

    $$pars=(H,g,u,h,\delta ,\hat{u},g_1,g_2,g_3,g_4,e(g,\hat{g})^{\alpha }),$$
    $$msk=(\alpha ,\hat{g},\hat{h},\hat{\delta },d_1,d_2,d_3,d_4).$$
  • \(\mathbf{sKeyGen }(pars) \rightarrow (pk_s,sk_s)\). The algorithm takes as input the public parameter pars. It randomly chooses \(\kappa \in \mathbb {Z}^*_p\) and outputs the public and private key pair \((pk_s,sk_s)=(g^{\kappa },\kappa )\) for the cloud server.

  • \(\mathbf{Trapdoor }(pars\), \(pk_s\), msk, \(\mathbb {A}=(\mathbb {M}\), \(\rho \),\(\mathcal {T})) \rightarrow T_{\mathbb {M},\rho }\). The algorithm takes as input the public parameter pars, the server public key \(pk_s\), the master private key msk and an LSSS access structure \((\mathbb {M},\rho ,\mathcal {T})\), where \(\mathbb {M}\) is \(l\times n\) share-generating matrix, \(\rho \) is a map from each row of \(\mathbb {M}\) to an attribute name, \(\mathcal {T}=(z_{\rho (1)},...,z_{\rho (l)})\) and \(z_{\rho (i)}\) is the value of keyword name \(\rho (i)\) specified by the access formula. It randomly chooses a vector \(\mathbf {v}=(\alpha ,y_2,...,y_n)\in \mathbb {Z}_p^{n}\), and computes \(\lambda _i=\mathbf {v}\cdot \mathbb {M}_i\) for each \(i=[l]\). Let \(Q_i\) denote the set \([n] \setminus \{\rho (i)\}\) for each \(i\in [l]\). For each row \(\mathbb {M}_i\) of \(\mathbb {M}\), it chooses random \(r,r',t_{1,1},t_{1,2},...,t_{l,1},t_{l,2}\in \mathbb {Z}_p\), computes \(D=g^r,\hat{D}=\hat{g}^{r'}\), and outputs the trapdoor as \(T_{\mathbb {M},\rho }=((\mathbb {M}\),\(\rho )\), D, \(\hat{D}\),\(\{D_i,R_i,T_{i,1},T_{i,2},T_{i,3},T_{i,4},\{Q_{i,j},Q'_{i,j},Q''_{i,j},Q'''_{i,j}\}_{j\in Q_i}\}_{i\in [l]})\)

    $$D_i=\hat{g}^{\lambda _i}\hat{\delta }^{d_1d_2t_{i,1}+d_3d_4t_{i,2}},R_i=H(e(pk_s,\hat{D})^{r})\cdot \hat{g}^{d_1d_2t_{i,1}+d_3d_4t_{i,2}},$$
    $$T_{i,1}=(\hat{u}^{z_{\rho (i)}}\hat{h})^{-d_2t_{i,1}},Q_{i,j}=(\hat{u}^{z_j})^{-d_2t_{i,1}};$$
    $$T_{i,2}=(\hat{u}^{z_{\rho (i)}}\hat{h})^{-d_1t_{i,1}},Q'_{i,j}=(\hat{u}^{z_j})^{-d_1t_{i,1}};$$
    $$T_{i,3}=(\hat{u}^{z_{\rho (i)}}\hat{h})^{-d_4t_{i,2}},Q''_{i,j}=(\hat{u}^{z_j})^{-d_4t_{i,2}};$$
    $$T_{i,4}=(\hat{u}^{z_{\rho (i)}}\hat{h})^{-d_3t_{i,2}},Q'''_{i,j}=(\hat{u}^{z_j})^{-d_3t_{i,2}}.$$
  • \(\mathbf{Encrypt }(pars,{{\varvec{W}}}=(\mathrm{w_1,...,w_n})) \rightarrow CT\). The algorithm takes as input the public parameter pars and a keyword set \({{\varvec{W}}}\) (each keyword is denoted as keyword name and keyword value, i is the generic keyword name and \(\mathrm {w_i}\) is the corresponding keyword value), where \(\mathrm {w_1,...,w_n}\in \mathbb {Z}_p\) are the values of \({{\varvec{W}}}\). It chooses random \(\mu ,s,s_1,s_2\in \mathbb {Z}_p\), and outputs a ciphertext \(CT=(C,C',C'',E_1,E_2,E_3,E_4)\) as

    $$C=e(g,\hat{g})^{\alpha \mu },C'=g^{\mu },C''=\delta ^{-\mu }(h\prod _{i=1}^{n}u^\mathrm{{w_i}})^s$$
    $$E_1=g_1^{s-s_1},E_2=g_2^{s_1},E_3=g_3^{s-s_2},E_4=g_4^{s_2}.$$
  • \(\mathbf{Test }(pars,sk_s,CT,T_{\mathbb {M},\rho }) \). The algorithm takes as input the public parameter pars, the server private key \(sk_s\), a ciphertext \(CT=(C,C',C'',E_1,E_2,E_3,E_4)\) on a keyword set \({{\varvec{W}}}\) and a trapdoor \(T_{\mathbb {M},\rho }\) associated with an access structure \(\mathbb {A}=(\mathbb {M},\rho ,\mathcal {T})\). If the keyword set \({{\varvec{W}}}\) does not satisfy \(\mathbb {A}\), output \(\perp \). Otherwise, if the keyword set \({{\varvec{W}}}\) satisfies \(\mathbb {A}\), the test algorithm first finds \(\mathcal {I}\subseteq [1,l]\) and constants \(\{\omega _i\}_{i\in \mathcal {I}}\in \mathbb {Z}_p\) such that \(\sum _{i\in \mathcal {I}} \omega _i \mathbb {M}_i=(1,0,...,0)\) and \(\mathrm {w}_{\rho (i)}=z_{\rho (i)}\) for \(\forall i\in \mathcal {I}\). The algorithm then does as follows:

    1. (1)

      Pre-processing step on the private key

      Let \(Q_i\) denote the set \([n] \setminus \{\rho (i)\}\) for each \(i\in \mathcal {I}\). Note that if \(j\in Q_{i}\), then \(j\ne \rho (i)\). Since for each \(i\in \mathcal {I}\), \(\mathrm {w}_{\rho (i)}=z_{\rho (i)}\), then we have

      $$\hat{T}_{i,1}=T_{i,1}\prod _{j\in Q_i}Q_{i,j}^\mathrm{{w}_j}=(\hat{h}{\prod _{j=1}^{n}}\hat{u}^\mathrm{{w}_j})^{-d_2t_{i,1}},$$
      $$\hat{T}_{i,2}=T_{i,2}\prod _{j\in Q_i}(Q'_{i,j})^\mathrm{{w}_j}=(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_1t_{i,1}},$$
      $$\hat{T}_{i,3}=T_{i,3}\prod _{j\in Q_i}(Q''_{i,j})^\mathrm{{w}_j}=(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_4t_{i,2}},$$
      $$\hat{T}_{i,4}=T_{i,4}\prod _{j\in Q_i}(Q'''_{i,j})^\mathrm{{w}_j}=(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_3t_{i,2}},$$
    2. (2)

      \(I_{\mathbb {M},\rho }\) is a set of minimum subsets satisfied \((\mathbb {M},\rho )\), it then checks whether there is an \(\mathcal {I}\in I_{\mathbb {M},\rho }\) statisfying

      $$\begin{aligned}&e(C',\prod _{i\in \mathcal {I}}D_i^{\omega _i})e(C'',\prod _{i\in \mathcal {I}}(\frac{R_i}{H_2(e(D,\hat{D})^{\kappa })})^{\omega _i})e(E_1,\prod _{i\in \mathcal {I}}(\hat{T}_{i,1})^{\omega _i})e(E_2,\prod _{i\in \mathcal {I}}(\hat{T}_{i,2})^{\omega _i})\\&\qquad \qquad \qquad \cdot e(E_3,\prod _{i\in \mathcal {I}}(\hat{T}_{i,3})^{\omega _i})e(E_4,\prod _{i\in \mathcal {I}}(\hat{T}_{i,4})^{\omega _i})\\&= e(g^{\mu },\prod _{i\in \mathcal {I}}(\hat{g}^{\lambda _i}\hat{\delta }^{d_1d_2t_{i,1}+d_3d_4t_{i,2}})^{\omega _i})\cdot e(\delta ^{-\mu }(h\prod _{i=1}^{n}u^\mathrm{{w}_i})^s,\prod _{i\in \mathcal {I}}(\hat{g}^{d_1d_2t_{i,1}+d_3d_4t_{i,2}})^{\omega _i})\\&\qquad \qquad e(g_1^{s-s_1},(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_2t_{i,1}w_i})e(g_2^{s_1},(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_1t_{i,1}w_i})\\&\qquad \qquad \cdot e(g_3^{s-s_2},(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_4t_{i,2}w_i})e(g_4^{s_2},(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_3t_{i,2}w_i})\\&= e(g^{\mu },\prod _{i\in \mathcal {I}}\hat{g}^{\lambda _i\omega _i})e(g^{\mu },\prod _{i\in \mathcal {I}}(\hat{\delta }^{d_1d_2t_{i,1}+d_3d_4t_{i,2}})^{\omega _i})\\&\qquad \qquad \cdot e({\delta }^{-\mu },\prod _{i\in \mathcal {I}}(\hat{g}^{d_1d_2t_{i,1}+d_3d_4t_{i,2}})^{\omega _i}){e((h\prod _{i=1}^{n}u^\mathrm{{w}_i})^s,\prod _{i\in \mathcal {I}}\hat{g}^{(d_1d_2t_{i,1}+ d_3d_4t_{i,2})\omega _i})}\\&\qquad e(g^{sd_1},(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_2t_{i,1}w_i})e(g^{-d_1s_1},(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_2t_{i,1}w_i})e(g^{d_2s_1},(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_1t_{i,1}w_i})\\&\qquad e(g^{sd_3},(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_4t_{i,2}w_i})e(g^{-d_3s_2},(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_4t_{i,2}w_i})e(g^{d_4s_2},(\hat{h}\prod _{j=1}^{n}\hat{u}^\mathrm{{w}_j})^{-d_3t_{i,2}w_i})\\&= e(g,\hat{g})^{\alpha \mu }=C \end{aligned}$$

4.2 Security Proof

Theorem 1

Under the asymmetric decisional DBDH assumption, the asymmetric decisional q-BDHE assumption and the asymmetric decisional linear assumption, our scheme is selectively indistinguishable against chosen keyword-set attacks (selectively IND-CKA).

Proof

The proof is divided into two parts, depending on the role of the adversary. In the first part, the adversary is assumed to be an outside attacker, and in the second part, the adversary is assumed to be the cloud sever who performs search operations. The proof details will be given in the full version of the paper due to space limit.

5 Comparison

To specifically highlight the contributions of our research work, we compare our scheme with three related works, namely [14, 28, 48]. Lai et al. [28] is an expressive searchable encryption protocol built in composite order group, while [14, 48] are expressive searchable encryption schemes with prime order group. Below, we compare the above schemes in terms of communication cost, computation cost, features and security. In [48], S is the number of the data user’s attributes, and N is the number of attributes that are involved in the data owner’s access control policy. Let |par|, |msk|, \(|T_{M,\rho }|\), |M| be the size of the public parameter, the master private key, the trapdoor and the access structure, respectively. We let \(|\mathbb {G}|\), \(|\mathbb {\hat{G}}|\), and \(|\mathbb {G}_T|\) denote the size of the element in \(\mathbb {G},\mathbb {\hat{G}},\mathbb {G}_T\), respectively. Let l be the number of keywords in an access structure, n be the maximum number of keywords in the system, and m be the size of a keyword set associated with a ciphertext. Denote E as an exponentiation operation, P as a pairing operation, \(x_1\) as the number of elements in \(I_{\mathbb {M}, \rho }=\{\mathcal {I}_1,...,\mathcal {I}_{x_1}\}\), \(x_2\) as \(|\mathcal {I}_1|+\cdots +|\mathcal {I}_{x_1}|\).

Table 1. Storage and communication overhead comparison

From Table 1, it can be seen that only [28] is built on composite order group, suffering from the heaviest communication cost (with linear cost in all metrics), while others are in prime order group. According to [22], prime order group have clear advantage in the parameter size over composite order group pairing-friendly elliptic curves. Although being constructed in prime order group, [14, 48] come at O(S) and O(m) price in ciphertext storage/communication. However, ours only requires constant value in the same metric. The reason behind the constant cost (in our construction) relies on the “aggregation” of ciphertext components, aggregating keyword set as a whole (much like some technique used in hierarchical IBE). We note that the size of trapdoor in our scheme is bound at O(nl). This seems as a trade-off between reducing the cost in ciphertext and (meanwhile) enlarging the size of trapdoor. However, we here state that increasing the size of trapdoor does bring efficiency in test phase. We will discuss this in the next paragraph.

Table 2. Computation cost comparison

Table 2 shows that our scheme only requires constant pairing cost (7P) in test phase, while others are restricted to linear pairing cost. In the same metric, the exponentiation cost of our scheme maintains the same magnitude as that of others. Except [48], pairing computation exists in the encryption phase of all other schemes. Compared to [28] (with composite order group), our scheme may enjoy around 50 times faster in pairings (if [28] equips a 1024-bit composite order elliptic curve) [19]. The decryption techniques used in [14, 48] drag down the efficiency of decryption. This is so because the pairings mainly depend on the size of attribute set, in particular, an attribute needs one pairing computation. However, we employ “fast decryption” technology and auxiliary components \(Q_{i,j}\) into our construction, so that the test algorithm are free of linear cost, namely, the efficiency of the test algorithm is not restricted to the size of attribute set.

Table 3. Property and security comparison

We show the feature and security comparison in Table 3. We use KGA to denote keyword guessing attacks. It is clear to see that our scheme supports any monotonic assess structure while others only provide AND and OR level of expressiveness. Enjoying more expressiveness, our scheme maintains the same security level with Cui et al.’s scheme [14]. Zheng et al. [48] opted to use an authenticated private channel to eliminate the keyword guessing attacks. However, it may not be scalable in practice. To enable publicly trapdoor delivery, [14] and our schemes slightly degrade the keyword privacy level to only allow a designated server to launch KGA. We state that our scheme is the first of its type, in the literature, to provide security and expressiveness simultaneously without significantly jeopardizing the efficiency. It is worthy of mentioning that the generation/computation cost of trapdoor (in our scheme) can be further off-loaded to the a trusted party holding the master private key (because of our sophisticated construction technique), so that system user can enjoy lighter computation complexity.

6 Conclusions

Attribute-based keyword search has attracted many attentions since it can support secure search over encrypted data with expressive access structure. Nevertheless, the size of ciphertexts but also the pairing cost (incurred in the test phase) are linear in the number of keyword. That is the main drawback of the most of the existing searchable encryption systems with ABE. To tackle the above opened problem, we propose a new KP-ABES scheme with outstanding features, namely expressive access structures, constant size ciphertext, and constant pairing cost (in search). There are some interesting open problems brought by this research work as well, for example, how to reduce the size of search trapdoor, and how to renew/provoke attribute.