1 Introduction

Typical Internet of Things (IoT) deployments include hardware technologies, sensing technologies (e.g., radio frequency identification and sensors), actuators, and other smart communication devices that are connected to the Internet. These technologies and equipment facilitate the extensive collection and exchange of information, files, and other real-time content that are shared between more and more smart terminals [1]. According to a report from the International Data Corporation , nearly 28 billion IoT devices will be installed by 2020, and the global economic impact of the IoT is estimated to be 2 trillion [2].

Given the proliferation of shared data and the expanding scale of the IoT, it is very worthwhile to increase throughput in such a large distributed network. The initial motivation of network coding was to improve the throughput of decentralized networks. In fact, this technology is considered to be a good approach to improve the distribution and sharing of digital content in peer-to-peer streaming networks and wireless ad hoc networks [3].

More specifically, unlike the traditional store-and-forward mechanism, in network coding, before the source node transmits a message (file) to the target node, it first divides the file into m packets, and then sends them to the neighboring nodes, thereby allowing the intermediate node (or router) to modify the received data packets and forward them. In linear network coding, the coding packets are regarded as vectors in linear space over some field. The intermediate node calculates the linear combination of these vectors by choosing random coefficients. If the target node receives a certain number of correct data packets, it can recover the original information with high probability. As this technology can optimize network throughput [4, 5], reduce energy consumption, and improve routing reliability [6], it is important to apply network coding technology in the IoT.

Because IoT devices typically interact with third-party applications, in an IoT system with network coding deployed, an important concern is preventing third-party applications from maliciously modifying data packets, that is, pollution attacks. Specifically, network coding allows nodes to mix data packets to make them more vulnerable to pollution attacks. Errors introduced in only one packet can propagate and generate more invalid packets, which causes them to flow to their destination. A simplified version of the network coding without pollution attack and under pollution attack can be seen in Fig. 1, in which S is the source node, I1, I2, I3, I4, I5 are the intermediate nodes, and D1, D2 are destination nodes. According to the Fig. 1b, the intermediate node I2 transmits a invalid packet v2′, which affects the outputs of I4 and I5. Thus, the adversary can prevents file reconstruction by maliciously modifying only a small number of packets or injecting invalid packets.

Fig. 1
figure 1

Network coding and pollution attack on network coding

To solve this problem, two main solutions have been proposed: information theoretic and cryptographic approaches. For the information theoretic approach, redundancy is introduced into the original package to recover the original files from malicious failures; however, the existing scheme can only passively tolerate the pollution attack at the destination. By contrast, the cryptographic approach does not restrict the adversary’s behavior, and the intermediate node can detect and discard invalid data packets in the process of transmission, which can effectively mitigate pollution attacks. Therefore, in recent years, cryptographic solutions have attracted the attention of many scholars. They are divided into public key methods (e.g., homomorphic signature [7,8,9,10]) and symmetric key methods (e.g., homomorphic MAC [11,12,13,14,15]). The public key method avoids the problem of key distribution and is very suitable for a network coding environment where senders send multiple files to multiple receivers. In this paper, we focus on homomorphic signatures in public key methods. The main idea is to provide an approach to verify valid vectors. As shown in Fig. 2, after the source node outputs a properly augmented basis and the signatures of the basis, the intermediate node will verify the validity of all the signatures received. If a vector fails to pass the verification, the intermediate node will discard the invalid vector, calculate the linear combination of the remaining valid vectors, and generate a valid signature for the linear combination without the signer’s secret key. Finally, the destination node will recover the original file from m linearly independent vectors.

Fig. 2
figure 2

The principle of homomorphic signature against pollution attack in network coding

In public key infrastructure, deployment costs are high and management certificates are tricky, particularly in resource-constrained environments, such as the IoT. To mitigate this issue, [16] introduced the concept of identity-based public key cryptography (ID-PKC). The concept is to use the user’s identity information (e.g., IP address, driver’s license number, and e-mail address) as a public key, and then the trusted key generation center (KGC) is responsible for generating private keys for users. Although ID-PKC simplifies certificate management, it introduces the problem of key escrow. Once the KGC is destroyed, the user’s private key will be completely disclosed, making it unsuitable for large-scale network environments.

Al Riyami et al. [28] proposed a certificateless public key cryptosystem (CL-PKC) in which the user’s private key is composed of some contributions of KGC, that is, the partial private key and a secret value chosen by the user. Thus, the CL-PKC eliminates the key escrow problem inherent in ID-PKC, while retaining the certificateless property. For different applications, many researchers have proposed encryption schemes [[29,30,31] and signature schemes [32,33,34,35] based on CL-PKC. However, to date, almost all proposed linearly homomorphic signature (LHS) schemes have been based on either public key infrastructure [17,18,19,20,21,22,23] or identity cryptography [24,25,26,27], no construction has been proposed to support both a homomorphic network coding signature and certificateless characteristic. Therefore, to fill this gap in the literature, in this paper we design a certificateless LHS (CL-LHS) scheme for network coding. We prove that our homomorphic signature scheme is unforgettable even in the presence of type I and type II adversaries.

1.1 Our contributions

We summarize our main contributions as follows:

  • We introduce the concept of a CL-LHS scheme for network coding, which addresses the issues of certificate management and key escrow while defending against pollution attacks.

  • We present a security model for CL-LHS to guarantee the functionality and security of the proposed CL-LHS scheme, which considers two types of adversaries (Type I adversary and Type II adversary) that are capable of forging two types of signatures (Type 1 forgery and Type 2 forgery).

  • We construct a concrete CL-LHS scheme and prove that the proposed scheme is secure against an adaptively chosen dataset attack in a random oracle model under the two types of adversaries.

  • Compared with related LHS schemes, our CL-LHS scheme has a smaller key size and a shorter signature length and has comparable computation costs. By making the LHS scheme certificateless, our CL-LHS scheme can be deployed and implemented in an IoT environment with limited computing power and storage space.

1.2 Related works

In network coding, intermediate nodes (or routers) are allowed to combine and retransmit received data packets, and the recipient can still obtain the original data. This technology can maximize network throughput and increase robustness. Aiming at addressing the problem that linear network coding is vulnerable to malicious node pollution attacks, a solution based on computational assumptions and cryptographic technology is considered. The main idea here is to provide a method to verify valid vectors by using the network coding signature scheme. These schemes can be constructed from homomorphic hash functions or homomorphic signatures (HSs). Krohn et al. introduced a homomorphic hash function [36] to construct a network coding signature scheme. The main disadvantage of this method is that the authentication information and public key that must be sent with the package are very large, which is not conducive to improving throughput. Using LHS to perform network coding authentication is a more effective method. The work of Boneh et al. [8] is a milestone in LHS. In fact, it is considered the first to provide a practical framework for such a scheme. Attrapadung and Libert [37] showed that the first LHS scheme was secure under the standard model. The earliest RSA-based HS scheme was proposed by Gennaro et al. in 2010 [38], and it is proven that the scheme is unforgeable against a weak adversary in the random oracle model. Boneh and Freeman [39] presents the first scheme that can resist quantum attack, and the hardness assumption exploited is k-SIS. The above schemes are certificate-based cryptosystems. For fine-grained access control, identity-based HS schemes were proposed [24,25,26,27], which were proven to be secure in a random oracle model. Previous schemes allowed linear functions to be computed over signed data, while the scheme in [40] can evaluate multivariate polynomials, and [41, 42] proposed fully HS schemes supporting arbitrary functions.

To further extend the utility of HS, multiple-key HS has recently received attention [20,21,22, 48]. Multiple-key support is necessary for datasets that involve inputs authenticated by different clients, for example, in a distributed network of sensors. Prior to [48], the concept of homomorphic authentication studied only supported executions of computations over data authenticated by a single user. In 2019, Schabhüser et al. proposed the first perfectly context-hiding multiple-key linearly homomorphic authenticator scheme [22]. Lai et al. proposed a generic construction of multiple-key HS with unforgeability under corruption [21].

HS with additional functions applied to specific scenarios is also a popular topic in current research. Quantum-based protocols are being used in homomorphic signature schemes to address quantum network environments. Shang et al. in 2015 treated entanglement swapping as a homomorphic operation and creatively proposed the first quantum HS scheme [43]. However, this scheme only allows one verifier to verify a signature once. To support repeatable verification for general scenarios, Shang et al. proposed a new quantum HS scheme with repeatable verification by using a serial verification model and parallel verification model [44]. Li et al., in 2019, proposed two quantum homomorphic message authentication schemes based on quantum circuits, which can resist pollution attacks initiated by untrusted inside nodes over a general quantum network [45]. In addition, the verifiably encrypted HS scheme proposed by Seo et al. [46] and homomorphic signcryption scheme proposed by Fan et al. [47] have been successfully applied to accumulable optimistic fair exchange and electronic voting, respectively (Table 1).

Table 1 Symbol description

1.3 Organization

The rest of this paper is organized as follows: In Section 2, we present preliminaries, including basic concepts of network coding and complex assumptions. In Section 3, we introduce the notion of the CL-LHS scheme for network coding and give its security model. We construct a concrete CL-LHS scheme and prove its security in Sections 4 and 5 respectively. We present the efficiency comparison in Section 6. Finally, we conclude the paper in Section 7.

2 Preliminaries

2.1 Linear network coding

In the network coding model, three stages are executed to complete the file transmission:

  • The file to be transferred is treated as a set of n-dimensional vectors \(\boldsymbol {\bar {v}}_{1}, \cdots , \boldsymbol {\bar {v}}_{m} \in \mathbb {F}^{n}_{p}\), where p is a prime number. Before transmission, the source node augments each of them as

    $$ \boldsymbol{v}_{i}=(\boldsymbol{\bar{v}}_{i}, \overbrace{\underbrace{0, \cdots, 0, 1}_{i}, 0, \cdots, 0}^{m} )\in \mathbb{F}^{n+m}_{p}, $$

    In this way, the vectors v1, ⋯ , vm form the basis of a subspace \( V \subset {\mathbb {F}^{n+m}_{p}}\), which is called a properly augmented basis.

  • Upon receiving the packets (i.e., vectors) \(\boldsymbol {{w}}_{1}, \cdots , \boldsymbol {{w}}_{l} \in \mathbb {F}^{n+m}_{p} \) on its incoming edges, an intermediate node computes a linear combination, namely, the vector \( \boldsymbol {w}={\sum }_{i=1}^{l}{c_{i}\boldsymbol {w}_{i}}\), for \(c_{i}\overset {\$}{\leftarrow }\mathbb {F}_{p} \). Then vector w is transmitted on its outgoing edges.

  • To recover the original file, a destination node (i.e., receiver) must receive m linearly independent vectors w1, ⋯ , wm of the form \(\boldsymbol {w}_{i}=(\boldsymbol {w}^{L}_{i}, \boldsymbol {w}^{R}_{i})\), where \(\boldsymbol {w}^{L}_{i} (\boldsymbol {w}^{R}_{i})\) denotes the left-most n (right-most m) positions of the vector. The receiver then computes an m × m matrix G such that

    $$ G=\begin{pmatrix} \boldsymbol{w}^{R}_{1} \\ {\vdots} \\ \boldsymbol{w}^{R}_{m} \end{pmatrix}^{-1} \\ $$

    Finally, the original file can be recovered by computing

    $$ \begin{pmatrix} \boldsymbol{\bar{v}}_{1} \\ {\vdots} \\ \boldsymbol{\bar{v}}_{m} \end{pmatrix}=G \cdot \begin{pmatrix} \boldsymbol{w}^{L}_{1} \\ {\vdots} \\ \boldsymbol{w}^{L}_{m} \end{pmatrix}. $$

2.2 Bilinear pairing

Let \((\mathbb {G}_{1}, \mathbb {G}_{2})\) be two cyclic groups with the same order p and let \( e:\mathbb {G}_{1}\times \mathbb {G}_{1}\rightarrow \mathbb {G}_{2} \) be a map; e is a bilinear pairing if it has the following three properties:

  1. (1)

    Bilinearity: For any a, b\( \mathbb {Z}^{*}_{p} \) and g, h\( \mathbb {G}_{1} \), e(ga, hb) = e(g, h)ab.

  2. (2)

    Non-degeneracy: There exist g, h\( \mathbb {G}_{1} \), such that e(g, h) ≠  1.

  3. (3)

    Computability: For any g, h\( \mathbb {G}_{1}\), there is an efficient algorithm to compute e(g, h).

2.3 Computational Bilinear Diffie-Hellman problem

Definition 1

(CDH Problem) Let \( e:\mathbb {G}_{1}\times \mathbb {G}_{1}\rightarrow \mathbb {G}_{2} \) be a bilinear pairing. Given (g, ga, gb), where \(a, b\in \mathbb {Z}^{*}_{p}\) are unknown numbers, compute the value of gab.

3 Definitions and security model

3.1 Certificateless linearly homomorphic signature

Definition 2

(Certificateless Linearly Homomorphic Signature Scheme) A certificateless linearly homomorphic signature scheme consists of a tuple of (probabilistic) polynomial-time algorithms (Setup, Extract-Partial-Private-Key, Set-Secret-Value, Set-Private-Key, Set-Public-Key, CL-HSign, CL-Combine, CL-Verify) with the following functionality:

  • Setup (1k, N, m): When a security parameter 1k and two integers N, m ≥ 1 are input, this algorithm outputs the system parameters params and a master key msk.

  • Extract-Partial-Private-Key (params, ID, msk ): This algorithm takes as input msk and a user’s identity ID and outputs the user’s partial private key DID.

  • Set-Secret-Value (params, ID): This algorithm takes as input params and a user’s identity ID and outputs the user’s secret value xID.

  • Set-Private-Key (params, DID, xID ): This algorithm takes as input params, the partial private key DID and the secret value xID, it generates a user’s full private key, referred to as SKID.

  • Set-Public-Key (params, xID ): This algorithm takes as input params and a secret value xID, and it generates the public key PKID of the identity ID.

  • CL-HSign (params, ID, SKID, τ, v): For the input params, a user’s identity ID, a full private key SKID, a file identifier τ ∈{0,1}k and a vector \(\boldsymbol {v} \in \mathbb {F}_{p}^{N}\), this algorithm outputs a signature σ.

  • CL-Combine (params, ID, PKID, τ, \(\{(c_{i}, \sigma _{i})\}_{i=1}^{l}\)): For the input params, a user’s identity ID, a public key PKID, a file identifier τ, and a set of tuples \(\{(c_{i}, \sigma _{i})\}_{i=1}^{l}\) with \( c_{i}\in \mathbb {F}_{p} \), where σi is a signature on the vector vi, this algorithm outputs a signature σ on the vector \(\boldsymbol {v}=\underset {i\in {[l]}}{\sum }{c_{i}\boldsymbol {v}_{i}}\).

  • CL-Verify (params, ID, PKID, τ, y, σ): For the input params, a user’s identity ID, a public key PKID, a file identifier τ, a vector \(\boldsymbol {y}\in \mathbb {F}_{p}^{N}\) and a signature σ, this algorithm returns either 1 (accept) or 0 (reject).

Setup and Extract-Partial-Private-Key are assumed to be run by the KGC. Once a partial private key generated by the KGC is given to a user via a secure channel, the user performs the Set-Secret-Value algorithm.

Correctness. We require that for each key pair (SKID, PKID) output by Setup, Set-Private-Key, Set-Public-Key, the following hold:

  1. (1)

    For all τ ∈{0, 1}k and \(\boldsymbol {v} \in \mathbb {F}_{p}^{N}\), if \(\sigma \leftarrow \) CI-HSign(ID, SKID, τ, v), then CL-Verify (ID, PKID, τ, v, σ)= 1.

  2. (2)

    For all τ and all sets of triples \(\{(c_{i}, \sigma _{i},\boldsymbol {v}_{i})\}_{i=1}^{l} \), if CL-Verify(ID, PKID, τ, vi, σi)= 1 for each i ∈ [l], then CL-Verify \((ID, PK_{ID}, \tau , {\sum }_{i=1}^{l}{c_{i}\boldsymbol {v}_{i}}\), CL-Combine \(\{ID, \tau , (c_{i},\sigma _{i})\}_{i=1}^{l})\)= 1.

3.2 System models

Figure 3 shows the system model of the certificateless linearly homomorphic signature scheme for network coding-enabled IoT environments, which consists of three parts: a key generation center (KGC), an IoT device and receivers. The KGC is responsible for the system setup and the calculation of partial private keys for each IoT device. The KGC generates system parameters and sends them to all entities, and partial private keys are sent to each entity through a secure channel. IoT devices with limited computing power and storage space are able to collect data from the physical world. To ensure data integrity and authenticity, each IoT device processes the collected raw data before signing it and then outputs the data, signature, and public key. In the process of file transmission, the certificateless linearly homomorphic signature scheme resists pollution attacks on the network coding-enabled network. Finally, the receivers recover the initial file.

Fig. 3
figure 3

System model of the CL-LHS for network coding-enabled IoT environments

3.3 Security models

In the security model of certificateless linearly homomorphic signatures, two types of adversaries with different capabilities are considered.

Type I adversary :

(\( \mathcal {A}_{\mathcal {I}} \)): This type of adversary cannot access the master key of the system but is allowed to replace the public key of any entity with a value of his choice because of the uncertified nature of the user’s public key.

Type II adversary :

(\( \mathcal {A}_{\mathcal {I}\mathcal {I}} \)): This type of adversary can access the system’s master key but cannot initiate public key replacement attacks.

The unforgeability of the certificateless linearly homomorphic signature scheme against adaptively chosen dataset attacks can be characterized by the following two games between challengers and adversaries (\( \mathcal {A}_{\mathcal {I}} \) and \( \mathcal {A}_{\mathcal {I}\mathcal {I}} \)).

For adversaries in Game 1, we make the following restrictions:

  1. (1)

    The adversary cannot extract the full private key for the challenge identity ID.

  2. (2)

    The challenge identity ID cannot be one that has been replaced with a public key and had a partial private key extracted.

Game 1

In this game, \( \mathcal {A}_{\mathcal {I}} \) interacts with the challenger C.

  • Setup: The challenger C runs Setup (1k, N, m) to generate the system parameters params and master key msk. It then gives params to \( \mathcal {A}_{\mathcal {I}}\) while keeping msk secret.

  • Queries: Adversary \( \mathcal {A}_{\mathcal {I}} \) performs the following oracle queries but is subject to the above restrictions.

    Partial Private Key Extraction: Given an identity ID, the challenger computes the partial private key DID and returns it to \( \mathcal {A}_{\mathcal {I}} \).

    Secret V alue Extraction: Given a user’s identity ID, the challenger returns the user’s secret value xID to \( \mathcal {A}_{\mathcal {I}} \).

    Public Key Queries: On receiving such a query with an ID, the challenger computes the corresponding public key PKID and returns it to \( \mathcal {A}_{\mathcal {I}} \).

    Replace Public Key: \( \mathcal {A}_{\mathcal {I}} \) may replace a public key with a value chosen by him.

    Signing Queries: \( \mathcal {A}_{\mathcal {I}} \) issues a sequence of queries adaptively for the vector subspaces \(V_{i} \subset \mathbb {F}^{N}_{p}\). For each i, the challenger chooses an identifier τi uniformly from {0,1}k and returns τi and \(\sigma _{i}\leftarrow \) CL-HSign(ID, SKID, τi, Vi) to \( \mathcal {A}_{\mathcal {I}} \).

  • Output:\( \mathcal {A}_{\mathcal {I}} \) outputs an identifier τ, a nonzero vector \(\boldsymbol {v}^{*} \in \mathbb {F}^{N}_{p}\), and a signature σ corresponding to a challenge identity ID and a public key \(PK_{ID^{*}}\).

    The adversary wins if CL-Verify (\(ID^{*}, PK_{ID^{*}}\), τ, v, σ) = 1 and one of the following conditions is met:

    1. (1)

      ττi for all τi that appear in the signing queries (Type 1 forgery).

    2. (2)

      τ = τi for some i but vVi (Type 2 forgery).

    The advantage of \( \mathcal {A}_{\mathcal {I}} \) winning Game 1 is denoted as \(Adv_{\mathcal {A}_{\mathcal {I}}}^{CL-LHS}(k)\).

Game 2

We set the semi-trusted KGC as the adversary in this game.

  • Setup: The challenger generates the system parameter params and master key msk, then returns params and msk to \( \mathcal {A}_{\mathcal {I}\mathcal {I}}\).

  • Queries: Adversary \( \mathcal {A}_{\mathcal {I}\mathcal {I}} \) initiates a sequence of queries adaptively for polynomial-many times, including Secret V alue Extraction, Public Key Queries and Signing Queries. The queries /response method is the same as that in Game 1, except that the adversary is \( \mathcal {A}_{\mathcal {I}\mathcal {I}} \).

  • Output: \( \mathcal {A}_{\mathcal {I}\mathcal {I}} \) outputs an identifier τ, a nonzero vector \(\boldsymbol {v}^{*} \in \mathbb {F}^{N}_{p}\), and a signature σ corresponding to a challenge identity ID.

    The adversary wins if CL-Verify (\(ID^{*}, PK_{ID^{*}}\), τ, v, σ) = 1, ID has not been issued as a secret value extraction, and one of the following conditions is met:

    1. (1)

      ττi for all τi that appear in the signing queries (Type 1 forgery).

    2. (2)

      τ = τi for some i, but vVi (Type 2 forgery).

    The advantage of \( \mathcal {A}_{\mathcal {I}\mathcal {I}} \) winning Game 2 is denoted as \( Adv_{\mathcal {A}_{\mathcal {I}\mathcal {I}}}^{CL-LHS}(k)\).

Definition 3 (Unforgeability)

We say that a certificateless linearly homomorphic signature scheme is unforgeable against adaptively chosen dataset attacks for polynomial-time adversaries \( \mathcal {A}_{i} \) if \( Adv_{\mathcal {A}_{i}}^{CL-LHS} (i=I, II)\) in the above games is negligible

4 Proposed CL-LHS scheme

In this section, we describe the proposed certificateless linearly homomorphic signature scheme, which is composed of eight polynomial time algorithms.

  • Setup: Taking as input a security parameter 1k and two positive integers N, m, the algorithm performs the following steps:

    1. (1)

      Select two cyclic groups \( \mathbb {G}_{1} \), \( \mathbb {G}_{2} \) of the same prime order p, and choose a bilinear pairing \( e:\mathbb {G}_{1}\times \mathbb {G}_{1}\rightarrow \mathbb {G}_{2} \).

    2. (2)

      Choose a generator \( g \in \mathbb {G}_{1} \), select a random number \( s\in \mathbb {F}_{p}^{*} \) as the master key msk, and set Ppub = gs.

    3. (3)

      Choose four different cryptographic hash functions H, H1, H2 and H3, each of which maps {0,1} to \( \mathbb {G}_{1} \). Then, publish the system parameters params \( =(k, \mathbb {G}_{1}, \mathbb {G}_{2}, e, p, g, H, H_{1}, H_{2}, H_{3} ) \), and keep msk secret.

  • Extract-Partial-Private-Key: Taking as inputs params, the master key s, and a user’s identity ID ∈{0,1}, the KGC executes the following steps:

    1. (1)

      Compute QID = H(ID).

    2. (2)

      Output the partial private key DID = (QID)s.

  • Set-Secret-Value: Taking as input an identity ID, this algorithm chooses \( x_{ID}\in {\mathbb {F}^{*}_{p}} \) at random and sets xID as the user’s secret value.

  • Set-Private-key: Taking as input params and a user’s identity ID, this algorithm outputs the user’s full private key SKID = (DID, xID).

  • Set-Public-Key: Taking as input params and a user’s secret value xID, this algorithm generates the user’s public key \( PK_{ID}=g^{x_{ID}}\).

  • CL-Hsign: Assume that an initially empty list L has stored all previously returned identifiers τ with the related information (r, U) defined below. Taking as input a signer’s private key SKID = (DID, xID) and identity ID, an identifier τ ∈{0,1}k, and a vector \( \boldsymbol {v}=(v_{1}, \cdots , v_{N}) \in \mathbb {F}^{N}_{p}\), the algorithm responds as follows:

    1. (1)

      If τ appears in L, the algorithm recovers the associated (r, U) from L.

    2. (2)

      Otherwise, it selects \( r\in \mathbb {F}^{*}_{p}\) randomly, sets U = gr and stores (r, U) into L.

    Then, the algorithm computes

    $$ \begin{array}{@{}rcl@{}} T_{i}&=&H_{1} (ID, P_{pub}, \tau, U, i),\\ T^{\prime}_{i}&=&H_{2} (ID, PK_{ID}, \tau, i),\\ T&=&H_{3} (ID, PK_{ID}) , \\ W&=&(D_{ID})^{\underset{i\in{[N]}}{\sum}v_{i}}\left( \underset{i\in{[N]}}{\prod}T_{i}^{v_{i}}\right)^{r}\left( \underset{i\in{[N]}}{{\prod}}{T_{i}^{\prime}}^{v_{i}}\cdot T^{\underset{i\in{[N]}}{\sum}v_{i}}\right)^{x_{ID}} \end{array} $$

    and outputs the signature σ = (U, W).

  • CL-Combine: Given an identity ID and corresponding public key \( PK_{ID}=g^{x_{ID}} \), an identifier τ, and \(\{(c_{i}, \sigma _{i})\}_{i=1}^{l} \) with \( c_{i}\in {\mathbb {F}_{p}} \), where σi = (U, Wi), this algorithm computes \(W=\underset {i\in {[l]}}{\prod }W_{i}^{c_{i}} \) and outputs (U, W).

  • CL-Verify: Given an identity ID and corresponding public key \( PK_{ID}=g^{x_{ID}} \), an identifier τ, a signature σ = (U, W), and a vector \( \boldsymbol {v}=(v_{1}, \cdots , v_{N}) \in \mathbb {F}^{N}_{p}\), the algorithm accepts the signature if the following equation holds:

    $$ e(W, g)=e(Q_{ID}^{\underset{i\in{[N]}}{\sum}v_{i}}, P_{pub})\cdot e\left( \underset{i\in{[N]}}{\prod}T_{i}^{v_{i}}, U)\cdot e(\underset{i\in{[N]}}{\prod}{T_{i}^{\prime}}^{v_{i}} \cdot T^{\underset{i\in{[N]}}{\sum}v_{i}}, PK_{ID}\right) $$

Correctness Given an identity ID and public key PKID, an identifier τ, a vector \( \boldsymbol {v}=(v_{1}, \cdots , v_{N}) \in \mathbb {F}^{N}_{p}\), and a signature σ, if \(\sigma \leftarrow \)CL-HSign(ID, SKID, τ, v), the correctness of the scheme can be obtained by the following equation:

$$ \begin{array}{@{}rcl@{}} e(W, g)&=&e(D_{ID}^{\underset{i\in{[N]}}{\sum}v_{i}}, g)\cdot e\left( {}\left( \underset{i\in{[N]}}{\prod}T_{i}^{v_{i}}\right)^{r}, g\right)\cdot e\left( {}\left( \underset{i\in{[N]}}{\prod}{T_{i}^{\prime}}^{v_{i}} \cdot T^{\underset{i\in{[N]}}{\sum}v_{i}}\right)^{x_{ID}}, g\right) \\ ~~~ &=& e(Q_{ID}^{\underset{i\in{[N]}}{\sum}v_{i}}, P_{pub})\cdot e\left( \underset{i\in{[N]}}{\prod}T_{i}^{v_{i}}, U\right)\cdot e\left( \underset{i\in{[N]}}{\prod}{T_{i}^{\prime}}^{v_{i}} \cdot T^{\underset{i\in{[N]}}{\sum}v_{i}}, PK_{ID}\right) \end{array} $$

Thus, the verification algorithm on the original signature σ is correct.

Furthermore, given an identity ID and public key PKID, an identifier τ and a set of triples \(\{(c_{i}, \sigma _{i}, \boldsymbol {v}_{i})\}_{i=1}^{l} \), where σi = (U, Wi) and vi = (vi1, ⋯ , viN), if \(\sigma _{i} \leftarrow \) CL-HSign(ID, SKID, τ, vi), we need to prove that \( \sigma =(U, W=\underset {i\in {[l]}}{\prod } W_{i}^{c_{i}}) \) is a signature on \( \boldsymbol {y}=(y_{1 }, \cdots , y_{N})=\sum \limits _{i\in {[l]}}c_{i}\boldsymbol {v}_{i} \). By the correctness of each signature, we have

$$ e(W_{i}, g)= e\left( Q_{ID}^{\underset{j\in{[N]}}{\sum}v_{ij}}, P_{pub}\right)\cdot e\left( \underset{j\in{[N]}}{\prod}T_{j}^{v_{ij}}, U\right)\cdot e\left( \underset{j\in{[N]}}{\prod}{T_{j}^{\prime}}^{v_{ij}} \cdot T^{\underset{j\in{[N]}}{\sum}v_{ij}}, PK_{ID}\right) $$

Thus, by the bilinear property, we have

$$ \begin{array}{@{}rcl@{}} e(W, g)&=&\underset{i\in{[l]}}{\prod}e(W_{i}, g)^{{c_{i}}}\\ &=& e\left( Q_{ID}^{\underset{i\in{[l]}}{\sum}\underset{j\in{[N]}}{\sum}c_{i}v_{ij}}, P_{pub}\right)\cdot e\left( \underset{j\in{[N]}}{\prod}T_{j}^{\underset{i\in{[l]}}{\sum}c_{i}v_{ij}}, U\right)\\ &&\cdot e\left( \underset{j\in{[N]}}{\prod}{T_{j}^{\prime}}^{\underset{i\in{[l]}}{\sum}c_{i}v_{ij}} \cdot T^{\underset{i\in{[l]}}{\sum}\underset{j\in{[N]}}{\sum}c_{i}v_{ij}}, PK_{ID}\right)\\ &=&e(Q_{ID}, P_{pub})^{\underset{j\in{[N]}}{\sum}y_{j}}\cdot e\left( \underset{j\in{[N]}}{\prod}T_{j}^{y_{j}}, U{}\right)\cdot e\left( \underset{j\in{[N]}}{\prod}{T_{j}^{\prime}}^{y_{j}} \cdot T^{\underset{j\in{[N]}}{\sum}y_{j}}, PK_{ID}{}\right). \end{array} $$

Therefore, the verification algorithm on a combined signature σ is correct.

5 Security analysis

In this subsection, we analyze the security of the proposed scheme.

Theorem 1

Our certificateless linearly homomorphic signature scheme is unforgeable in the random oracle model assuming that the CDH problem in \( \mathbb {G}_{1} \) is infeasible.

This Theorem 1 is derived from the following two lemmas, with Definition 3.

Lemma 1

For any polynomial-time adversary \( \mathcal {A}_{\mathcal {I}} \), our certificateless linearly homomorphic signature scheme is unforgeable in the random oracle model assuming that the CDH problem in \( \mathbb {G}_{1} \) is infeasible.

Proof Idea::

during the adversary’s query process, the challenger assigns ga and gb in the random challenge of the CDH problem to some items, so the signature contains the term gab. In order to ensure that the unknown quantity gab does not affect the challenger’s response to the signing queries, the challenger carefully sets the hash values Ti(i ∈ [N]) and U value in the signing queries, so that the item that bring in the special Ti and U values can eliminate the one containing gab, while in the output phase, e(gab, g) can be retained. If an adversary outputs a valid forgery, the value gab can be solved from the verification algorithm equation by using the non-degeneracy of bilinear pairs. Moreover, it is proved that the probability of aborting simulation is negligible and the simulation process is complete.

Proof

Assume that \( \mathcal {A}_{\mathcal {I}} \) represents a third-party attack against the unforgeability of our CL-LHS scheme. We construct a simulator \(\mathcal C \) that uses \( \mathcal {A}_{\mathcal {I}} \) as a subroutine to solve the CDH problem. According to the definition of Game 1, adversary \( \mathcal {A}_{\mathcal {I}} \) eventually outputs either a Type 1 forgery or a Type 2 forgery. \( \mathcal {C} \) guesses the type of forgery that will be output by \( \mathcal {A}_{\mathcal {I}} \) based on the result of flipping a coin randomly. Clearly, \( \mathcal {C} \) guesses correctly with a probability of \( \frac {1}{2} \).

Case 1 (Type 1 forgery:):

In this case, \( \mathcal {C} \) has guessed that \( \mathcal {A}_{\mathcal {I}} \) will output a Type 1 forgery. Given a random challenge \( (\mathbb {G}_{1}, \mathbb {G}_{2}, e, p, g, g^{a}, g^{b}) \) of the CDH problem, the goal of \( \mathcal {C} \) is to compute the value of gab. \(\mathcal {C}\) interacts with \( \mathcal {A}_{\mathcal {I}} \) as follows:

  • Setup: \( \mathcal {C} \) runs Setup and sets Ppub = ga and params= \( (\mathbb {G}_{1}, \mathbb {G}_{2}, e, p, g, P_{pub}=g^{a}) \). It invokes \( \mathcal {A}_{\mathcal {I}} \) on the input params.

  • Queries: \( \mathcal {A}_{\mathcal {I}} \) can issue queries to the following oracles simulated by \( \mathcal {C} \).

    • H Queries: Suppose that \( \mathcal {A}_{\mathcal {I}} \) makes at most qH H queries. A list is maintained by \( \mathcal {C} \), referred to as LH. \( \mathcal {C} \) randomly chooses k ∈{1,2, ⋯ , qH} and guesses that the k-th identity IDk of the queries initiated by \( \mathcal {A}_{\mathcal {I}} \) is the challenge identity. When \( \mathcal {A}_{\mathcal {I}} \) sends an H query on identity ID, \( \mathcal {C} \) responds as follows:

    1. (1)

      If this is the k-th query, e.g., ID = IDk, output Q(ID) = H(ID) = gb and add < ID, gb,⊥> to LH.

    2. (2)

      Otherwise, choose a random number \( w_{ID}\in { \mathbb {F}^{*}_{p}} \), output \( Q(ID)=H(ID)=g^{w_{ID}} \) and add < ID, H(ID), wID > to LH.

    • Partial Private Key Extraction: \( \mathcal {C} \) maintains a list Lpart that is initially empty. When \( \mathcal {A}_{\mathcal {I}} \) asks for the partial private key of identity ID, if ID = IDk, \( \mathcal {C} \) aborts. Otherwise, it recovers the tuple < ID, H(ID), wID > from LH and returns the partial private key \( D_{ID}=(g^{a})^{w_{ID}}\) to \( \mathcal {A}_{\mathcal {I}} \). Then, it stores < ID, DID > in the list Lpart.

    • Public Key Queries: \( \mathcal {C} \) maintains a list LPK that is initially empty. Given an identity ID, \( \mathcal {C} \) randomly chooses \( x_{ID}\in {\mathbb {F}^{*}_{p}} \) as the secret value. Then, \( \mathcal {C} \) returns the public key \( PK_{ID}= g^{x_{ID}} \) to \( \mathcal {A}_{\mathcal {I}} \) and saves < ID, PKID, xID > in LPK.

    • Private Key Extraction: \( \mathcal {C} \) maintains a list LSK that is initially empty. Given an identity ID, \( \mathcal {C} \) performs the following actions:

    1. (1)

      If IDIDk, it recovers the tuple < ID, H(ID), wID > from LH and < ID, PKID, xID > from LPK. Then, \( \mathcal {C} \) returns the secret key \( SK_{ID}=((g^{a})^{w_{ID}}, x_{ID}) \) to \( \mathcal {A}_{\mathcal {I}} \) and adds < ID, SKID > to LSK.

    2. (2)

      Otherwise, \( \mathcal {C} \) aborts.

    • Replace Public Key: Suppose \( \mathcal {A}_{\mathcal {I}} \) sends a query with the input \( (ID, PK^{\prime }_{ID}) \). If the list LPK contains a tuple < ID, PKID, xID >, \( \mathcal {C} \) sets \( PK_{ID}=PK^{\prime }_{ID} \) and xID =⊥.

    • H1 Queries: Suppose (ID, Ppub, τ, U, i) is submitted to oracle H1(⋅). \( \mathcal {C} \) first scans for < (ID, Ppub, τ, U, i), Ti, ti > in the list \( L_{H_{1}} \) to check whether Ti has already been defined. If so, it returns the previously defined value. Otherwise, \( \mathcal {C} \) randomly chooses a number \( t_{i}\in { \mathbb {F}^{*}_{p}} \), returns \( T_{i}=g^{t_{i}} \) to \( \mathcal {A}_{\mathcal {I}} \) as the hash value of H1(ID, Ppub, τ, U, i), and stores the value in the list \( L_{H_{1}} \).

    • H2 Queries: Suppose (ID, PKID, τ, i) is submitted to oracle H2(⋅). \( \mathcal {C} \) first scans for \( <(ID, PK_{ID}, \tau , i), T^{\prime }_{i}, t^{\prime }_{i}> \) in the list \( L_{H_{2}} \) to check whether \( T^{\prime }_{i} \) has already been defined. If so, it returns the previously defined value. Otherwise, \( \mathcal {C} \) randomly chooses a number \( t^{\prime }_{i}\in {\mathbb {F}^{*}_{p}} \), returns \( T^{\prime }_{i}=g^{t^{\prime }_{i}} \) to \( \mathcal {A}_{\mathcal {I}} \) as the hash value of H2(ID, PKID, τ, i), and stores the value into the list \( L_{H_{2}} \).

    • H3 Queries: \( \mathcal {C} \) maintains a list \(L_{H_{3}} \) containing tuples < (ID, PKID), T, t >. Upon receiving \( \mathcal {A}_{\mathcal {I}} \)’s query on (ID, PKID), if it already exists in \(L_{H_{3}} \), \( \mathcal {C} \) returns T. Otherwise, \( \mathcal {C} \) chooses at random a number \( t\in {\mathbb {F}^{*}_{p}} \), returns T = gt to \( \mathcal {A}_{\mathcal {I}} \) as the hash value of H3(ID, PKID), and stores the value into the list \( L_{H_{3}} \).

    • Signing Queries: Given an identity ID, a vector space \( V\subset \mathbb {F}^{N}_{p} \) is described by augmented basis vectors \( \boldsymbol {v}_{1}, \cdots , \boldsymbol {v}_{m} \in \mathbb {F}^{N}_{p}\), where vi = (vi1, ⋯ , vin, \(\underbrace { 0,\cdots , 1}_{i} , \cdots , 0) \).

    If ID is the challenge identity, i.e., ID = IDk, \( \mathcal {C} \) performs the following steps:

    1. (1)

      Randomly choose an identifier \( \tau \leftarrow \{0, 1\}^{k} \) and numbers \( r, u_{i}\in {\mathbb {F}^{*}_{p}} (i\in [N])\), and set \( U=P^{r}_{pub}=(g^{a})^{r} \).

    2. (2)

      Define the hash values of H1(ID, Ppub, τ, U, i) as \( T_{i}=(\frac {g^{u_{i}}}{Q_{ID}} )^{r^{-1}} \in {\mathbb {G}_{1}}\). \( \mathcal {C} \) aborts if H1(ID, Ppub, τ, U, i) has already been queried for some i ∈ [N].

    3. (3)

      Recover \( T^{\prime }_{i} (i\in [N]) \) and T from \( L_{H_{2}} \) and \( L_{H_{3}} \), respectively; if there are no such items, \( \mathcal {C} \) makes queries to oracles H2(⋅) and H3(⋅).

    4. (4)

      Finally, \( \mathcal {C} \) computes

      $$ \begin{array}{@{}rcl@{}} W_{i} =(P_{pub})^{\underset{j\in{[N]}}{\sum}u_{j}v_{ij}}\cdot (PK_{ID})^{\underset{j\in{[N]}}{\sum}t^{\prime}_{j}v_{ij}+t\underset{j\in{[N]}}{\sum}v_{ij}} \end{array} $$

    Now σi = (U, Wi)(i ∈ [m]) are returned to \( \mathcal {A}_{\mathcal {I}} \); σi is a valid signature, since

    $$ \begin{array}{@{}rcl@{}} &&\underline{e(Q_{ID}, P_{pub})^{\underset{j\in{[N]}}{\sum}v_{ij}}}_{1}\cdot \underline{e\left( \underset{j\in{[N]}}{\prod}T_{j}^{v_{ij}}, U\right)}_{2}\cdot e\left( \underset{j\in{[N]}}{\prod}{T_{j}^{\prime}}^{v_{ij}} \cdot T^{\underset{j\in{[N]}}{\sum}v_{ij}}, PK_{ID}\right) \end{array} $$
    (1)
    $$ \begin{array}{@{}rcl@{}} &=&\underline{e(Q_{ID}, P_{pub})^{\underset{j\in{[N]}}{\sum}v_{ij}}}_{1}\cdot \underline{e\left( \underset{j\in{[N]}}{\prod}(\frac{g^{u_{j}}}{Q_{ID}} )^{r^{-1}{v_{ij}}}, P^{r}_{pub}\right)}_{2'}\cdot e\left( g^{\underset{j\in{[N]}}{\sum}t^{\prime}_{j}v_{ij} + t\underset{j\in{[N]}}{\sum}v_{ij}}, PK_{ID}\right) \end{array} $$
    (2)
    $$ \begin{array}{@{}rcl@{}} &=& \underline{e(Q_{ID}, P_{pub})^{\underset{j\in{[N]}}{\sum}v_{ij}}}_{1}\cdot \underline{e(Q_{ID}, P_{pub})^{-\!\underset{j\in{[N]}}{\sum}v_{ij}}}_{2'_{(1)}}\cdot \underline{e(g^{\underset{j\in{[N]}}{\sum}u_{j}v_{ij}}, P_{pub})}_{2'_{(2)}}\\ &&\cdot e\left( g^{\underset{j\in{[N]}}{\sum}t^{\prime}_{j}v_{ij} + \underset{j\in{[N]}}{\sum}tv_{ij}}, PK_{ID}\right) \end{array} $$
    (3)
    $$ \begin{array}{@{}rcl@{}} &=& e\left( (P_{pub})^{\underset{j\in{[N]}}{\sum}u_{j}v_{ij}}\cdot (PK_{ID})^{\underset{j\in{[N]}}{\sum}t^{\prime}_{j}v_{ij} + \underset{j\in{[N]}}{\sum}tv_{ij}}, g\right) \end{array} $$
    (4)
    $$ \begin{array}{@{}rcl@{}} &=& e (W_{i}, g) \end{array} $$
    (5)

    The derivation process of the core part of the above series of equations is shown in note.Footnote 1

    Otherwise,Footnote 2\( \mathcal {C} \) randomly chooses an identifier \( \tau \leftarrow \{0, 1\}^{k} \) and a number \( r\in {\mathbb {F}^{*}_{p}}\), sets U = gr, and computes

    $$ W_{i} =(g^{a})^{w_{ID}\underset{j\in{[N]}}{\sum}v_{ij}}\cdot U^{\underset{j\in{[N]}}{\sum}t_{j}v_{ij}}\cdot (PK_{ID})^{\underset{j\in{[N]}}{\sum}t^{\prime}_{j}v_{ij} + \underset{j\in{[N]}}{\sum}tv_{ij}} $$

    Now, σi = (U, Wi)(i ∈ [m]) are returned to \( \mathcal {A}_{\mathcal {I}} \); σi is a valid signature, since

    $$ \begin{array}{@{}rcl@{}} &&e (W_{i}, g)\\ &=&e(Q_{ID}, P_{pub})^{\underset{j\in{[N]}}{\sum}v_{ij}}\cdot e{}\left( \underset{j\in{[N]}}{\prod}T_{j}^{v_{ij}}, U\right)\cdot e\left( \underset{j\in{[N]}}{\prod}{T_{j}^{\prime}}^{v_{ij}} \cdot T^{\underset{j\in{[N]}}{\sum}v_{ij}}, PK_{ID}\right)\\ &=&e(g^{w_{ID}}, g^{a})^{\underset{j\in{[N]}}{\sum}v_{ij}}\cdot e(U^{\underset{j\in{[N]}}{\sum}t_{j}v_{ij}}, g)\cdot e\left( (PK_{ID})^{\underset{j\in{[N]}}{\sum}t^{\prime}_{j}v_{ij} + t\underset{j\in{[N]}}{\sum}v_{ij}}, g\right)\\ &=& e\left( (g^{a})^{w_{ID}\underset{j\in{[N]}}{\sum}v_{ij}}\cdot U^{\underset{j\in{[N]}}{\sum}t_{j}v_{ij}}\cdot (PK_{ID})^{\underset{j\in{[N]}}{\sum}t^{\prime}_{j}v_{ij} + \underset{j\in{[N]}}{\sum}tv_{ij}}, g\right)\\ \end{array} $$
  • Output: Eventually, \( \mathcal {A}_{\mathcal {I}} \) outputs a tuple \( (ID^{*}, PK_{ID^{*}}\), v, τ, σ), where \( \boldsymbol {v}^{*}=(v^{*}_{1}, \cdots , v^{*}_{N} ) \) and σ = (U, W). If IDIDk, then \( \mathcal {C} \) aborts. Otherwise, for each i ∈ [N], it retrieves the items \( T^{*}_{i}\) from \( L_{H_{1}} \), the items \( T^{\prime *}_{i} \) from \( L_{H_{2}} \), and the item T from \( L_{H_{3}} \); if there are no such items, \( \mathcal {C} \) makes queries to the corresponding oracle. If \( \mathcal {A}_{\mathcal {I}} \) successfully outputs Type 1 forgery signatures, the file identifier ττi for all τi that appear in signing queries, and note that \( T^{*}_{i} =g^{t^{*}_{i}}\),Footnote 3\( T^{\prime *}_{i} =g^{t^{\prime *}_{i}}, T^{*} =g^{t^{*}}\), then the following equation holds:

    $$ \begin{array}{@{}rcl@{}} &&e (W^{*},g)\\ &=&e(Q_{ID^{*}}, P_{pub})^{\underset{i\in{[N]}}{\sum}v^{*}_{i}}\cdot e\left( \underset{i\in{[N]}}{\prod}(T^{*}_{i})^{v^{*}_{i}}, U^{*}\right)\cdot e\left( \underset{i\in{[N]}}{\prod}{(T^{\prime*}_{i}})^{v^{*}_{i}} \cdot (T^{*})^{\underset{i\in{[N]}}{\sum}v^{*}_{i}}, PK_{ID^{*}}\right)\\ &=&e(Q_{ID^{*}}, P_{pub})^{\underset{i\in{[N]}}{\sum}v^{*}_{i}}\cdot e\left( (U^{*})^{\underset{i\in{[N]}}{\sum}t^{*}_{i}v^{*}_{i}}, g)\cdot e((PK_{ID^{*}})^{\underset{i\in{[N]}}{\sum}t^{\prime*}_{i}v^{*}_{i}+t^{*}\underset{i\in{[N]}}{\sum}v^{*}_{i}}, g\right). \end{array} $$

    Therefore, we have the following equation:

    $$ \begin{array}{@{}rcl@{}} &&e\left( \frac{W^{*}}{(U^{*})^{\underset{i\in{[N]}}{\sum}t^{*}_{i}v^{*}_{i}}\cdot (PK_{ID^{*}})^{\underset{i\in{[N]}}{\sum}t^{\prime*}_{i}v^{*}_{i}+t^{*}\underset{i\in{[N]}}{\sum}v^{*}_{i}} }, g\right)\\ & =&e(Q_{ID^{*}}, P_{pub})^{\underset{i\in{[N]}}{\sum}v^{*}_{i}}\\ & =&e(g^{b}, g^{a})^{\underset{i\in{[N]}}{\sum}v^{*}_{i}}\\ &=&e(g, g)^{ab\cdot\underset{i\in{[N]}}{\sum}v^{*}_{i}} \end{array} $$

    Thus, by the nondegenerate property, the value of gab is the following expression:

    $$ \left( \frac{W^{*}}{(U^{*})^{\underset{i\in{[N]}}{\sum}t^{*}_{i}v^{*}_{i}}\cdot\! (PK_{ID^{*}})^{\underset{i\in{[N]}}{\sum}t^{\prime*}_{i}v^{*}_{i}+\!\underset{i\in{[N]}}{\sum}t^{*}v^{*}_{i}} }\right)^{\frac{1}{\underset{i\in{[N]}}{\sum}v^{*}_{i}}}. $$

Now, we evaluate \( \mathcal {C} \)’s probability of success.

We first analyze the probability of aborts in handling a signing query. The probability of the event that \( \mathcal {C} \) responds to two distinct signature queries by choosing the same identifier τ is at most \( \frac {{q_{s}^{2}}}{2^{k}}\), while the probability of the event that \( \mathcal {A}_{\mathcal {I}} \) has already requested the value of H1(ID, Ppub, τ, U, i) for some i is at most \( \frac {q_{H_{1}}\cdot q_{s}}{2^{k}}\).

Then, we can readily check that the probability of not aborting in key extraction queries and in the output stage is \( (1-\frac {1}{q_{H}})^{q_{part}} \) and \( \frac {1}{q_{H}} \), respectively, where qs, qH, qpart are the numbers of signing queries, H hash queries and partial private key extractions performed by \(\mathcal {A}_{\mathcal {I}} \).

Therefore, if \(\mathcal {A}_{\mathcal {I}} \) has an advantage \(Adv_{\mathcal {A}_{\mathcal {I}}}^{CL-LHS}(k)\) in forging a signature in Game 1, then \( \mathcal {C} \) solves the CDH problem with probability

$$ \left( \frac{1}{2}Adv_{\mathcal{A}_{\mathcal{I}}}^{CL-LHS}(k)-\frac{{q_{s}^{2}}+q_{H_{1}}\cdot q_{s}}{2^{k}}\right)\cdot\left( 1-\frac{1}{q_{H}}\right)^{q_{part}}\cdot \frac{1}{q_{H}}. $$
Case 2 (type 2 forgery:):

In this case, \( \mathcal {C} \) has guessed that \( \mathcal {A}_{\mathcal {I}} \) will output a type 2 forgery. Given a CDH instance, \( (\mathbb {G}_{1}, \mathbb {G}_{2}, e, p, g, g^{a}, g^{b}) \), the goal of \( \mathcal {C} \) is to compute the value of gab by using \( \mathcal {A}_{\mathcal {I}} \) as a subroutine. \( \mathcal {C} \) interacts with \( \mathcal {A}_{\mathcal {I}} \) as follows:

  • Setup: \( \mathcal {C} \) chooses a random number \( s\in { \mathbb {F}^{*}_{p}} \) as the master key and sets Ppub = gs and params= \( (\mathbb {G}_{1}, \mathbb {G}_{2}, e, p, g, P_{pub}=g^{s}) \). It invokes \( \mathcal {A}_{\mathcal {I}} \) on input params.

  • Queries: \( \mathcal {C} \) simulates the oracle queries of \( \mathcal {A}_{\mathcal {I}} \) as follows:

    • H Queries: \( \mathcal {C} \) maintains a list LH that is initially empty. Given an identity ID, \( \mathcal {C} \) chooses a random number \( w_{ID}\in { \mathbb {F}^{*}_{p}} \), computes \( Q_{ID}=g^{w_{ID}} \) as the value of H(ID), returns it to \( \mathcal {A}_{\mathcal {I}} \), and adds < ID, H(ID), wID > to LH.

    • H1(H2, H3) Queries: Same as in Case 1.

    • Partial Private Key Extraction: Given an identity ID, \( \mathcal {C} \) retrieves the tuple < ID, H(ID), wID > from LH and returns the partial private key DID = H(ID)s to \( \mathcal {A}_{\mathcal {I}} \). Then, it stores < ID, DID > in a list Lpart which is initially empty.

    • Private Key Extraction: \( \mathcal {C} \) maintains a list LSK of tuples < ID, SKID >. Given an identity ID, \( \mathcal {C} \) recovers the tuple < ID, DID > from Lpart and chooses a random number \( x_{ID}\in { \mathbb {F}^{*}_{p}} \) as the secret value. Then, it returns the secret key SKID = (DID, xID) to \( \mathcal {A}_{\mathcal {I}} \) and adds < ID, SKID > to LSK.

    • Public Key Queries: \( \mathcal {C} \) maintains a list LPK of tuples < ID, PKID >. Given an identity ID, \( \mathcal {C} \) recovers the tuple < ID, SKID > from LSK, returns the public key \( PK_{ID}= g^{x_{ID}} \) to \( \mathcal {A}_{\mathcal {I}} \) and saves < ID, PKID > in LPK.

    • Replace Public Key: Suppose \( \mathcal {A}_{\mathcal {I}} \) sends a query with the input \( (ID, PK^{\prime }_{ID}) \). If the list LPK contains the tuple < ID, PKID >, \( \mathcal {C} \) sets \( PK_{ID}=PK^{\prime }_{ID} \).

    • Signing Queries: Given an identity ID and a vector space \( V\subset \mathbb {F}^{N}_{p} \) described by augmented basis vectors \( \boldsymbol {v}_{1}, \cdots , \boldsymbol {v}_{m} \in \mathbb {F}^{N}_{p}\), where \( \boldsymbol {\boldsymbol {v}}_{i}=(v_{i1}, \cdots , v_{in}, \underbrace { 0,\cdots , 1}_{i} , \cdots , 0) \), \( \mathcal {C} \) preforms the following steps:

    1. (1)

      Randomly choose an identifier \( \tau \leftarrow \{0, 1\}^{k} \) and numbers \( r, \alpha _{1}, \cdots , \alpha _{n}\in {\mathbb {F}^{*}_{p}}\), and set U = (gb)r.

    2. (2)

      Set n = Nm, and for each i ∈ [n], compute

      $$ \begin{array}{@{}rcl@{}} T_{i}=H_{1} (ID, P_{pub}, \tau, U, i)=(g^{a} )^{\alpha_{i}} \end{array} $$

      for each i ∈ [m], compute

      $$ \begin{array}{@{}rcl@{}} &&\beta_{i}=-\underset{j\in{[n]}}{\sum}\alpha_{j}v_{ij},\\ &&T_{n+i}=H_{1} (ID, P_{pub}, \tau, U, n+i)=(g^{a} )^{\beta_{i}}, \end{array} $$

      and set α = (α1, ⋯ , αn, β1, ⋯ , βm). Now observe that we constructed α so that αV (i.e., αv = 0, for all vV = Span{v1, ⋯ , vm}).Footnote 4\(\mathcal {C} \) aborts if H1(ID, Ppub, τ, U, i) has already been queried for some i ∈ [N].

    3. (3)

      Recover \( T^{\prime }_{i} \), T and SKID from \( L_{H_{2}} \), \( L_{H_{3}} \) and LSK, respectively; if there are no such items, \( \mathcal {C} \) makes queries on the corresponding oracle.

    4. (4)

      Finally, compute

      $$ W_{i} =(D_{ID})^{\underset{j\in{[N]}}{\sum}v_{ij}}\cdot (PK_{ID})^{\underset{j\in{[N]}}{\sum}t^{\prime}_{j}v_{ij}+t\underset{j\in{[N]}}{\sum}v_{ij}} $$

    Now, σi = (U, Wi)(i ∈ [m]) are returned to \( \mathcal {A}_{\mathcal {I}} \); we show that σi is a valid signature, since

    The core part of the derivation process of the above series of equations is is shown in note.Footnote 5 The first equation above comes from U = (gb)r and the definition of the signature of the proposed scheme, the second equation comes from the introduction of specific expressions of \( T_{j}, T^{\prime }_{j} \) and T. After rearrangement, the third equation is obtained. The last equation holds since we constructed α such that αv = 0 for all vV. Hence, the signatures output by \( \mathcal {C} \) in step (4) are valid signatures.

  • Output: Eventually, \( \mathcal {A}_{\mathcal {I}} \) outputs \( ID^{*}, PK_{ID^{*}}\), an identifier τ, a nonzero vector y = (y1, ⋯ , yN) and signatures \( \sigma _{i}^{*}=(U^{*}, W^{*}_{i}), i\in {[m]}\).

    If \( \mathcal {A}_{\mathcal {I}} \) successfully outputs Type 2 forgery signatures, then τ has been used to respond to a vector subspace V in a signature query, but yV, so it is known that U = (gb)r, \( T^{*}_{i}=H_{1} (ID^{*}, P_{pub}, \tau ^{*}, U^{*}, i)=(g^{a} )^{\alpha _{i}} (i\in {[n]})\), \( T^{*}_{n+i}=H_{1} (ID^{*}, P_{pub}, \tau ^{*}, U^{*}, n+i)=(g^{a})^{\beta {i}} (i\in {[m]}) \), and Verify \((ID^{*}, PK_{ID^{*}}, \tau ^{*}, \boldsymbol {y}, \sigma ^{*})=1 \). \( \mathcal {C} \) recovers \( T^{\prime *}_{i} \) from the list \( L_{H_{2}} \), T from the list \( L_{H_{3}} \) and \( D_{ID^{*}} \) from the list Lpart, note that \( T^{\prime *}_{i}=g^{t^{\prime *}_{i}} \), \( T^{*}=g^{t^{*} }\); then, the following equation holds:

    $$ \begin{array}{@{}rcl@{}} &&e\left( \underset{i\in{[m]}}{\prod}{(W^{*}_{i})}^{y_{n+i}}, g\right)\\ &=&e\left( Q_{ID^{*}}^{\underset{i\in{[N]}}{\sum}y_{i}}, P_{pub}\right)\cdot e\left( \underset{i\in{[N]}}{\prod}{(T^{*}_{i})}^{y_{i}}, U^{*}\right)\cdot e\left( \underset{i\in{[N]}}{\prod}{(T^{\prime*}_{i})}^{y_{i}} \cdot {(T^{*})}^{\underset{i\in{[N]}}{\sum}y_{i}}, PK_{ID^{*}}\right)\\ &=&e\left( (D_{ID^{*}})^{\underset{i\in{[N]}}{\sum}y_{i}}, g\right)\cdot e\left( (g^{ab})^{(\boldsymbol {\alpha}\cdot \boldsymbol {y})r}, g\right)\cdot e\left( (PK_{ID^{*}})^{\underset{i\in{[N]}}{\sum}t^{\prime*}_{i}y_{i}+t^{*}\underset{i\in{[N]}}{\sum}y_{i}}, g\right) \end{array} $$

    The first equation above comes from the verification algorithm of the proposed scheme, and the second equation comes from the concrete expression brought into \( T^{*}_{i}, T^{\prime *}_{i}, T^{*} \) and U.

    Therefore, by the nondegenerate property, we have

    $$ \underset{i\in{[m]}}{\prod}{(W^{*}_{i})}^{y_{n+i}}=(D_{ID^{*}})^{\underset{i\in{[N]}}{\sum}y_{i}} \cdot (g^{ab})^{(\boldsymbol {\alpha}\cdot \boldsymbol {y})r}\cdot (PK_{ID^{*}})^{\underset{i\in{[N]}}{\sum}t^{\prime*}_{i}y_{i}+t^{*}\underset{i\in{[N]}}{\sum}y_{i}} $$

    If αy ≠  0, then \( \mathcal {C} \) can compute the value of gab as follows:

    $$ \begin{array}{@{}rcl@{}} \left( \frac{\underset{i\in{[m]}}{\prod}{(W^{*}_{i})}^{y_{n+i}}}{(D_{ID^{*}})^{\underset{i\in{[N]}}{\sum}y_{i}} \cdot (PK_{ID^{*}})^{\underset{i\in{[N]}}{\sum}t^{\prime*}_{i}y_{i} + \underset{i\in{[N]}}{\sum}t^{*}y_{i}}}\right)^{\frac{1}{(\boldsymbol {\alpha}\cdot \boldsymbol {y})r}} \end{array} $$

Now, we evaluate \( \mathcal {C} \)’s probability of success.

As in the case of \( \mathcal {A}_{\mathcal {I}} \) forging a Type 1 signature, the probability of \( \mathcal {C} \) aborting the signing query is at most \( \frac {{q_{s}^{2}}+q_{H_{1}}\cdot q_{s}}{2^{k}} \).

Since \( \mathcal {A}_{\mathcal {I}} \) outputs a Type 2 forgery, yV. Note that αi(i ∈ [n]) are independently and uniformly selected in \( \mathbb {F}^{*}_{p} \); then, α = (α1, ⋯ , αn, β1, ⋯ , βm) is uniformly random in V. Therefore, for any yV, αy is uniform in \(\mathbb {F}^{*}_{p} \), and we find that αy = 0 with probability \( \frac {1}{p} \).

Therefore, if \(\mathcal {A}_{\mathcal {I}} \) has an advantage \(Adv_{\mathcal {A}_{\mathcal {I}}}^{CL-LHS}(k)\) in forging a signature in Game 1, then \( \mathcal {C} \) can solve the CDH problem with probability

$$ \begin{array}{@{}rcl@{}} \left( \frac{1}{2}Adv_{\mathcal{A}_{\mathcal{I}}}^{CL-LHS}(k)-\frac{{q_{s}^{2}}+q_{H_{1}}\cdot q_{s}}{2^{k}}\right)\cdot\left( 1-\frac{1}{p}\right). \end{array} $$

Lemma 2

For any polynomial-time adversary \( \mathcal {A}_{\mathcal {I}\mathcal {I}} \), our certificateless linearly homomorphic signature scheme is unforgeable in the random oracle model assuming that the CDH problem in \( \mathbb {G}_{1} \) is infeasible.

The proof of Lemma 2 is similar with that of Lemma 1. The difference between the proof of Lemmas 2 and 1 is that the positions of embedding hard problem are different. We omit the proof here for simplicity and show the proof of Lemma 2 in Appendix.

6 Application in IoT environments and performance comparison

6.1 System model of authentication computing using CL-LHS in an IoT environment

In the IoT environment, homomorphic signatures can be used not only to protect applications based on network coding but also to perform the authentication calculation of the linear function of signed data. Although the IoT provides great convenience for production and life, the storage and calculation of massive data is still a major challenge due to limited computing power and storage resources. In recent years, cloud computing technology has developed rapidly, and some common cloud service products have been released and received wide attention. Because of its convenience and rapidity, an increasing number of users choose to upload their data to the cloud server and compute their own data. Of course, the correctness of server computing is a major issue. As the most natural application of homomorphic signatures, server computing can ensure that “correct data” is “correctly operated on” and that “correct results” are obtained in the system assuming that there are some untrusted parties (such as cloud data processors). Suppose the user wants to perform a large computation, but she does not have such a powerful resource. Then, she can use her secret key to sign a large data set and then distribute the signed data to an untrusted cloud server to calculate the data. The cloud server then derives the signature on the calculated results homomorphically. This signature can prove that the data processor outputs the correct calculation result. As shown in Fig. 4, the system model of authentication computing using certificateless linearly homomorphic signatures in the IoT environment consists of three components: a key generation center (KGC), data cloud server and IoT device.

  • KGC: The KGC is responsible for generating system parameters and calculating partial private keys for each IoT device. Then, these partial private keys are sent to each entity through a secure channel, and system parameters are sent to all entities through a public channel.

  • IoT device: The KGC generates a unique partial private key for each registered IoT device equipped with sensors. To ensure the integrity and authenticity of the data, each IoT device uses the system parameters and private key to sign the collected original data separately. Then, the IoT device sends the message, corresponding signature and public key to the cloud server.

  • Cloud server: The cloud server has powerful computing power and storage space to verify the validity of all received signatures. If they are valid, homomorphic signatures are used for various calculations on the data, which can be completed through minimal interaction and communication, including the calculation results and corresponding short signatures sent from the server to the IoT devices.

Fig. 4
figure 4

System model of authentication computing using CL-LHS in an IoT environment

6.2 Performance analysis

In this subsection, we mainly carry out the performance analysis. Table 2 compares the performance of our CL-LHS scheme with related schemes in the literature, e.g., [19, 23, 24, 26, 27] under random oracles in terms of private key size, signature length, verification cost, and security. Since references [26, 27] both used identity-based signature as module to design identity-based linearly homomorphic signature schemes. Therefore, in the efficiency analysis, we instantiate the module with the identity-based signature schemes proposed by reference [49, 50]. For convenience, the resulting schemes are denoted as schemes [26]1, [27]1 and schemes [26]2, [27]2, respectively. In addition, literature [19] used a general signature scheme as a module to design a linearly homomorphic signature scheme, so we use the BLS short signature to instantiate the module, and record the obtained scheme as [19]1. The SkSize and SigSize columns show the size of the private key and signature, respectively. The verify column presents the computational costs of the algorithms Verify. Column Type I, II lists whether the scheme can resist public key replacement attacks and malicious-but-passive KGC attacks. The CL-PKC (ID-PKC) columns denote whether a scheme is based on a certificateless cryptosystem (identity-based cryptosystem). The Hardness columns denote the hardness assumption on which the security of the scheme depends. Let |p|, \( \vert \mathbb {G}_{1}\vert \) and \( \vert \mathbb {G}_{2}\vert \) represent the lengths of elements in \(\mathbb {F}_{p}\), \( \mathbb {G}_{1} \) and \(\mathbb {G}_{2} \), respectively.

Table 2 A comparison of performance and security

Note that the length of the private key affects the storage capacity of IoT devices, and the signature length affects the storage capacity and the communication capability of the IoT device. In addition, the computational cost of the algorithm Verify affects the computing power of both IoT devices and cloud servers. According to Table 2, the size of the private key of our scheme is shorter than that of the instantiated schemes [26]1, [26]2, [27]1, [27]2, and is the same as those of the schemes in [23] and [24]. The size of the signature of our scheme is shorter than those of the instantiated schemes of [19, 26, 27] and slightly larger than those of the schemes in [23, 24]. The verification algorithm of our scheme needs four bilinear pairs, which is roughly the same as is needed for the schemes in [23] and the instantiated schemes of [19, 26, 27]. However, our scheme addresses the issues of certificate management and key escrow and thus provides higher security.

In order to provide numerical results, we implement the proposed CL-LHS scheme and four related schemes, namely [19]1, [23] and [26]1, [27]1, where [19]1 and [23] are certificate-based schemes, while [26, 27] are certificateless schemes. Our implementation was run on a laptop with a 3.10-GHz Intel i5 CPU, 64 GB memory, and the Ubuntu Linux operating system. We chose the Type A curve in the PBC library [51]. The pairing operation is based on the curve y2 = x3 + x over the field \( \mathbb {F}_{p} \). The security levels are chosen to be |p| = 512 bits.

Because IoT devices must secretly store their private keys, a small-sized private key is applicable in IoT devices with limited storage capacity. According to Fig. 5, the size of the private key in our CL-LHS scheme is 148 bits, which is the same as that in [23] and [24], and is 57.8% of that in [26]1 and [27]1.

Fig. 5
figure 5

A comparison of the private key size

Due to the limited battery power and communication bandwidth of IoT devices, signature size is the key factor affecting communication costs, so one of the tasks of our CL-LHS scheme is to reduce the communication overhead of devices in the IoT. As shown in Fig. 6, the signature size of our CL-LHS scheme is 256 bits, compared with [19]1, [26]1 and [27]1, the signature size of our proposed scheme is reduced by 33.35%, 36.63% and 51.87%, respectively. Although the signature size of our CL-LHS scheme is larger than that of the schemes in [24] and [23], the literature [24] lacks the security proof for the identity-based homomorphic signature scheme proposed, and [23] is faced with a thorny certificate management issue. Hence, the proposed CL-LHS scheme has a lower communication overhead.

Fig. 6
figure 6

A comparison of the communication cost

We compare the private key extraction cost of our scheme with the only three ID-LHS schemes based on bilinear pairing. As shown in Fig. 7, our extraction algorithm is faster than that of schemes [26]1 and [27]1 and slower than that of [24], but the scheme [24] lacks security proof. Figures 8 and 9 show the running time of signature generation and verification algorithms of the schemes. The x-axis is the dimension of the vector to be signed, and the y-axis is the time required by the corresponding algorithm. Overall, our CL-LHS scheme is less computationally efficient than but still comparable with the four related schemes, but it eliminates the problems of certificate management and key escrow, provides stronger security guarantees and better protects the privacy of users.

Fig. 7
figure 7

A comparison of the Extract cost

Fig. 8
figure 8

A comparison of the signature generation cost

Fig. 9
figure 9

A comparison of the signature verification cost

7 Conclusions

We constructed the first CL-LHS for network coding, which not only supports the authentication calculation of the linear function of the signed data to effectively mitigate pollution attacks in network coding but also solves the problems of certificate management and key escrow. To summarize, the scheme combines the properties of LHS and a certificateless signature. We proved that the scheme is secure against an adaptively chosen dataset attack under the random oracle model, even in the presence of type 1 and type 2 adversaries. Furthermore, compared to related schemes, our CL-LHS scheme has a smaller key size and a shorter signature length, and has comparable computation cost.

This work presents some interesting possibilities for future study. Since our scheme is unforgeable against adaptively chosen dataset attacks, it would be interesting to construct a CL-LHS scheme that is secure in a stronger security model that allows fully adaptive queries at the message level. As the CL-LHS scheme provides the credentials of the results calculated by a given function on a dataset, which are calculated by untrusted parties (e.g., the cloud), CL-LHS is very suitable for application in the cloud computing environment, such as in a smart grid, an e-voting system, or electronic health records. Proposing such applications is also the goal of our future work.