1 Introduction

An election scheme should be secure and robust enough to resist a variety of malicious attacks and to protect sensitive information. Transparency and comprehensibility are also vital; otherwise, the election result is not easily accepted by the public. Several works have tried to address these issues [1, 2]. Election schemes are usually divided into traditional elections and electronic elections. The former needs a trusted election device, an authorized organizer, and complicated mechanisms. In detail, the paper ballots, the handling of the ballot boxes, and the counting process in a paper-based voting scheme must be trusted by all participants. Moreover, a traditional election consumes time and resources: (1) the cost is high, since paper ballots are expensive and many workers must be hired; (2) it is inconvenient, since each voter must cast his ballot in a voting booth [3, 4]. In addition, the ballot boxes may be lost, manipulated, or destroyed.

To overcome these difficulties, electronic elections have been studied in recent years; they differ from traditional elections in many ways. Cryptographic techniques are an important tool in designing e-voting schemes. The first e-voting scheme was proposed by Chaum [5], and many researchers have since focused on this area [6,7,8,9,10,11,12]. The necessary security properties include: (1) confidentiality: nobody except the voter himself can learn the content of his ballot; (2) non-cheating: no one can cheat others during the voting process; (3) anonymity: a voter’s ballot should not be linkable to his identity; (4) verifiability: every voter can verify whether his ballot is correctly tallied; (5) coercion-resistance: a voter cannot prove the content of his ballot to others, which prevents vote selling; (6) authentication: each voter must be identified as a legal participant and can cast a ballot only once.

The development of e-voting can be divided into two stages. (1) The security requirements are guaranteed using complicated cryptographic techniques such as mix-nets, homomorphic encryption, and blind signatures. The related protocols, which achieve the security requirements using different methods, are listed in Table 1. These protocols are computationally secure under the assumption that the adversary has bounded computing power. (2) The protocols are unconditionally secure, i.e., they remain secure even if the adversary has unbounded computing power [34,35,36].

Table 1 Related works

In this paper, an unconditionally secure e-voting scheme based on secret sharing and k-anonymity is proposed. It not only ensures that the ballots are correctly tallied, but also allows each voter to verify the correctness of the result without learning others’ information. Moreover, the proposed protocol is efficient.

Note that the original idea was presented at a conference [37]. In the current version, a more detailed description is added to make the scheme easier to understand, for example, the security assumptions and the design goals. In particular, an example is used to illustrate the proposed scheme, and the analysis shows that the claimed goals are achieved.

The rest of this paper is organized as follows. In Section 2, some preliminaries are introduced, and the system model is presented in Section 3. In Section 4, the proposed voting protocol is presented together with an example to make it easier to follow. Finally, the analysis and the conclusion are presented in Section 5 and Section 6, respectively.

2 Preliminaries

The following cryptographic concepts are necessary for understanding the proposed scheme.

2.1 Shamir’s (t,n) secret sharing scheme

In 1979, secret sharing schemes were introduced in [38, 39]. A mutually trusted dealer D divides a secret s into n shares and securely distributes them among n shareholders. Any t or more shares can recover the secret s easily, whereas fewer than t shares reveal nothing about s. There is a vast literature on secret sharing schemes [40,41,42,43].

Shamir’s (t,n) secret sharing scheme is unconditionally secure, i.e., it does not rely on any computationally hard problem. It involves n shareholders {P1, P2, ⋯, Pn} and a dealer D, and consists of two algorithms over a finite field Fp, where p is a secure prime.

Algorithm 1 Shares generation

Input: n shareholders {P1, P2, ⋯, Pn}, the dealer D, and the secret s.

Output: the shares y1, y2, ⋯, yn.

1: D picks \( f(x)=s+{a}_1x+{a}_2{x}^2+\cdots +{a}_{t-1}{x}^{t-1} \), ai ∈ Fp (i = 1, ⋯, t − 1);

2: D computes yi = f(xi), (i = 1, 2, ⋯, n), where xi is the identification of Pi;

3: D sends yi to Pi via a secure channel, (i = 1, 2, ⋯, n).
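The dealer's side of Algorithm 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `make_shares`, the use of the `secrets` module as the randomness source, and the toy parameters (p = 29, t = 3, identifiers 1 to 5) are all chosen for the example.

```python
import secrets

def make_shares(secret, t, ids, p):
    """Dealer side of a (t, n) Shamir scheme over F_p (illustrative sketch).

    ids are the public identifiers x_i, assumed distinct and non-zero mod p.
    """
    # f(x) = s + a_1 x + ... + a_{t-1} x^{t-1}, with random a_i in F_p
    coeffs = [secret % p] + [secrets.randbelow(p) for _ in range(t - 1)]
    shares = {}
    for x in ids:
        y = 0
        for a in reversed(coeffs):  # Horner's rule, reduced mod p at each step
            y = (y * x + a) % p
        shares[x] = y               # y_i = f(x_i), to be sent to P_i securely
    return coeffs, shares

# Toy parameters (assumptions): secret 7, threshold 3, five shareholders, p = 29.
coeffs, shares = make_shares(secret=7, t=3, ids=[1, 2, 3, 4, 5], p=29)
print(shares)
```

The coefficients are returned only so that the example can be inspected; a real dealer would discard them after distributing the shares.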

 

Algorithm 2 Secret reconstruction

Any t or more shareholders can recover the secret s.

Given t shares (x1, f(x1)), ⋯, (xt, f(xt)), s is recovered by computing \( s=f(0)=\sum \limits_{i=1}^tf\left({x}_i\right)\prod \limits_{v=1,v\ne i}^t\frac{-{x}_v}{x_i-{x}_v}\operatorname{mod}p \).
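The reconstruction formula is Lagrange interpolation evaluated at x = 0. The following sketch shows one way to compute it; the function name `reconstruct` and the toy polynomial f(x) = 7 + 3x + 5x² over F29 are assumptions made for illustration only.

```python
def reconstruct(points, p):
    """Recover f(0) from t points (x_i, y_i) by Lagrange interpolation mod p."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for v, (xv, _) in enumerate(points):
            if v != i:
                num = (num * -xv) % p          # numerator:   product of (-x_v)
                den = (den * (xi - xv)) % p    # denominator: product of (x_i - x_v)
        # pow(den, -1, p) is the modular inverse (Python 3.8+)
        secret = (secret + yi * num * pow(den, -1, p)) % p
    return secret

# Toy polynomial (assumption): f(x) = 7 + 3x + 5x^2 over F_29, so t = 3.
p = 29
f = lambda x: (7 + 3 * x + 5 * x * x) % p
points = [(x, f(x)) for x in (1, 2, 4)]
print(reconstruct(points, p))  # 7
```

Any other three distinct evaluation points give the same value, which is the threshold property stated above.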

 

2.2 Secret sharing homomorphism

Secret sharing homomorphism was introduced by Benaloh [44]. Assume that two secrets s1 and s2 are shared by the polynomials f(x) and g(x), respectively. Then f(i) + g(i), (1 ≤ i ≤ n) can be regarded as the shares of s1 + s2 held by the n shareholders, and any t of them can recover s1 + s2.
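The additive property can be checked numerically. In the sketch below, the two polynomials, the prime 29, and the helper names `evaluate` and `recover_constant` are assumptions chosen for illustration; each shareholder adds its two shares locally, and any three of the summed shares recover s1 + s2.

```python
P = 29  # small prime, chosen only for illustration

def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficient list at x, mod P."""
    return sum(a * pow(x, e, P) for e, a in enumerate(coeffs)) % P

def recover_constant(points):
    """Lagrange interpolation at 0 over F_P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for v, (xv, _) in enumerate(points):
            if v != i:
                term = term * (-xv) * pow((xi - xv) % P, -1, P) % P
        total = (total + term) % P
    return total

f = [4, 2, 9]    # shares s1 = 4 (degree 2, so t = 3)
g = [10, 5, 1]   # shares s2 = 10
ids = [1, 2, 3, 4, 5]

# Each shareholder i adds its two shares; no secret is reconstructed on the way.
summed = [(i, (evaluate(f, i) + evaluate(g, i)) % P) for i in ids]

print(recover_constant(summed[:3]))  # 14
print(recover_constant(summed[2:]))  # 14
```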

2.3 k-anonymity

k-anonymity means that any element of a set can be identified with probability no greater than 1/k, i.e., for any element, there are at least k − 1 other indistinguishable elements in the set [45]. For example, let T(A1, A2, ⋯, An) be a table with n attributes (A1, A2, ⋯, An). If each sequence of values over a set of attributes appears at least k times [46], T is k-anonymous.
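The table-based definition can be tested mechanically. The sketch below is an illustration only: the function name `is_k_anonymous`, the attribute names, and the four rows are hypothetical and do not come from the scheme.

```python
from collections import Counter

def is_k_anonymous(rows, attributes, k):
    """True if every combination of values over `attributes` occurs at least k times."""
    counts = Counter(tuple(row[a] for a in attributes) for row in rows)
    return all(c >= k for c in counts.values())

# Hypothetical table with three attributes.
table = [
    {"age": "20-29", "district": "A", "vote_time": "am"},
    {"age": "20-29", "district": "A", "vote_time": "pm"},
    {"age": "30-39", "district": "B", "vote_time": "am"},
    {"age": "30-39", "district": "B", "vote_time": "am"},
]

print(is_k_anonymous(table, ["age", "district"], 2))  # True
print(is_k_anonymous(table, ["age", "district"], 3))  # False
```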

In this voting scheme, the receipts of k random voters are generated and published together, so that nobody can distinguish an individual one [47], which guarantees k-anonymity.

3 The system model

In this section, the system model, the security requirements, and the design goals are introduced.

3.1 System model

The main participants include a trusted authority center (AC), a voting system (VS), the voters (V), the candidates (C), and a bulletin board, which is the information publishing platform. Their functions are described as follows:

  • AC: AC authorizes each legal voter to cast a ballot no more than once, arbitrates disputes, and issues a digital certificate to each participant.

  • VS: VS generates the credential for V and leaks nothing about the voters’ intentions.

  • V: V selects the favorite candidate and gets the credential using VS.

  • C: The candidates collaborate to tally the ballots and obtain the result with the help of VS.

The communication model is shown in Figure 1. An authorized voter V casts his ballot using the voting system (VS), and VS generates the corresponding credential for V, divides the voter’s masked intention data into m pieces d1, ⋯, dm, and sends dj to Cj, (j = 1, ⋯, m).

Figure 1 The communication model

3.2 Trust assumption

To ensure practicality, the following trust assumptions are made:

  • VS is assumed to function correctly and not to be infected by computer viruses.

  • V is not assumed to be honest; he may try to sell his vote by proving the content of his ballot.

  • C is not assumed to be honest.

  • An adversary can obtain the data transmitted between VS and the candidates over the communication channel and can launch attacks to destroy data integrity.

3.3 Design goal

The following properties should be achieved: efficiency, unconditional security, universal verifiability, and coercion-resistance.

  • Unconditional security: Even if the adversary has unbounded computing power, he cannot infer any information about a voter’s ballot.

  • Universal verifiability: Each voter can verify that his ballot is counted, and each candidate can verify whether the result is correct.

  • Coercion-resistance: A voter cannot prove to others which candidate he has voted for.

  • Efficiency: Both the communication overhead and the computation cost should be low.

When the result is tallied, inside adversaries (also called “cheaters”) can deceive honest shareholders by altering their shares. Many schemes [48,49,50] have been proposed to address cheater detection and identification. For instance, Xu et al. [48] assume that fewer than one third of the shareholders are cheaters; increasing the number of shares then makes cheater detection and identification possible. In this paper, the case in which a malicious shareholder alters his share is not considered; it is left for future work.

4 The proposed e-voting scheme

The proposed voting protocol consists of a Pre-voting Phase, a Voting Phase, and a Post-voting Phase, and all computations are over Fp, where p is a secure prime. Before the Pre-voting Phase, AC publishes p and the anonymity parameter k.

4.1 Pre-voting phase

Assume that there are n voters V1, ⋯, Vn and m candidates C1, ⋯, Cm, and that the n voters are divided into several sets, each consisting of k voters.

4.2 Voting phase

Each voter votes for his favorite candidates and receives a credential in a secure manner, such as face to face; the credential can be used to verify whether the ballot is counted. The voting phase proceeds as follows.

  • Step1. When Vi, (i = 1, ⋯, n) registers with AC, VS issues a temporary ID to Vi; nobody knows the relationship between Vi and the temporary ID;

  • Step2. If the candidate Cj, (j = 1, ⋯, m) is selected, ai, j = 1; otherwise ai, j = 0. Then, VS generates a polynomial \( {f}_i(x)={a}_{i,0}+{a}_{i,1}x+{a}_{i,2}{x}^2+\cdots +{a}_{i,m}{x}^m\operatorname{mod}p \), where ai, 0 is a non-zero random number;

  • Step3. VS computes m + 2 shares (xj, yi, j), (j = 1, ⋯, m + 2), where xj, (j = 1, ⋯, m) is the identification of Cj, (j = 1, 2, ⋯, m), xm + 1, xm + 2 are the identifications of VS, Vi respectively, and yi, j = fi(xj). Then VS distributes (xj, yi, j) to Cj, (j = 1, ⋯, m), (xm + 1, yi, m + 1) is stored in VS, and Vi gets the credential CRi = {ai, 0, xm + 2, yi, m + 2}.
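Steps 2 and 3 of the voting phase can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the `secrets` randomness source, and the identifiers (2, 5, 8, 9 for the candidates, 11 for VS, 13 for the voter) are assumptions made for the example, with p = 29.

```python
import secrets

def poly_eval(coeffs, x, p):
    """Evaluate a_0 + a_1 x + ... + a_m x^m mod p."""
    return sum(a * pow(x, e, p) for e, a in enumerate(coeffs)) % p

def cast_ballot(choices, candidate_ids, vs_id, voter_id, p):
    """choices[j-1] is a_{i,j}: 1 if candidate C_j is selected, else 0."""
    a0 = 1 + secrets.randbelow(p - 1)        # non-zero random number a_{i,0}
    coeffs = [a0] + list(choices)            # coefficients of f_i(x)
    # m shares for the candidates, one kept by VS, one on the voter's credential
    candidate_shares = [(x, poly_eval(coeffs, x, p)) for x in candidate_ids]
    vs_share = (vs_id, poly_eval(coeffs, vs_id, p))
    credential = (a0, voter_id, poly_eval(coeffs, voter_id, p))
    return candidate_shares, vs_share, credential

# Hypothetical voter who selects C2 among four candidates.
shares, vs_share, credential = cast_ballot([0, 1, 0, 0], [2, 5, 8, 9], 11, 13, 29)
print(shares, vs_share, credential)
```

In this sketch the coefficient list exists only inside `cast_ballot`, which mirrors the later statement that VS destroys the polynomial after distributing the shares.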

4.3 Post-voting phase

In the Post-voting Phase, VS and all candidates reconstruct the aggregated polynomial and tally the result.

  • Step1. VS randomly divides the voters into sets of k voters;

  • Step2. The temporary IDs of the k voters in a set, Vi, (i = 1, ⋯, k), are published, and these voters publish their ai, 0, (i = 1, ⋯, k) on the bulletin board;

  • Step3. Cj and VS compute \( {y}_j=\sum \limits_{i=1}^k{y}_{i,j} \) and publish the points (xj, yj), (j = 1, 2, ⋯, m + 1), and each participant recovers \( F(x)={a}_0+{a}_1x+{a}_2{x}^2+\cdots +{a}_m{x}^m\operatorname{mod}p \), where \( {a}_j=\sum \limits_{i=1}^k{a}_{i,j},j=0,1,\cdots, m \). Then, VS publishes the aggregated ballots {a0, a1, a2, ⋯, am} of the k voters on the bulletin board;

  • Step4. If the sum of the published ai, 0, (i = 1, 2, ⋯, k) does not equal a0, VS and all candidates are asked to check their published information and to reconstruct the polynomial again;

  • Step5. Everyone computes the result of Cj as votej =  ∑ aj, (j = 1, 2, ⋯, m), where the sum is taken over all sets of voters.
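The aggregation and recovery in Steps 3 and 4 can be sketched as below. The code is an illustration under assumptions: the three ballots are invented, the identifiers 2, 5, 8, 9, 11 and p = 29 are toy values, and solving the Vandermonde system by Gauss-Jordan elimination is just one possible way to interpolate all coefficients. The recovery is only meaningful while every per-candidate count stays below p.

```python
P = 29
XS = [2, 5, 8, 9, 11]  # assumed identifiers of C1..C4 and VS (m + 1 = 5 points)

def poly_eval(coeffs, x):
    return sum(a * pow(x, e, P) for e, a in enumerate(coeffs)) % P

def solve_mod(A, b, p):
    """Solve A x = b over F_p by Gauss-Jordan elimination (A square, invertible)."""
    n = len(A)
    M = [[v % p for v in row] + [rhs % p] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, p)
        M[col] = [v * inv % p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                factor = M[r][col]
                M[r] = [(vr - factor * vc) % p for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

def tally_group(ballots):
    """ballots: list of (a_{i,0}, [a_{i,1}, ..., a_{i,m}]) for one set of voters."""
    # Step3: each holder j sums the shares y_{i,j} it received from the set.
    agg = [sum(poly_eval([a0] + ch, x) for a0, ch in ballots) % P for x in XS]
    vander = [[pow(x, e, P) for e in range(len(XS))] for x in XS]
    return solve_mod(vander, agg, P)  # coefficients a_0, ..., a_m of F(x)

# Invented ballots: (random number, selections for C1..C4).
ballots = [(3, [1, 0, 0, 1]), (5, [0, 1, 0, 0]), (8, [1, 1, 1, 0])]
coeffs = tally_group(ballots)
print(coeffs)                                           # [16, 2, 2, 1, 1]
print(sum(a0 for a0, _ in ballots) % P == coeffs[0])    # Step4 check: True
```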

The Voting Phase and the Post-voting Phase are shown in Figure 2.

Figure 2 The proposed e-voting scheme

To make the proposed scheme easier to understand, assume that there are 20 voters Vi, (i = 1, 2, ⋯, 20) and 4 candidates Cj, (j = 1, 2, 3, 4), with p = 29 and k = 10. The voters’ intentions, the random numbers, and the interpolation polynomials are shown in Table 2.

Table 2 Voters’ intention, the random number and polynomial

VS generates the shares. The shares and the voters’ credentials are shown in Table 3.

Table 3 The shares and voters’ credential

Assume that VS selects a set of 10 members, V1, V3, V4, V8, V9, V13, V16, V18, V19, V20, and that they publish their random numbers 3, 5, 3, 5, 3, 8, 12, 1, 8, 9 on the bulletin board. Moreover, the sums of the shares of V1, V3, V4, V8, V9, V13, V16, V18, V19, V20 are computed by all candidates and VS. The sums of the shares are shown in Table 4.

Table 4 The sum of shares

After the sums are published, everyone interpolates the polynomial of degree 4 passing through the five points (2,17), (5,13), (8,1), (9,22) and (11,28) by solving the corresponding system of linear equations over F29, in which the powers of each xj are reduced mod 29:

$$ \left\{\begin{array}{c}{a}_0+2{a}_1+4{a}_2+8{a}_3+16{a}_4=17\\ {}{a}_0+5{a}_1+25{a}_2+9{a}_3+16{a}_4=13\\ {}{a}_0+8{a}_1+6{a}_2+19{a}_3+7{a}_4=1\\ {}{a}_0+9{a}_1+23{a}_2+4{a}_3+7{a}_4=22\\ {}{a}_0+11{a}_1+5{a}_2+26{a}_3+25{a}_4=28\end{array}\right. $$
(1)

Then, they recover the aggregated polynomial \( F(x)=28+6x+7{x}^2+5{x}^3+7{x}^4 \), and VS publishes {28, 6, 7, 5, 7} on the bulletin board. Thereafter, V1, V3, V4, V8, V9, V13, V16, V18, V19, V20 verify whether their ballots are counted correctly by checking the equation 3 + 5 + 3 + 5 + 3 + 8 + 12 + 1 + 8 + 9 = 28 mod 29. All participants now know that the results of C1, C2, C3, C4 in this group are 6, 7, 5, 7, respectively. After the sums of the shares of the other group are posted on the bulletin board, every participant knows that the results of C1, C2, C3, C4 in that group are 5, 3, 2, 8, respectively. Finally, every participant adds the two results of each candidate to obtain the total votes of C1, C2, C3, C4, which are 11, 10, 7, 15, respectively.
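The numbers of this example can be re-checked with a short script. It uses only values stated in the text (the five published points, the coefficients of F(x), the ten random numbers, and the two group results); the helper name `poly_eval` is an assumption.

```python
P = 29

def poly_eval(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_4 x^4 mod P."""
    return sum(a * pow(x, e, P) for e, a in enumerate(coeffs)) % P

points = [(2, 17), (5, 13), (8, 1), (9, 22), (11, 28)]   # published sums of shares
F = [28, 6, 7, 5, 7]                                      # published coefficients
masks = [3, 5, 3, 5, 3, 8, 12, 1, 8, 9]                   # published random numbers

# F(x) passes through every published point.
print(all(poly_eval(F, x) == y for x, y in points))       # True

# The random numbers sum to a_0 modulo 29 (57 mod 29 = 28).
print(sum(masks) % P == F[0])                             # True

# Adding the two group results gives the final tally.
group1, group2 = F[1:], [5, 3, 2, 8]
totals = [a + b for a, b in zip(group1, group2)]
print(totals)                                             # [11, 10, 7, 15]
```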

5 Security analysis

The proposed scheme not only satisfies correctness, unconditional security, anonymity, confidentiality, efficiency, and non-cheating, but also achieves universal verifiability and coercion-resistance.

5.1 Correctness

Vi recovers the polynomial fi(x) using the random number on his credential and his intention, and verifies whether his ballot is correctly counted by checking the equation yi, m + 2 = fi(xm + 2), (i = 1, 2, ⋯, k).

In the post-voting phase, the k voters publish a1, 0, a2, 0, ⋯, ak, 0 on the bulletin board. After recovering the aggregated polynomial F(x), the voters verify \( {a}_0=\sum \limits_{i=1}^k{a}_{i,0} \). If it holds, the result is correct. Therefore, correctness is achieved.
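Both checks are simple modular computations. The sketch below is illustrative only: the function names are hypothetical, and the credential (13, 3, 7) with identifier 3 is an invented value for p = 29, while the ten random numbers and a0 = 28 are taken from the example in Section 4.

```python
def credential_is_consistent(credential, choices, p):
    """Check y_{i,m+2} = f_i(x_{m+2}) for f_i rebuilt from a_{i,0} and the choices."""
    a0, x, y = credential
    coeffs = [a0] + list(choices)
    return sum(a * pow(x, e, p) for e, a in enumerate(coeffs)) % p == y

def masks_match(published_masks, a0, p):
    """Check a_0 = sum of the published a_{i,0} modulo p."""
    return sum(published_masks) % p == a0 % p

# Invented credential: a0 = 13, identifier 3, share 13 + 3^4 = 94 = 7 mod 29.
print(credential_is_consistent((13, 3, 7), [0, 0, 0, 1], 29))   # True
print(credential_is_consistent((13, 3, 8), [0, 0, 0, 1], 29))   # False (altered share)

# Values from the worked example.
print(masks_match([3, 5, 3, 5, 3, 8, 12, 1, 8, 9], 28, 29))     # True
```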

  • Scenario1. Nobody knows the voting result before the voting result is published.

Proof

With secret sharing, no information about the result can be obtained, since a polynomial of degree m cannot be recovered from fewer than m + 1 shares, which are held by the m candidates and VS. Therefore, nobody, including the candidates and VS, can infer any information from his own share. The result cannot be known until the polynomials are reconstructed and their coefficients are published.

5.2 Anonymity

Anonymity means that nothing about the voter’s information is leaked. In the proposed scheme, AC learns nothing about the voter, since the voter uses a temporary ID. Moreover, the ballots of several voters are aggregated before being posted in the post-voting phase, which masks each voter’s intention. Hence, anonymity is achieved.

5.3 Confidentiality

According to Vi’s intention, VS generates a polynomial, divides it into shares for the m candidates and VS, and then destroys the polynomial. Only the m candidates and VS together can recover the aggregated polynomial and obtain the aggregated ballots, so the content of a voter’s ballot is confidential before publication. An individual ballot remains confidential after publication, since only the aggregated ballots are posted on the bulletin board. This guarantees confidentiality.

5.4 Efficiency

The computation in the proposed scheme involves only simple modular arithmetic, such as modular addition and subtraction. Since no complicated cryptographic method is used, the proposed scheme achieves the efficiency requirement.

5.5 Unconditional security

Unconditional security means that the security does not rely on hard problems such as the discrete logarithm problem or integer factorization. Unconditional security is especially vital for a voting scheme, since the votes should remain confidential forever. In the proposed scheme, the shares are divided among the candidates and VS. Even if a malicious adversary has unbounded computing power, he cannot infer any information about a vote from only some of the shares. Hence, our scheme is unconditionally secure.

5.6 Non-cheating

In the post-voting phase, the coefficients of the aggregated polynomial are published on the bulletin board. A dishonest candidate in the set is detected when \( {a}_0\ne \sum \limits_{i=1}^k{a}_{i,0} \); then the shares are published and the polynomial is reconstructed again.

5.7 Universal verifiability

A polynomial can be recovered using the random number on the voter’s credential and his intention. If the polynomial passes through the share (xm + 2, yi, m + 2) on his credential, the voter is convinced that the credential reflects his intention. Cj, (j = 1, 2, ⋯, m) verifies whether his result is correct by checking yj = F(xj), (j = 1, ⋯, m). Everyone verifies that the result is correct by checking \( {a}_0=\sum \limits_{i=1}^k{a}_{i,0} \). Therefore, the scheme satisfies universal verifiability.

5.8 Coercion-resistance

  • Scenario 2. The voter can’t prove the content of his ballot to others.

Proof

The credential CRi contains nothing about the intention of the voter. Even if the voter wants to prove his intention to others, he has no evidence. For example, if V17 shows his credential to C4, C4 obtains four candidate polynomials \( {f}_{17}(x)=13+x \), \( {f}_{17}(x)=13+{x}^2 \), \( {f}_{17}(x)=13+{x}^3 \) or \( {f}_{17}(x)=13+{x}^4 \) from V17’s credential. Therefore, C4 does not know whether V17 voted for him. Hence, the proposed voting scheme resists coercion attacks.

6 Conclusion

An unconditionally secure e-voting scheme based on Shamir’s secret sharing and k-anonymity is proposed, in which the voting system generates a polynomial according to the intention of each voter, computes the shares, and divides them among the candidates and VS. The candidates and VS then reconstruct the aggregated polynomial and tally the ballots together. The proposed scheme satisfies correctness, efficiency, unconditional security, non-cheating, universal verifiability, confidentiality, anonymity, and coercion-resistance.