1 Introduction

According to the latest research report by Counterpoint Internet of Things (IoT) servers, the global internet car market is expected to grow by 270% by 2022 [1]. As the number of vehicles increases exponentially [2], it brings people more convenient transportation. However, it also causes new problems, such as traffic safety, environmental protection, and privacy security [3]. Therefore, Vehicular Ad Hoc Network [4,5,6] (VANET) has become a hot research in the field of transportation. VANET is expected to change the existing problems of the current transportation system and realizes intelligent traffic management. In VANET, through the aggregation and constant exchange of Safety Beacon Messages (SBMs), all vehicles can receive safety information in a timely manner and be aware of the surrounding traffic environment, such as traffic flow, traffic congestion, etc.

The traditional architecture [7] of VANET is shown in Fig. 1, which mainly includes vehicles, RSU, CA, and core network server. SBMs are collected through the sensor on the vehicles, and transmitted to the upper layer by RSU. Finally, these SBMs are gathered to the core network for centralized processing. Therefore, the traditional architecture of VANET is based on centralized system, which is convenient for centralized management of vehicular information. However, traditional architecture must rely on trusted centralized entities (like CA in the Fig. 1). And it cannot guarantee that these centralized entities are completely credible. Once a centralized entity is attacked, it will bring serious data security risk. By mining SBMs, the attacker can get users’ private information, such as identity, location, social status, etc., which will bring life disturbance and even life security to users. In addition, with the continuous development of sensor technology and IoT technology, the data volume increases dramatically. The centralized management of data in traditional architecture leads to excessive load on central entities and bottleneck problems. Traditional architecture of VANET is also at risk of a single point of failure. Therefore, it is urgent to study the decentralized architecture of VANET to ensure the safety of users’ data.

Fig. 1
figure 1

Traditional architecture of VANET

In VANET, SBM contains vehicular identity, speed, location, content of request, etc. The identity and location information are particularly important [8,9,10]. The vehicular identity information is generally protected using pseudonym, which is generated by centralized entity CA in traditional VANET. However, once the CA is compromised, the vehicular identity privacy is threatened. Furthermore, all SBMs are generated based on location information. Only with accurate location information can VANET provide high-quality services to vehicles. The existing protection way of location information generally adopts encryption method, which can only guarantee that the location information will not be stolen or leaked easily in transmission process. However, it cannot guarantee that the location information will not be leaked by the CA or other servers in centralized architecture. Thus, the location privacy is threatened. Moreover, users are paying more and more attention to their private information in data era [11,12,13,14]. So it is very important to protect the vehicular identity and location privacy in VANET.

In this paper, to protect identity and location privacy, the decentralized architecture of VANET is constructed by using blockchain technology. Blockchain [15] technology is a kind of chain data structure, which is composed of data blocks connected sequentially according to time sequence. And blockchain owns a distributed ledger that cannot be tampered or forged using cryptography. Blockchain technology has typical features of decentralization, distribution, collective maintenance and non-tampering. It can effectively solve the problems of centralization, mutual distrust between entities and privacy leakage of traditional VANET. Therefore, a decentralized architecture is proposed to protect the identity and location privacy in VANET. The main contributions of this paper are as follows:

  • To address the problem of centralization in traditional VANET. This paper brings in blockchain technology and designs a novel decentralized architecture, blockchain-based VANET. The hash of SBMs is recorded in the blockchain of our proposed architecture, which not only ensues the integrity and security of SBMs, but also saves the blockchain storage and processing time.

  • To protect vehicular identity privacy, the identity of vehicle is divided into multiple sub-identities by the way of (m, r) threshold secret sharing scheme. And the sub-identities are updated periodically to prevent attacker from acquiring at most m sub-identities of vehicle to calculate vehicular real identity.

  • To protect vehicular location privacy, vehicles are united together to upload SBMs by the way of k-anonymity unity. This paper maps the group of vehicles into an undirected graph and quantifies the undirected graph with the parameter of that connectivityΔ and average distance\( \overline{D} \).

The remainder of this paper is organized as follows. In Section 2, we review related works about the privacy protection ways and blockchain applications in VANET Section 3 describes some relative definitions and basic concepts. Then Section 4 shows system model, threat model and system interactions of our proposed architecture. Section 5 gives the detailed description of privacy protection algorithms and their performance analysis. The simulation environment and results are given in Section 6. Finally, Section 7 makes a conclusion.

2 Related work

The related work presents two aspects: the privacy protection ways and blockchain applications in VANET.

2.1 VANET and privacy protection

In recent years, privacy protection has become a hot spot in study of VANET [16, 17]. According to the objects protected by researchers, we can divide the privacy protection into three types: location privacy protection [18,19,20,21,22], trajectory privacy protection [23,24,25] and identity privacy protection [26]. For example, the work in [22] dealt with the distribution of vehicular pseudonyms using fog computing technology. The edge resources of VANET were used to effectively manage pseudonyms, which can improve the ability of location privacy protection. In [25], the authors proposed a policy of trajectory privacy using the multiple mix zones. By constantly changing pseudonyms, it made the pseudonyms unlinkable and protected vehicular trajectory privacy.

From the perspective of protection means, the privacy protection is mainly divided into two types: anonymity-based and encryption-based privacy protection. Anonymity-based privacy protection often adopts k-anonymity mechanism [27, 28]. K-anonymity mechanism was originally created to protect users’ location privacy in LBS applications [29]. Nowadays, many researchers use k-anonymity mechanism in VANET. By the principle of maximum entropy, we find k appropriate vehicles with the closest historical request probability. Then the real vehicle is hidden in these k vehicles, and vehicular privacy is protected. While the encryption-based privacy protection [30] is commonly used in the authentication of VANET. Authentication is an important guarantee to receive SBMs from legitimate vehicles in VANET. It plays a vital role in privacy protection in VANET. At present the authentication of VANET generally adopts asymmetric encryption authentication methods, such as Public Key Infrastructure (PKI) and Elliptic Curve Digital Signature Algorithm (EDSA), and symmetric encryption authentication methods, such as group signature authentication mechanism [31].

2.2 Blockchain and VANET

Blockchain technology originated from an article titled “bitcoin: a peer-to-peer electronic cash system” written by Nakamoto in 2008. From the birth of blockchain to the present, blockchain technology has experienced three generations from 1.0 to 3.0 [32]. Blockchain 1.0 focuses on the transactions of digital currency. Since the birth of bitcoin, more than 600 cryptodigital currencies have been generated correspondingly (litecoin, dogecoin, YBcoin, etc.). These cryptodigital currencies mainly be used in small payments, foreign exchange, gambling and money laundering; Blockchain 2.0 focuses on the registration, validation, and transfer of smart contract. Smart contracts rely on oracles prediction and are mainly used in the financial field, such as securities trading, bank bills, payment and clearing, and supply chain finance. Blockchain 3.0 transcends the economic field, and its application field is not limited to finance, commodity transaction. The scope of blockchain 3.0 extends to government, medical care, science, culture, transportation, etc.

In recent years, a few researchers have introduced blockchain into VANET, considering with the characteristics of decentralization, redundant storage, collective maintenance and tamper-proof of blockchain. For example, Joy et al. [33] proposed the concept of blocktree. And the vehicle embedded its signature into the blockchain. Greg et al. [34] verified the feasibility of blocktree and analyzed the end-to-end delay and the time to collect, write and update blockchain contents. To better combine the blockchain with the VANET, Dorri et al. [35] proposed a lightweight scalable blockchain. And based on this, the reference of [36] proposed a blockchain architecture with decentralized privacy protection for intelligent vehicle systems. While Lei et al. [37] proposed a new network topology based on blockchain structure to simplify distributed key management in heterogeneous vehicle communication systems. Furthermore, the use of the blockchain as a means for privacy-preserving proof of location has been proposed in the work of [38].

3 Preliminaries

This section gives some basic concepts and definitions used in this paper: blockchain technology, dynamic threshold encryption, and k-anonymity unity.

3.1 Blockchain technology

Blockchain system adopts distributed structure, as shown in the Fig. 2. It consists of several decentralized nodes which cannot trust each other. Each node participates in data management. When a new block is confirmed by most of nodes (the number depends on different consensus mechanisms) in blockchain system, the new block is written into blockchain by miner. This is called as mining process. Then all nodes verify the content of that be newly added to blockchain, and have a complete copy of blockchain. Therefore, all nodes in the blockchain system jointly maintain the data information, which ensures the consistency, integrity and non-tampering of data.

Fig. 2
figure 2

Blockchain system

In this paper, blockchain technology is applied to VANET. The distributed features of the blockchain are used to remove the single trust center in traditional VANET. In this way, a novel decentralized architecture of VANET is constructed to realize the security and irreversibility of SBMs. However, it is precisely because all nodes participate in the joint maintenance of data information in blockchain. SBMs of vehicles become open and transparent, which is not conducive to identity and location privacy protection. And the two concepts of information transparency and privacy protection are contradictory. The more transparent information is, the more difficult privacy protection is. Therefore, in this paper, we adopt dynamic threshold encryption and k-anonymity unity to achieve identity and location privacy protection respectively.

3.2 Dynamic threshold encryption

Dynamic threshold encryption was proposed by Amir Herzberg [39]. It is based on (m, r) threshold secret sharing scheme. Thus, we present what (m, r) threshold secret sharing scheme is before introducing dynamic threshold encryption. (m, r) threshold secret sharing scheme first constructs a m-1 degree polynomial and takes secret s as constant term of polynomial. Suppose a finite field is GF (q), where q is a large prime number. Thus, the m-1 degree polynomial is expressed as:

$$ f(x)=s+{a}_1x+{a}_2{x}^2+...+{a}_i{x}^i+...+{a}_{m-1}{x}^{m-1},{a}_i\in GF(q) $$
(1)

Thus, we get that secret s is f(0) from formula (1). Select r elements {x1, x2, xi,…, xr} form GF (q) as the input of m-1 polynomial. Then we get r values {f(x1), f(x2),…, f (xi),…, f(xr)}. As each f(xi) implies the information of secret s, f(xi) is called a sub-secret of s. These sub-secrets are distributed to r different participants to keep. By Lagrange interpolation theorem, arbitrary m points {(x1, f(x1)), (x2, f(x2)),…, (xm, f(xm))}can recover f(x), as shown in formula (2):

$$ f(x)=\sum \limits_{i=1}^mf\left({x}_i\right)\prod \limits_{\begin{array}{l}j=1\\ {}j\ne i\end{array}}^m\frac{x-{x}_j}{x_i-x{}_j} $$
(2)

The secret s can be obtained through f(x):

$$ s=f(0)=\sum \limits_{i=1}^mf\left({x}_i\right)\prod \limits_{\begin{array}{l}j=1\\ {}j\ne i\end{array}}^m\frac{0-{x}_j}{x_i-x{}_j} $$
(3)

Therefore, from the above analysis, it can be concluded that (m, r) threshold secret sharing scheme is effective and secure as long as m or more sub-secrets cannot be obtained. However, we cannot guarantee that an attacker can only obtain sub-secrets within m-1. Thus, Amir Herzberg et al. improved the (m, r) threshold secret sharing scheme and made the sub-secret dynamic. The survival cycle of sub-secret is divided into different time periods. Then the sub-secret is updated at each time period. This way makes it difficult for an attacker to obtain m or more sub-secrets in a short time. The update process of sub-secret is as follows.

First, we divide the survival cycle of sub-secret into w periods {t1, t2,…, tw}. The sub-secret is assumed to be fi-1(x) at time ti-1. Then the sub-secret will be updated to fi(x) at time ti.

$$ {f}_i(x)={f}_{i-1}(x)+{\delta}_i(x) $$
(4)
$$ {\delta}_i(x)=\sum \limits_w\left(h{}_i(x)/\operatorname{mod}q\right) $$
(5)
$$ {h}_i(x)={a}_{i1}{x}^1+{a}_{i2}{x}^2+...+{a}_{i\left(m-1\right)}{x}^{m-1},{a}_{ij}\in GF(q) $$
(6)

To keep secret s invariable, the constant term of δi(x)must be zero.

Before updating the sub-secret:

$$ {f}_{i-1}(0)=s $$
(7)

After updating the sub-secret:

$$ {f}_i(0)={f}_{i-1}(0)+{\delta}_i(0)={f}_{i-1}(0)+0=s $$
(8)

Therefore, the update of sub-secret does not affect the change of secret s. The updated sub-secret still conforms to Lagrange interpolation theorem, and the secret s can be recovered by gathering m or more sub-secrets.

In this paper, the vehicle’s identity id is used as a secret. To protect id, it is divided into multiple sub-identities by threshold secret sharing scheme. When a vehicle uploads SBM, it joins other k-1 vehicles and collects their sub-identities as identity information using k-anonymity unity technology. Each transmission of SBM is regarded as a transaction, and the hash value of each transaction is recorded into the blockchain after being verified. If the verification fails, such as a vehicle intentionally falsifying sub-identity information, the corresponding vehicle will be removed from blockchain-based VANET.

Since the vehicular identity information is a combination of sub-identities of k vehicles in each transaction, the attacker cannot infer the real sub-identity of the vehicle. Furthermore, the sub-identity of vehicle is updated dynamically with the vehicular driving trajectory. Therefore, it is more difficult for attacker to acquire m or more valid sub-identities in a short time, and vehicular identity privacy can be protected.

3.3 k-anonymity unity

In this paper, to protect vehicular location privacy, the vehicle’s SBM is uploaded with other k-1 vehicular SBMs by adopting the concept of k-anonymity. The format of k united messages is shown in Table 1. From the Table 1, we can get that the format of k united messages contains three basic types of information: sub-identity id’, location l, and request content C.

Table 1 The format of k united messages

Therefore, it is difficult for an attacker to get vehicular real location information even if attacker steals SBMs of vehicles. The attacker guesses the true location information with the probability of at most 1/k.

Generally, a first-come, first-served method is used to unite other k-1 vehicles. This method is easy to operate. However, there is not a standard to judge whether this unity is reasonable or not. To better protect the vehicular location privacy, this paper proposes two indicators to measure the effectiveness of unity.

3.3.1 Connectivity Δ

To protect the vehicular location privacy, this unity is not valid when the united messages have the same information in each uploaded message of united vehicle. As shown in Fig. 3, assume that k is 4 and the k united vehicles are {A, B, C, D}.

Fig. 3
figure 3

The united vehicles construct a complete graph

The vehicles {A, B, C, D} are combined with each other. Then the united messages uploaded by each vehicle are the same, as shown in Table 2.

Table 2 The same information for each vehicle

To analyze united model among vehicles, this paper proposes graph analysis method. The vehicles are regarded as the nodes of graph. If two vehicles are united with each other, there exists an edge between the corresponding nodes. Then we can construct an undirected graph for united vehicles. From Fig. 3, we can conclude that the undirected graph cannot be a complete graph, otherwise the united message is same in each vehicle. As this paper adopts the method of k-anonymity unity, the number of united vehicles must be larger than k to ensure that the graph is not a complete graph, as show in Fig. 4. We also assume that k is 4. The united vehicles are {A, B, C, D, E, F}. And the number of united vehicles in Fig. 4 is n = 6 > k.

Fig. 4
figure 4

The united vehicles construct an uncomplete graph

Then the united message uploaded by each vehicle is different in Fig. 4. Table 3 presents the united messages of vehicles {A, B, C, D, E, F}.

Table 3 The different information for each vehicle

This paper uses variableΔto measure the connectivity of a vehicle V.

$$ \varDelta =\frac{num.(V)}{n}\ge \frac{k}{n} $$
(9)

Where n represents the total number of united vehicles in graph, and num. (V) is the number of united vehicles with the target vehicle V. To realize k-anonymity, the num. (V) is more than k. If the constructed graph is an unconnected, the split subgraph must not be complete. Otherwise there will be a large number of vehicles that upload the same messages. If the constructed graph is connected, certain nodes can form sub complete graph. For example, if vehicle B and vehicle C unite with each other in Fig. 4, the vehicles {A, B, C}, {D, B, C}, {E, B, C} separately constitute a sub complete graph as show in Fig. 5. However, the connectivity may be different for the vehicles in sub complete graph.

Fig. 5
figure 5

Certain vehicles construct a sub complete graph

The connectivity of vehicle A is calculated as:

$$ {\varDelta}_A=\frac{num.(A)}{n}=\frac{4}{6} $$
(10)

While the connectivity of vehicle B is represented by the symbol ΔB:

$$ {\varDelta}_B=\frac{num.(B)}{n}=\frac{5}{6} $$
(11)

Thus, the connectivity of vehicle B is greater than vehicle A.

$$ {\varDelta}_B>{\varDelta}_A\ge \frac{k}{n}=\frac{4}{6} $$
(12)

For the connectivity of a graph, it consists of the connectivity of all vehicles in the graph. And the greater the connectivity of the graph indicates the greater the similarity of the united messages for vehicles. Thus, the higher connectivity of a graph corresponds to better identity privacy protection for vehicles. This paper adopts average connectivity \( \overline{\varDelta} \) of all vehicles to represent the effect of identity privacy protection for a graph. Then the average connectivity is shown as follows:

$$ \overline{\varDelta}=\frac{\varDelta_1+{\varDelta}_2+...+{\varDelta}_n}{n}<1 $$
(13)

When the graph constructed by united vehicles is a complete graph, the average connectivity is up to the maximum value of 1.

3.3.2 Average Distance \( \overline{D} \)

To protect location privacy, the unity is not valid when the locations of united vehicles are adjacent, or even the same. In this paper, the location is represented by (x, y), where x and y are horizontal and vertical coordinates respectively. The average distance between vehicles in a graph is defined as follows:

$$ \overline{D}=\frac{\sum \limits_{i\ne j}\sqrt{{\left|{x}_i-{x}_j\right|}^2+{\left|{y}_i-{y}_j\right|}^2}}{C_n^2} $$
(14)

Where (xi, yi) and (xj, yj) respectively represent location information of any two vehicles in a graph. For the n vehicles in a graph, there are \( {C}_n^2 \) Euclidean distances between the n vehicles. Thus, the \( \overline{D} \) describes the average distance of the \( {C}_n^2 \) Euclidean distances. And we can get that the larger the average distance \( \overline{D} \) between vehicles, the better the effect of location privacy protection, and the more the effectiveness of k-anonymity unity. To prevent vehicles from approaching or being in the same location, a threshold value σ is set in this paper. The k-anonymity unity is effective when \( \overline{D}\ge \sigma \).

4 System model

We design an architecture of blockchain-based VANET in this section. Then we present components and interactions in blockchain-based VANET.

4.1 System model

For preserving the vehicular identity and location privacy, we adopt the construction approach of IoTchain architecture [40] to design a decentralized architecture of VANET, as shown in Fig. 6. We call it blockchain-based VANET, involving eight different components: vehicles, on board unit (OBU), road side unit (RSU), core network, blockchain network, smart contract, agent node, and miner.

  1. 1)

    Vehicles: Vehicles are the moving entities in the blockchain-based VANET. They are equipped with an OBU to communicate with each other. When a vehicle uploads SBMs, the vehicle will unite with other vehicles.

  2. 2)

    OBU: OBU is a kind of sensor device mounted on vehicle. The OBU can sense the environment, speed and location information of vehicle. Moreover, the OBU can communicate with the adjacent RSU, and send SBMs to RSU.

  3. 3)

    RSU: RSU is deployed on the side of road, acting as an AP (Access Point). It collects SBMs from OBUs, and then upload these BSMs to the core network via wire or wireless. RSU can communicate with OBU in real time, and assist vehicles to from groups. At the same time, RSU is responsible for transmitting traffic and service information to vehicles through broadcasting, such as weather conditions, real-time road conditions, emergencies, control information, service facilities, etc.

Fig. 6
figure 6

The decentralized architecture of VANET

  1. 4)

    Core network: It is composed of a large number of servers, such as CA server, data storage server, etc., with strong computing capacity, storage resources, and energy availability. All data is stored in the core network and processed by the core network. For the data security, these data stored in the core network is encrypted. In particular, it is pointed out that the CA proposed in this paper is a kind of weakened server compared with the traditional VANET. As the CA is just to generate a few parameters for dynamic threshold encryption, the CA server can be unreliable.

  2. 5)

    Blockchain network: Considering the privacy protection in the blockchain-based VANET, all participating nodes need to be strictly controlled. Therefore, this paper adopts private chain to build blockchain network. The hash values of all data in the core network are stored in blockchain network. This can guarantee data not to be tampered and manipulated by malicious attackers. Because, once the data has been changed, its hash value will be also changed. Blockchain is auditable so that the change will be discovered. Furthermore, storing its hash values can greatly save the storage resources of the blockchain network and prove the system response time.

  3. 6)

    Smart contracts: Smart contracts are pre-defined rules and terms that can be executed automatically. In this paper, we pre-write certain rules into smart contracts, such as authentication rules, k-anonymity unity rules, sub-identity generation rules, sub-identity dynamic change rules, SBMs recording rules, etc. Because smart contracts are defined entirely in code, they are automatically executed once the trigger condition is met, without human intervention. Therefore, smart contracts not only save transaction costs, but also improve accuracy and efficiency.

  4. 7)

    Agent node: Agent node is a participating node of the blockchain network. Each agent node should participate in the consensus and have the backup of data of blockchain network, so as to jointly maintain the correctness of transactions of the blockchain network. In this paper, agent nodes adopt the way of Proof of Work (PoW) for consensus.

  5. 8)

    Miner: If an agent node has solved the mathematical problem and has the rights of legal block keeping, this agent node is called miners. Miner involves in processing new block of blockchain, and writes verified data into new block. Then all agent nodes will update their backup of data in blockchain network. Thus, the miner node is a special agent node. It not only participates in the consensus of blockchain network but also is able to mine and validate the new blocks.

4.2 System interactions

The interactions of blockchain-based VANET is presented in Fig. 7. Here, we explain four mainly interactions between the different components in blockchain-based VANET.

  1. 1)

    Blockchain set-up: During this stage, the system is initialized. All agent nodes form blockchain network. Each agent node is equal and enjoys the same rights and obligations. Meanwhile, corresponding smart contract rules will be built in this stage. Agent node adds these rules of authentication, k-anonymity unity, sub-identity generation, sub-identity dynamic change, and SBMs recording to smart contract one by one. If rule is successfully formed, agent node will receive corresponding address from smart contract.

  2. 2)

    Registration of vehicles: Assume that a vehicle wants to enjoy the blockchain-based VANET. The vehicle first initiates a register request to CA server. Then the CA server will send the register request to smart contract. By the rules of authentication, smart contract validates its validity, and returns an address to the vehicle if the verification is successful. Otherwise, the unverified register messages will be recorded in the blockchain and broadcasted to blockchain network, making the blockchain-based VANET aware that the vehicle is invalid and illegal. After receiving the address, the vehicle constructs a m-1 degree polynomial f(x) = id + a1x + a2x2 + …… + am-1xm-1, and hides his/her identity id in the polynomial by id = f(0). Final, the vehicle calculates r sub-identities based parameter information provided by CA server.

Fig. 7:
figure 7

The interactions of blockchain-based VANET.

  1. 3)

    SBMs upload: For protecting vehicular identity and location privacy, k-anonymity unity is adopted in this paper. Assume that a vehicle wants to upload SBMs. The vehicle first forms a group automatically with other vehicles by the assistance of RSU. When the vehicle initiates request to upload SBMs, it selects other k-1 sub-identities in the group. Then the k-1 sub-identities and its own sub-identities are together constituted vehicle’s identity information. Then the vehicle uploads k merged messages to core network using the constituted identity. From the merged information, we can get that it contains k sub-identities, k locations and k request contents. Therefore, the servers of the core network cannot infer the relationship between the vehicle and its real location and identity, so as to protect identity and location privacy of the vehicle.

  2. 4)

    Blockchain record: During this stage, we proceed according to the period T. In a period T, it completes the process of blockchain record including three phases: new transaction creation, transaction verification for miner and backup for agent nodes. Here, assume that the processing of each message is a transaction, and the hash value of messages of in the core network is recorded into blockchain network. Thus, the hash of new transactions in a period T are received by agent node.

New transaction creation: Agent node broadcasts the newly generated transaction data to all nodes in the blockchain network. Each node stores the transaction data in a new block.

Transaction verification: As our proposed blockchain- based VANET does not involve tokens, this paper selects the consensus of Practical Byzantine Fault Tolerance to realize transaction verification. Each node participates in voting based on its own calculation force. When a node (that is miner) finds a proof of new block, it broadcasts the block to all nodes in the blockchain network.

Backup: The validity of the block is recognized by other nodes only if all transactions contained in the block are valid and have not existed before. And all nodes will make backup for the new block and update block chain.

4.3 Threat model

The CA and other servers are a semi-trusted entity. They can faithfully performs the work with other entities while probably being curious about vehicles’ privacy. They may be compromised, and leak certain sensitive information for profit. Thus, although the blockchain technology ensures that the vehicles’ SBMs are not tampered, the identity and location privacy are at risk of being compromised.

Identity privacy attack is one of the typical attack modes in vehicle registration process. When a vehicle wants to enjoy the blockchain-based VANET, the vehicle communicates with CA to get its legal identity information. Thus, the CA controls all identities of vehicles. The identity privacy is easy to be revealed.

Location privacy attack describes the fact that the k-anonymity unity is invalid, and an adversary could infer the location information of a target vehicle.

Therefore, for protecting identity and location privacy, we design corresponding privacy protection algorithms in the process of Registration of vehicles and SBMs upload. These algorithms will be introduced in detail in Section 5, including Undirected Graph Generation (UGG) algorithm, Identity Privacy Protection (IPP) algorithm based on dynamic threshold encryption, and Location Privacy Protection (LPP) algorithm based on k-anonymity unity.

5 Algorithm design

In this section, we detailedly introduce the three algorithms for identity and location privacy protection in blockchain-based VANET.

5.1 The undirected graph generation algorithm

To protect the privacy of vehicles, groups are built among vehicles with the assistance of RSUs. In this paper, we map a group into an undirected graph to quantitatively describe the group. Although the undirected graph is a static representation of the range of vehicles group at a given time, the generative process of undirected graph takes into that vehicles move with different velocities, as show the Fig. 8. The undirected graph generation is based on the location, speed and direction of the target vehicle.

Fig. 8
figure 8

The generative process of undirected graph

Algorithm 1 presents how to generate the undirected graph G. The input parameters of algorithm 1 are: target vehicle U, location l, speed v, time t, and parameters (m, r) for threshold secret sharing scheme. When a vehicle U initiates a request, the undirected graph G must be completed within time t. Otherwise, vehicle U reinitiates request and the previous request is invalid.

Here, we describe the generative process of undirected graph G. First, we initialize the range of a group is A\( \left(\overrightarrow{l}+\overrightarrow{v}\ast t/2,\overrightarrow{v}\ast t\right) \) basing on the location and speed of vehicle U, where \( \overrightarrow{l}+\overrightarrow{v}\ast t/2 \) is the center and \( \overrightarrow{v}\ast t \) is radius. Using this vector pattern to construct the group can prevent attacker from guessing vehicle U is in the center of A. And it can make sure that vehicle U is in the group within time t.

Then we calculate the number of vehicles in range A, represented by num. (A). When num. (A) > 2r, the undirected graph G will be built as shown at line 5–15 in algorithm 1. We initialize the G and select vehicle U as the initial vertex of G. An array visite[] is set to store the vehicles in G. Then traverse each vehicle Ui in visite[]. If new vehicle Uj is united with Ui, the vehicle Uj will be store in visite[]. We update the visite[] and continue to traverse until no new vehicles join visit[]. Thus, the undirected graph G can be built based visit[]. If G is not a complete graph, the G is successfully constructed and the average connectivity of G is calculated. Otherwise, we will change the size of the group range and modify the radius of range A from \( \overrightarrow{v}\ast t \)to \( \overrightarrow{v}\ast t\ast 2r/ num.(A) \).

Algorithm 1 describes the pseudo code of UGG algorithm.

figure f

5.2 The identity privacy protection algorithm

For protecting vehicular identity privacy, this paper adopts the way of dynamic (m, r) threshold encryption. A target vehicle U uploads SBMs by using the united sub-identities of r vehicles. Of course, these r vehicles must contain the vehicle U. To make it difficult for attacker to acquire at most m-1 sub-identities of vehicle U and calculate real identity of vehicle U, the sub-identity is updated in two forms, as shown the two for-loops in algorithm 2. Consider the vehicle U is on a driving trajectory Tr = {l1,…, ld}. In the first for-loop (at line 1), we generate an undirected graph Gi for each location li by calling UGG algorithm. When the vehicle U enters a different group, it will regenerate r sub-identities with the assistance of the CA server (at line 3–5). As the CA server is responsible for generating certain parameter information f(x) and GF(q) for sub-identity. The calculation of the specific sub-identity is the vehicle itself. Therefore, the CA cannot gain vehicular identity information. In the second for-loop (at line 7), the vehicle U will randomly update certain sub-identity with the frequency f. That is to say, f sub-identities of the r sub-identities will be updated in unit time R/v, where R is the moving distance and v is the speed in group Gi for vehicle U.

Algorithm 2 describes the pseudo code of UGG algorithm.

figure g

5.3 The location privacy protection algorithm

To protect location privacy, this paper adopts the method of k-anonymity unity. When a vehicle uploads SBM, the vehicle unites other k-1 vehicles and uploads k SBMs, which contains k locations. Thus, it can blur the link between the vehicle and its location. And attacker cannot get which location belongs to which vehicle. The purpose of algorithm 3 is to gain the appropriate k united vehicles set M. The average distance \( \overline{D} \)of the k vehicles must be greater than threshold value σ, preventing the k locations is too adjacent or ever the same.

First, it is assumed that a target vehicle U receives r-1 sub-identities form other vehicles in group G. We store the corresponding r-1 vehicles and vehicle U into an alternative set W. Select k vehicles from the set W, including vehicle U. Then we initialize the set M and store the k vehicles in M. We calculate the distance D between each of two vehicles in set M. And the number of distance is k(k-1)/2. Then we sort these k(k-1)/2 distances from small to large, and calculate the average distance \( \overline{D} \)of k vehicles. If \( \overline{D}\ge \sigma \), we succeed in finding the appropriate k vehicles. Otherwise, we replace the two vehicles {ui, uj} with the minimum distance. If the vehicle U belongs to one of the vehicles {ui, uj} (assume that U = ui), we select another vehicle from the set W to replace vehicle uj. If vehicle U∉{ui, uj}, we select two vehicles from W to replace the two vehicles {ui, uj}. Then the set M is updated, and we calculate average distance until the appropriate k vehicles are found.

Algorithm 3 describes the pseudo code of LPP algorithm.

figure h

5.4 Performance analysis

We analyze the performance of our proposed UGG, IPP and LPP algorithms in terms of the effectiveness of vehicle unity and privacy.

Vehicle unity: We first analyze the connectivity of vehicle unity. To get a high average connectivity \( \overline{\varDelta} \), the range of graph G ensures num. (A) > 2r in UGG algorithm. If the num. (A) ≤2r, the center of A will be changed with the vehicle’s speed and direction. Moreover, the radius of A is also changed with the vehicle density. Thus, the vehicle’s location, speed, direction and density are taken into account for vehicle unity. Second, we analyze the average distance \( \overline{D} \) of vehicle unity. By setting a threshold value σ, it ensure that the vehicles in a graph G are not lies in the same location. Thus, this paper guarantees the effectiveness of vehicle unity from average connectivity \( \overline{\varDelta} \) and average distance \( \overline{D} \).

Privacy: We first analyze the identity privacy protection for the IPP algorithm. Using the threshold secret sharing scheme, the identity of vehicle is divided into r sub-identities. Thus, the vehicle communicates with the CA using the sub-identities so that the CA cannot easily gain complete identity of vehicle. Furthermore, the sub-identities are changed through two different ways.

  1. 1)

    Between groups: when a vehicle enters a new group, all the previous r sub-identities are discarded, and the r sub-identities are generated again with the assistance of CA.

  2. 2)

    Inside the group: As the vehicle moves within the group, it randomly updates certain sub-identities with frequency f.

Therefore, it is difficult for an attacker to collect m sub-identity within a short period of time, thus unable to decrypt the real identity of the vehicle. Thus, the identity privacy can be effectively protected.

Second, we analyze the location privacy protection for the LPP algorithm. Using the method of k-anonymity unity, the vehicle real location is hide in k locations. Moreover, the k locations can hardly be identical as the average distance between vehicles must be greater than the threshold value σ. Thus, an adversary could not infer the location of vehicle, and the location information is effectively protected.

6 Simulation and results

The performance evaluation of our proposed architecture (the blockchain-based VANET) is conducted in this section. The first part describes simulation environment. And the last part analyzes simulation results using four metrics: system time, average distance, connectivity, and privacy leakage.

To identify the effectiveness of our proposed architecture, we compare it with centralized architecture [7] and distributed architecture [37]. The biggest difference between the distributed architecture and our proposed architecture lies in the story data. In blockchain network of distributed architecture, it directly storages the SBMs data. While in the blockchain network of our proposed architecture, we storage the hash values of the SBMs data.

6.1 Simulation Environment

To simplify simulation experiment, we regard our proposed architecture as two parts: VANET network and blockchain network. The VANET network mainly do that vehicles upload SBMs and the blockchain network is responsible for the transaction record. We use OPNET [41] and Ethereum [42] to simulate the VANET network and blockchain network respectively. Ethereum is the blockchain-based computing platform, which provides the power of smarts contracts and the PoW consensus. Thus, in our experiment, we use Ethereum platform to write rules (e.g. authentication rule, k-anonymity unity rule, sub-identity generation rule, sub-identity dynamic change rule, SBMs recording rule) into smart contracts. For blockchain network, using the PoW consensus to verify the new data block.

In our experiment, we simulate that a target vehicle U drives on a trajectory Tr, which including 10 locations {l1, l2,…,l10}. For privacy protection, a corresponding undirected graph will be generated for each location. Thus, there exists 10 undirected graphs {G1, G2,…, G10} for the vehicle U. To describe the performance accurately, we do the experiments for 8 times and calculate the average value including system time, average distance, connectivity, and privacy leakage.

Furthermore, to better present the performance of average distance \( \overline{D} \), we simulate two scenarios: light and heavy traffic. We set the heavy traffic scenario as 0.4 v/h (that is, 4000 vehicles pass the Tr in one hour), while the light traffic scenario as 0.15 v/h. All the simulation parameters is shown in Table 4 in detail.

Table 4 simulation parameters

6.2 Simulation Results

System time

For our proposed architecture and the distributed architecture [37], the system time mainly contains two parts: the blockchain processing time and SBMs processing time. However, the centralized architecture [7] only has the SBMs processing time.

Figure 9 plots the system time in terms of transactions number under different architectures. Transactions ranging from 1 to 1500 is set for testing the system time. The system time increases with the increase in the number of transactions. Although the system time of the centralized architecture has no blockchain processing time, it is the worst performance in the aspect of system time. Because the centralized architecture has many central entities, these central entities must wait for k SBMs for centralized process transactions. The process of waiting is time consuming. Comparing our proposed architecture with the centralized architecture, the system time of our proposed architecture is smaller triple than centralized architecture.

Fig. 9
figure 9

System time

Comparing our proposed architecture with the distributed architecture, we can get that our proposed architecture and the distributed architecture have the same system time when transactions number is smaller than 600. However, our proposed architecture have faster system time than distributed architecture when transactions number is larger than 600. The main reason is that we store the hash value of transactions to blockchain network in our proposed architecture. Thus the bigger the transactions number is, the more time will be saved.

In conclusion, our proposed architecture reduces the system time by using the blockchain technology.

Average distance

Figure 10 and Fig. 11 respectively depicts the average distance \( \overline{D} \) under the heavy and light traffic scenarios. Although the shape of average distance \( \overline{D} \) in terms of the undirected graphs is very similar, there exists a difference about 100 m for average distance between heavy and light traffic scenarios. The average distance \( \overline{D} \) is about 700 m in heavy traffic scenario, and 800 m in light traffic scenario. The average distance \( \overline{D} \) in our proposed architecture is greater than both the centralized and distributed architectures. Thus, it’s harder to get the same or adjacent locations in our proposed architecture. And our proposed architecture works well for location privacy protection.

Fig. 10
figure 10

Average distance for heavy traffic scenario

Fig. 11
figure 11

Average distance for light traffic scenario

In Fig. 10 and Fig. 11, there exist two lows at G3 and G7 in centralized and distributed architecture. We can get that the corresponding locations {l3, l7} may be crossroads. Because vehicles easily gather at crossroads, leading to a sharp increase in vehicles number. Thus, the average distance \( \overline{D} \)will drop dramatically at locations {l3, l7}. However, the line of average distance is smooth in our proposed architecture. Thus, whether in heavy or light traffic scenario, our proposed architecture is more stable and cannot be affected by the crossroads.

Figure 12 plots the number of vehicles at graphs {G1, G2,…, G10}. From the Fig. 12, we can get that the number of vehicles is greater than 2r (2r = 2*12 = 24). However, the number of vehicles is between 8 to 15 in distributed architecture, and 8 in centralized architecture. To get k = 8 appropriate united vehicles, the greater number of vehicles means better privacy protection. Thus, our proposed architecture shows its superior.

Fig. 12
figure 12

The number of vehicles

Furthermore, we validate the effectiveness of the privacy protection in terms of average connectivity \( \overline{\varDelta} \) and the probability of location leakage, as shown in Fig. 13 and Fig. 14, respectively.

Fig. 13
figure 13

The average connectivity

Fig. 14
figure 14

The probability of location leakage

As shown in Fig. 13, compared with centralized architecture, the average connectivity \( \overline{\varDelta} \) is about 0.9 in our proposed architecture, which is only a little smaller than in centralized architecture. Compared with distributed architecture, the average connectivity in our proposed architecture is greater than distributed architecture. \( \overline{\varDelta} \) = 1 means that the corresponding graph is complete graph and is not conducive to identity privacy protection. When \( \overline{\varDelta} \) is not equal to 1, the bigger the average connectivity is, the better privacy protection is. Thus, the average connectivity \( \overline{\varDelta}\approx 0.9 \) presents that our proposed architecture is superior to that of the centralized architecture and distributed architecture.

As shown in Fig. 14, for each graph, the probability of location leakage is 0.125 using centralized architecture and 0.075–0.0125 using distributed architecture. Whereas our proposed architecture can reduce the probability to about 0.03%. Therefore, the ability of location privacy protection has been greatly improved using our proposed architecture.

7 Conclusion

This paper has studied the decentralized architecture using blockchain technology in VANET and designs a system model of blockchain-based VANET including blockchain set-up, registration of vehicles, SBMs upload, and blockchain record. Using the blockchain-based VANET can effectively address the problems of centralization and distrust among the entities in VANET. There is no third central entity in our proposed system model. The hash of SBMs is recorded in blockchain, which can ensure the integrity of SBMs and increase data processing time. Then to protect vehicular identity privacy in blockchain-based VANET, the identity is divided to more than k sub-identities, which will be periodically updated adopting the way of dynamic threshold encryption. Moreover, this paper designs two indicators (connectivity and average distance) to measure the effectiveness of k-anonymity unity for location privacy preserving. The experimental results demonstrated that our blockchain-based VANET had high efficiency in system time and privacy protection.