1 Introduction

In recent years, Blockchain technology, the core and basic framework of Bitcoin, has rapidly gained popularity and more attention. Bitcoin is the first successful decentralized global digital cash system, and Blockchain is essentially decentralized P2P computing mode. Traditional transaction usually relies on third-party to solve trust problem and to facilitate the trades. However Blockchain establishes and maintains a global ledger that stores records for every transaction. Blockchain allows the trades mutually unbeknownst to transact securely without a centrally trusted intermediary [1,2,3,4,5]. On Blockchain, transaction records, called block data, are distributed to all nodes to validate and confirm; in other words, Blockchain requires computational resources, network resources and storage resources to replace the third-party in order to solve trust problem between trading partners [6,7,8]. At present, transactions on Blockchain can be validated and confirmed once every 10 min, and Blockchain can process only 8 transactions per second, so Blockchain is unsuitable for the high-frequency trades in business application [9]. The low ability of the business processing is a major hindrance to the broad adoption of Blockchain. How to organize all nodes to finish validation quickly involves Blockchain-based communication algorithms.

For Blockchain-based communication algorithms, there are two main issues to be considered. Firstly, a reasonable data transmission structure and an efficient communication mechanism are essential to Blockchain network performance. Secondly, a node may stop working either because of some internal reasons or some environmental issues like external physical damage. In order to deal with real chronic conditions, there must be some self-detection or self-configuration mechanism for the nodes. Therefore, proper data transmission scheme can help reduce the delay and increase the efficiency while considering the node failure of Blockchain network.

Node failure is a common phenomenon on Blockchain. Nodes may leave voluntarily or be failure randomly because of nodes’ selfishness and nodes’ mobility. Node failure partitions the network which may lead to an unstable network and degrade the communication performance. Therefore, this paper proposes a data transmission scheme considering node failure for Blockchain. The main contributions of this paper are presented as follows:

  1. 1.

    The multi-link concurrent communication tree model is established. Due to the heterogeneity of nodes, nodes are different in transmitting ability, so each node has multi-connections with other nodes for forwarding at the same time, and this model belongs to the multi-link communication which can make maximum use of nodes’ potential communication capacity. It is obvious that the efficiency of the multi-link communication is superior to that of the single-link communication (used in star topology or in linear topology). Due to the introduction of concurrent communication mechanism, communication not only depends on the source node, and other powerful nodes are assigned to undertake forwarding tasks. Several transmission tasks are forwarded at the same time, so this model has high concurrency. In addition, due to the adoption of tree structure, the failure of individual nodes only affects their sub-trees, and does not destroy the whole communication. So this model has good malfunction isolation.

  2. 2.

    Based on this model, the data transmission algorithm considering node failure for Blockchain is proposed. In order to avoid the event that nodes are failed during transmitting, the proposed algorithm requires that the source node must receive the confirmation message from the participating nodes before transmitting block data. This belongs to prevention strategy in advance. Then this algorithm exploits the idea of greedy to construct a communication tree. Powerful nodes are selected to locate in the top or the upper of communication tree, and weak nodes are located in the lower of communication tree even located as leaf nodes.

  3. 3.

    Theoretical proof and comprehensive simulation is done to evaluate the proposed algorithm. Though theorem proof, it is found the relationship between the number of failure nodes and the concurrent communication time, and when one node is failed the number of its cascaded failure nodes can be calculated under equal-link communication tree. The first simulation experiment is performance parameters comparison under different percentage of failure nodes. The results show that even if 15% of nodes are failed, better communication performance also can be obtained. However, the failure nodes account for 30%, the concurrent communication time and the average end-to-end delay increase obviously. The second simulation experiment is performance analysis under failure nodes with different transmitting capacity. It is concluded from simulation results that the failure nodes, which are poor in transmitting even have no ability, do not affect the communication performance. But nodes with strong transmitting ability are failed, the communication performance degrades greatly.

The rest of this paper is organized as follows. Section 2 discusses the related work. Section 3 establishes the multi-link concurrent communication tree model for Blockchain. Based on the model, data transmission algorithm considering node failure for Blockchain is described in Sect. 4. Section 5 gives theorems and simulation. Finally, conclusions and directions for future work are presented in Sect. 6.

2 Related Work

Blockchain is originated in 2008 from the paper “Bitcoin: A Peer-to-Peer Electronic Cash System” by a Japanese scholar under the pseudonym of Satoshi Nakamoto [9]. It is reported in 2016 by Mike Hearn, a major Bitcoin developer, that the Bitcoin currency is a failed project. But Blockchain as the underlying technology of Bitcoin has attracted more attention in recent years. Blockchain is an innovative application mode of Internet, which integrates with distributed data storage, P2P, encryption algorithm, consensus mechanism and other computer technology, etc. Blockchain has wide application prospects in finance, security investment, global payment, trade items, etc. [10]. Compared with vigorous development of Blockchain application, the basic research of Blockchain technology has lagged. By retrieving literatures, literatures are few on Blockchain, so Blockchain is a new technology and still in its infancy. There are many factors, such as storage capacity, processing power, and transmitting efficiency, blocking the development of Blockchain. This paper focuses on transmitting efficiency problem.

In Bitcoin, as the first typical application of Blockchain, block data generated by one node is distributed to other nodes using Flooding broadcast algorithms. Although flooding broadcast algorithms are widely used to finish routing function and are easy to implement, due to high-frequency trading of Blockchain, flooding may cause serious information redundancy, bring broadcast storms and result in network congestion [11, 12].

It is generally known Blockchain is essentially a P2P network in the view of the network structure. But it is different from P2P network and owns its new characteristics. Node mobility and high-frequency transactions make communication environment significantly more complex and harsh [13]. Therefore proper topology structure and communication mechanism are needed to schedule the routing for transaction validation.

Tree topology structure is often chosen to improve and optimize the communication efficiency [14, 15]. Paper [16] proposed a concurrent communication tree algorithm. With \(N\) nodes, the algorithm shrinks the communication time to \(O(\ln (N))\). Compared with the line and star structure, tree topology obtains better performance in communication efficiency. Using this algorithm, it is simple to construct a tree for communication. However, it is assumed that all the nodes are equally capable to perform the network transmitting task which is not true in the real environment. Paper [17] proposed a P2P communication tree algorithm based on multi-link. It introduces the concept of node communication link number, and assigns proper communication task to nodes according to their transmitting capacity in order to shorten the concurrent communication time. This algorithm considers the communication capacity of nodes, so the algorithm in Paper [16] is a kind of special case of this algorithm. Paper [18] proposed a IFT algorithm which integrates communication factors (such as node communication link number, node trust degree) and improves both the efficiency and the reliability of Blockchain communication. Paper [19] designed a stable clustering algorithm and a corresponding clustering routing protocol for large-scale mobile Ad Hoc networks. The proposed routing protocol uses proactive protocol between nodes within individual clusters and reactive protocol between clusters. The proposed clustering algorithm improves the stability by decreasing the clusters’ number and nodes’ switch times between clusters.

However, these algorithms mentioned above ignore the fact of node failure. On Blockchain, a node may stop working either because of some internal reasons (the running out of battery power, hardware malfunction, etc.) or some environmental issues (network congestion, the failure of link connections, physical external damage, etc.).

Paper [20] introduced the background and the causes of node failure of P2P network and then reviewed the main progresses of node failure prediction, node data recovery, super-node selection and routing node algorithm etc. Paper [21] proposed a localized hybrid timer based cut-vertex node failure recovery approach to handle partitions and restore connectivity with the use of multiple backup nodes for the guaranteed partitioned recovery.

The existing literatures proposed different strategies to solve node failure problem for a specific network environment (wireless sensor network [22,23,24], virtual network [25, 26], internet of things [27]). The existing algorithms or strategies mainly focused on routing recovery after node failure. And they may ignore the following (1) the relationship between the node failure ratio and the decline of communication efficiency; (2) the impact of communication efficiency under failure nodes with different transmission capacity; In addition, There is no considerable work have been seen dealing on optimized routing scheme with the transmitting efficiency problem with respect to Blockchain while considering the node failure as well. This is where our work comes into play and that is the main difference from previous work. Our work aims to provide a data transmission scheme considering node failure, which has high concurrency and high efficiency. It also can maintain good performance in communication even if there are 15% failure nodes. The proposed scheme gives a routing plan for Blockchain validation.

3 Multi-link Concurrent Communication Tree Model

3.1 Nodes’ Behavior Analysis on Blockchain

Nowadays, people can produce and consume information anytime and anywhere by mobile wireless networks, and more customers choose mobile payment technology to finish their transactions. Payment by mobile phones and PDA terminals, would be more convenient compared with the traditional means of payment. Since a large amount of mobile terminals rush into Blockchain and participate in transaction validation. Blockchain may face new problems and new challenges. On Blockchain, nodes’ behavior characteristics are as follows.

  1. 1.

    The dynamicity and mobility of nodes. More people resort to mobile terminals by wireless mode to transfer payment and participate in transaction validation. But on Blockchain mobile terminals move and change their physical location, even leave the Blockchain network. The mobility of nodes may result in the failure of sub-tree or the failure of route. Blockchain is a self-organized P2P network. We need to establish logical connections between nodes for transaction validation. The logical connections between nodes are based on the physical topology. So there exists a certain correlation between logical connections and actual physical topology. Mobile nodes frequently moving may lead to the topology mismatch problem. The established connections may be unavailable or even failure.

  2. 2.

    The selfishness of nodes. Because Blockchain adopts P2P network, Blockchain also advocates nodes’ collaboration partnership and sharing spirit. But nodes are naturally and essentially self-centered, and they act selfishly to maximize their own interest. Nodes ask for information or resource and don’t undertake the transaction validation and other communication tasks. Once they obtain information or resource, they abort or leave Blockchain. In the event that such nodes are selected to act as intermediaries between the source node to the destination node, their selfishness and unreliability can increase the rate of transmission failure.

  3. 3.

    The malfunction of nodes. If some failures (hardware malfunction, out-of-energy, auto power-off, and so on) occur, nodes can not repair themselves and require human intervention. They can not recover quickly, so these nodes become unavailable permanently for the whole transaction validation.

Through the analysis of nodes’ behavior characteristics, it is known that the node failure is unavoidable and unexpected on Blockchain. No matter what reason nodes are failed, this could result in the poor performance of the network, even sometimes the paralysis of the whole network. On Blockchain, different data transmission schemes are used to forward the block data. These transmission schemes must consider the node failure cases and how to handle them. This paper proposes a data transmission algorithm with node failure for Blockchain.

3.2 Node Classification and Transmitting Role

Nodes are classified into three types by the role in transmission task. They are source node, forwarding nodes and non-forwarding nodes respectively. Node classification and transmitting role is shown in Table 1. In the case of the forwarding nodes’ failure, it may result in the failure of sub-trees, and affect the efficiency of transaction validation. Therefore, this paper just needs to consider the failure of the forwarding nodes.

Table 1 Node classification and transmitting role

3.3 Establishing Model

The tree topology has been widely used in P2P networks and performs the feasibility and effectiveness in data forwarding, information sharing and resource searching [14,15,16,17,18]. The tree topology has good malfunction isolation and good pollution isolation. An acyclic tree topology can reduce redundant retransmission and network overhead. In order to finish the transaction validation of Blockchain, according to the physical network topology, a tree rooted from a communication source node is established by all other nodes.

Figure 1a shows physical topology of Blockchain. Blockchain is a distributed P2P network which is composed of nodes, called miners in Bitcoin system. On Blockchain, nodes can be regarded as mobile terminals, PCs, servers or a miner pool which is made up of any number of nodes. To be one node, it requires that the node must have computing power regardless for its physical hardware configuration. All nodes of Blockchain are distributed in the broad range of geographical area, and they are meshed together physically by wired or wireless connections.

Fig. 1
figure 1

Multi-link concurrent communication tree model. a  the physical topology of Blockchain, b the multi-link concurrent communication tree

Figure 1b shows multi-link concurrent communication tree. All nodes are abstracted logically into tree topology structure to finish transmission. This topology is a self-configuring network of nodes. All nodes cooperatively maintain network connectivity without the aid of any fixed infrastructure units such as routers, switchers, or access points. Each node must have routing function and forwards block data via intermediate nodes.

In Fig. 1a the lines between nodes belong to physical connections for actual network and in Fig. 1b the directed lines between nodes belong to logical connections for finishing an assigned communication task. The number on line stands for the sequential order of transmitting.

How to construct a communication tree depends on algorithms. This is where we focus on. The principle of establishing communication tree is to reach load balance and to shorten communication delay. For the communication tree with \(N\) nodes, a good communication tree established based on algorithm is relatively low height, stout, and with spreading branches. That is, the tree has higher concurrency and shorter communication delay. To achieve this aim, make sure that nodes located in the top or the upper of the communication tree have powerful transmission capacity. That is, nodes closer to the source node are expected to have powerful transmitting ability. Because of concurrent communication mechanism, these nodes can undertake more communication tasks in follow-up communication.

On Blockchain all nodes are different in transmitting ability. We continue to use the communication link number for nodes in order to represent the heterogeneity of nodes’ communication capacity on Blockchain-based communication [18]. So the proposed model belongs to multi-link.

In the multi-link concurrent communication tree model, the following assumptions are taken without losing generality:

  1. 1.

    Communication task (except for the routing information) between any two nodes are the same. For a transaction validation of Blockchain, the source node \(v_{0}\) must distribute the block data to all other nodes for validation, so all nodes receive the same block data.

  2. 2.

    Any node \(v_{i}\) receives the block data and quickly forwards it to other nodes without delay. So the proposed model is non-intermittent.

  3. 3.

    Several communication tasks are executed and transmitted at the same time, so the model is concurrent.

4 Data Transmission Scheme Considering Node Failure

This paper proposes a data transmission algorithm considering node failure, which consists of two steps: detecting failure nodes and establishing communication tree. Before establishing the communication tree, detecting failure nodes in advance belongs to preventive measures to avoid nodes’ failure during transmitting. The detecting method is that the source node sends a confirmation message to all nodes of Blockchain before establishing the communication tree. Blockchain sets a response threshold denoted by \(T_{\text{threshold}}\). If the response time which the source node \(v_{0}\) spends to receive the response message from nodes is longer than the given threshold, and nodes become non-responsive because of the failures or faults, the source node \(v_{0}\) regards these nodes as the failure nodes. \(T_{\text{threshold}}\) is pre-fixed for a specific Blockchain transaction. In private network, the \(T_{\text{threshold}}\) can be ignored [16].

Before establishing the tree, detecting all the participating nodes and identifying the failure nodes might be time-consuming. But it is necessary to do that, because it reduces the risk of nodes’ failure in the process of transaction validation, and improves the reliability of transmitting.

After these detecting steps, we can get two node sets, which are the failure node set \(V_{F}\) and the unfailure node set \(V_{U}\). The nodes in \(V_{U}\) can undertake transmitting tasks, so they are suggested to locate in the top or the upper of communication tree. The nodes in \(V_{F}\) should not be assigned tasks, so they are located as leaf nodes. In order to maximize the potential communication capacity of nodes and to shorten the validation time of Blockchain transaction, greedy mechanism is adopted to construct communication tree. The selection order of nodes to be added to communication tree is calculated based upon their capability in transmitting. So the powerful nodes are located in the upper level of communication tree, and they can undertake more communication tasks in follow-up communication.

Based on the actual communication information, such as the geographical location information of nodes, the connections between nodes, and the transmitting ability of nodes, Blockchain network is abstracted as graph \(G = (V,E,L)\) with node set \(V\), edge set \(E\), communication link number set \(L\). \(V = \{ v_{0} ,v_{1} , \ldots ,v_{i} , \ldots ,v_{n} \}\),\(V_{F} \cup V_{U} = V\), \(E = {\text{\{ }} \langle v_{0} ,v_{1} \rangle , { } \langle v_{0} ,v_{2} \rangle , \ldots , \langle v_{i} ,v_{j} \rangle , \ldots , \langle v_{n} ,v_{n - 1} \rangle {\text{\} }}\), An edge \(\langle v_{i} ,v_{j} \rangle\) represents there exists the logical connection between \(v_{i}\) and \(v_{j}\). \(\langle v_{i} ,v_{j} \rangle \in E\). \(L = \{ l(v_{0} ),l(v_{1} ), \ldots ,l(v_{i} ), \ldots ,l(v_{n} )\}\), where \(l(v_{i} )\) represents the maximum communication link number for \(v_{i}\). Set \(V_{T}\) and \(E_{T}\) represent node set \(V\) and edge set \(E\) for communication tree. The complement set of \(V_{T}\),\(E_{T}\), \(L_{T}\) is \(\overline{{V_{T} }}\), \(\overline{{E_{T} }}\), \(\overline{{L_{T} }}\) respectively. That is \(\overline{{V_{T} }} \cup V_{T} = V\), \(\overline{{E_{T} }} \cup E_{T} = L\), \(\overline{{L_{T} }} \cup L_{T} = L\).\(f(t)\) is the concurrent communication time [18]. The algorithm of constructing the communication tree is described below.

  • Step 1 Sort the set \(V_{U}\) by \(l(v_{i} )\) from largest to smallest, and obtain an ordered set \(V_{U} = {\text{\{ }}v_{1} , \dots ,v_{i} , \dots ,v_{j} , \dots ,v_{n} {\text{\} }}\), \(\forall v_{i} ,\forall v_{j} \in V_{U}\) and \(l(v_{i} ) > = l(v_{j} )\).

  • Step 2 The source node is \(v_{0}\), and the ordered set \(V_{U} = {\text{\{ }}v_{1} , \ldots ,v_{i} , \ldots ,v_{j} , \ldots ,v_{k} {\text{\} }}\).

    Let \(V_{T} = \{ v_{0} |v_{0} \in V\}\),\(E_{T} = {\text{\O }}\), \(L_{T} = \{ l(v_{0} )|l(v_{0} ) \in L\}\). So \(\overline{{V_{T} }} = V_{U} \cup V_{F}\), \(\overline{{L_{T} }} = \{ l(v_{1} ), \dots ,l(v_{i} ), \dots ,l(v_{n} )\}\),

    Let \(V_{Tmp} \Leftarrow V_{T}\),\(f(t) = 0\).

  • Step 3 k = 0.

  • Step 4 Traverse node \(v_{i}\) in \(V_{Tmp}\),\(\forall v_{i} \in V_{Tmp}\) and \(\forall v_{i} \notin V_{F}\) select \(m\) nodes \(\{ v_{1}^{i} , \dots ,v_{k}^{i} , \dots ,v_{m}^{i} \}\) from the ordered set \(\overline{{V_{T} }}\) in turn, where \(m = Min\{ Num(\overline{{V_{T} }} ),l(v_{i} )\}\),\(v_{k}^{i} \in \overline{{V_{T} }}\),\(v_{i}\) transmits data to \(v_{k}^{i}\), so \(v_{i}\) is the parent node of \(v_{k}^{i}\). \(Num(X)\) is the number of elements of set \(X\).

    Let \(V_{T} \Leftarrow V_{T} \cup \{ v_{1}^{i} , \dots ,v_{k}^{i} , \dots ,v_{m}^{i} \}\), \(E_{T} \Leftarrow E_{T} \cup \{ \langle v_{i} ,v_{1}^{i} \rangle , \dots , \langle v_{i} ,v_{k}^{i} \rangle , \dots , \langle v_{i} ,v_{m}^{i} \rangle \}\), \(L_{T} \Leftarrow L_{T} \cup \{ l(v_{1}^{i} ), \dots ,l(v_{k}^{i} ), \dots ,l(v_{m}^{i} )\}\).

  • Step 5 If \(k < Num(V_{Tmp} )\) and \(Num(V_{T} ) < N\), k++, go to Step 4.

  • Step 6 If \(Num(V_{T} ) < N\),\(V_{Tmp} \Leftarrow V_{T}\), \(f(t) = f(t) + 1\). go to Step3, otherwise, algorithm ends.

The communication tree constructed based on the proposed algorithm has some characteristics. The upper the node \(v_{i}\) is located in, the larger the \(l(v_{i} )\) is. So the leaf nodes are poor in transmitting or have no ability to forward data. In addition, due to multi-link concurrent communication tree model, the communication tree is stout and has more spreading branches, which means diverse routes for transmitting. It can realize load balancing or fault tolerance.

5 Theorems and Simulation

5.1 Theorems for Communication Tree

Theorem 1

For a single-link communication tree with \(N\) nodes, if the number of failure nodes denoted by \(N_{Fail}\) satisfies \(0 \le N_{Fail} < N - 2^{(f(t) - 1)}\), \(f_{Fail} (t) = f(t)\), where \(f_{Fail} (t)\) indicates the concurrent communication time with considering node failure and \(f(t)\) indicates the concurrent communication time without considering node failure.

Proof

According to the concurrent communication mechanism, the number of nodes in communication tree grows exponentially with time. For the communication tree of \(N\) nodes without considering node failure, \(\, f(t) = \left\lceil {\log_{2} N} \right\rceil\), where \(\left\lceil X \right\rceil\) indicates the smallest integer which is bigger than or equal to \(X\). After the (\(f(t)\) − 1)th time compartment, there are \(2^{(f(t) - 1)}\) nodes in communication tree. In the \(f(t)\)th time compartment, \(N - 2^{(f(t) - 1)}\) nodes are added to the communication tree as leaf nodes. So If the number of failure nodes is no more than \(N - 2^{(f(t) - 1)}\), the concurrent communication time \(f_{Fail} (t)\) is equal to \(f(t)\).

When the number of failure nodes is equal to \(N - 2^{(f(t) - 1)}\), the concurrent communication time \(\, f(t) = f(t) - 1\). We generalize Theorem 1 in the case of multi-link condition and obtain Theorem 2.

Theorem 2

For a multi-link communication tree with \(N\) nodes, the communication link number of any node \(v_{i}\) keeps at \(m\), which means \(l(v_{i} ) \equiv m\). There exists a natural number \(k\), \(1 \le k \le f(t)\). If \(N - (m + 1)^{(f(t) - k + 1)} \le N_{Fail} < N - (m + 1)^{(f(t) - k)}\) and \(0 \le N_{Fail} \le N\), then \(f_{Fail} (t) = f(t) - k + 1\).

Proof

Proving of Theorem 2 is the same as that of Theorem 1, so it can be omitted.

Applying Theorem 2 above, Let \(m = 1\) and \(k = 1\), and Theorem 1 can be obtained. Therefore Theorem 1 is regarded as a special case of Theorem 2. Theorem 2 shows the relationship between the number of failure nodes and the concurrent communication time.

Theorem 3

For a single-link communication tree with \(N\) nodes, if the failure node \(v_{i}\) is located at the depth \(Level(v_{i} )\) of the communication tree, the number of all unavailable nodes in its following sub-tree is denoted by \(N_{Fail} (v_{i} )\) , which satisfies \(2^{{(\left\lceil {\log_{2} N} \right\rceil - Level(v_{i} ) - 1)}} \le N_{Fail} (v_{i} ) \le 2^{{(\left\lceil {\log_{2} N} \right\rceil - Level(v_{i} ))}}\) , where \(Level(v_{i} )\) is the depth of node \(v_{i}\) in communication tree.

Proof

The concurrent communication time \(f(t)\) is \(\left\lceil {\log_{2} N} \right\rceil\) for a single-link communication tree. The depth \(Level(v_{i} )\) means that node \(v_{i}\) has received data after \(Level(v_{i} )\) time compartments. In the next \((\left\lceil {\log_{2} N} \right\rceil - Level(v_{i} ) - 1)\) time compartment, the node \(v_{i}\) is fully connected with other nodes and the communication number of node \(v_{i}\) reaches the maximum value \(l(v_{i} )\). But in the last \((\left\lceil {\log_{2} N} \right\rceil - Level(v_{i} ))\) time compartment, the node \(v_{i}\) may be not needed to transmit to others. So according to the concurrent communication mechanism, if the node \(v_{i}\) is failed, \(2^{{(\left\lceil {\log_{2} N} \right\rceil - Level(v_{i} ) - 1)}} \le N_{Fail} (v_{i} ) \le 2^{{(\left\lceil {\log_{2} N} \right\rceil - Level(v_{i} ))}}\).

We expand Theorem 3 to more general case of multi-link condition and obtain Theorem 4.

Theorem 4

For a multi-link communication tree with \(N\) nodes, the communication link number of any node \(v_{i}\) keeps at \(m\), which means \(l(v_{i} ) \equiv m\). If the failure node \(v_{i}\) is located at the depth \(Level(v_{i} )\) of the communication tree, the number of all unavailable nodes in its following sub-tree is denoted by \(N_{Fail} (v_{i} )\), which satisfies \((m + 1)^{{(\left\lceil {\log_{(m + 1)} N} \right\rceil - Level(v_{i} ) - 1)}} \le N_{Fail} (v_{i} ) \le (m + 1)^{{(\left\lceil {\log_{(m + 1)} N} \right\rceil - Level(v_{i} ))}}\).

Proof

Proving of Theorem 4 is the same as that of Theorem 3, so it can be omitted.

According to Theorem 4, the node \(v_{i}\) is leaf node, and its depth \(Level(v_{i} )\) is equal to \(\left\lceil {\log_{(m + 1)} N} \right\rceil\). So if leaf nodes are failed, they do not affect other nodes. The node \(v_{0}\) is the source node and \(Level(v_{0} )\) is 0. So if the node \(v_{0}\) is failed, it has an impact on the whole communication tree. Through the above analysis, it is known that the deeper of depth the node \(v_{i}\) is located at, the less impact on the whole tree.

According the Theorems 3 and 4, if one node is failed, we can calculate the number of its cascaded failure nodes in the case of \(l(v_{i} ) \equiv m\).

We obtain these Theorems in the case of \(l(v_{i} ) \equiv m\) which has the certain specificity. However, if \(l(v_{i} )\) is different, we only depend on simulations to evaluate performance of the proposed algorithm. Next is simulation experiment. The proposed algorithm is implemented and validated in C# on Microsoft Visual Studio 2015.

5.2 Performance Parameters Comparison under Different Percentage of Failure Nodes

On Blockchain, due to the mobility, selfishness and malfunction of nodes, nodes may be failure. Node failure affects the efficiency of communication. Here is a problem—what percentage of failure nodes on Blockchain network does not affect communication efficiency. Below we analyze communication performance parameters with diffierent percentage of failure nodes on Blockchain-based communication.

Figures 2, 3, 4, 5, and 6 show the simulation results of performance parameters under different percentage of failure nodes. The four different colour lines represent the result under different percentage of failure nodes. The X axis signs node number for 5, 10, 30, 50, 80, 100, 150, 200, 250, 300, 500 and 1000.

Fig. 2
figure 2

Concurrent communication time \(f(t)\)

Fig. 3
figure 3

Average end to end delay \(f_{\text{AED}} (t)\)

Fig. 4
figure 4

Average link stress \(A{}_{ls}\)

Fig. 5
figure 5

Concurrent degree \(C_{d}\)

Fig. 6
figure 6

Node utilization rate \(U_{r}\)

In actual communication, the few nodes have powerful capacity, and most of nodes are poor in transmitting, even have no ability to forward. Therefore, nodes whoes communication link number value is 3, 2, 1, 0, are randomly selected in proportion of 5, 10, 45 and 40% respectively. In addition, Failure nodes are randomly selected in certain proportion 0, 5, 15, 30 of \(N\) respectively. Failure nodes whoes \(l(v_{i} )\) is 3, 2, 1, 0 are randomly selected in proportion of 20, 30, 50 and 0% respectively.

Figure 2 shows the concurrent communication time \(f(t)\) under different percentage of failure nodes. The concurrent communication time, denoted by \(f(t)\), is the most important parameter to measure communication efficiency. \(f(t)\) is the elapsed time consumed finishing the whole transmission. In Fig. 2, \(f(t)\) doesn’t increase significantly along with the increasing of nodes \(N\), and it is relatively flat and has a small increase. The reason is that the proposed data transmission scheme follows the multi-link concurrent communication tree model which means that every node makes its best effort to forward and all nodes who have received the data must forward data at the same time. In simulation, we only select 5% of \(N\) as the powerful nodes whose \(l(v_{i} )\) is 3. But only 5% powerful nodes can perform their amazing abilities to forward the block data to other nodes, and \(f(t)\) keeps a slight growth compared with the increasing of nodes \(N\). Because in the communication tree established by the proposed data transmission scheme, powerful nodes are selected to locate in the top or the upper of the communication tree, so the concurrency and the efficiency is high. In addition, in the case that there are no failure nodes, that is, failure nodes account for 0% of \(N\), this is the ideal situation and \(f(t)\) can reach optimal value, as described in paper [17]. From Fig. 2, it is clear that \(f(t)\) under 5% failure nodes is equal to the optimal value, and \(f(t)\) under 15% failure nodes is only slightly above the optimal value. This reflects it does not matter to \(f(t)\) if 15% of \(N\) are failed in communication tree. However, if the failure nodes reach 30%, \(f(t)\) has a dramatic increase and it is prolonged by about 50% compared with the optimal value.

Figure 3 shows the average end to end delay \(f_{\text{AED}} (t)\) under different percentage of failure nodes. The average end to end delay, denoted by \(f_{\text{AED}} (t)\), reflects the average delay spent by all nodes. If the average end to end delay \(f_{\text{AED}} (t)\) is lower, and the performance of the proposed data transmission scheme is better. From Fig. 3, the \(f_{\text{AED}} (t)\) has an increase with the increasing of nodes \(N\) for the whole. The fact that two lines under 0 and 5% failure nodes almost appear as one reflects the \(f_{\text{AED}} (t)\) does not increase in the case of 5% failure nodes. If 15% of nodes are failed, the \(f_{\text{AED}} (t)\) has a smaller increase. Yet the \(f_{\text{AED}} (t)\) under 30% failure nodes is significantly higher than that under other percentage of failure nodes. It means that if failure nodes account for 30%, the \(f_{\text{AED}} (t)\) is too long for the whole communication.

Figure 4 shows the average link stress \(A{}_{ls}\) under different percentage of failure nodes. The average link stress, denoted by \(A{}_{ls}\), stands for the average route traffic load. To finish the transaction validation, the amount of communication tasks must be constant, so the more the routes, the less the average link stress. From Fig. 4, in the case of the same number of nodes, an increase in the node failure ratio leads to an increase in the \(A{}_{ls}\). The \(A{}_{ls}\) is the highest under 30% failure nodes, and the routes are busiest if 30% of nodes on Blockchain are failed. In addition, the \(A{}_{ls}\) increases smoothly along with the increasing of nodes \(N\).

Figure 5 shows the concurrent degree \(C_{d}\) under diffierent percentage of failure nodes. The concurrent degree, denoted by \(C_{d}\), is an important performance parameter to measure concurrency. From Fig. 5, the \(C_{d}\) increases sharply with the increasing of nodes \(N\). The more the nodes, the higher the concurrency. For the fixed number of nodes, the \(C_{d}\) under 30% failure nodes is smaller than that under other percentage failure nodes. The reason is that there is not enough nodes to participate in transmitting.

Figure 6 shows the node utilization rate \(U_{r}\) under different percentage of failure nodes. The node utilization rate, denoted by \(U_{r}\), means how many nodes to be selected to undertake communication forwarding tasks. In Fig. 6, the \(U_{r}\) is between 20 and 45%, and varies in an irregular way with increasing of nodes \(N\). In the case of 30% nodes failure, the \(U_{r}\) changes smoothly along with the increasing of nodes \(N\). The reason is that failure nodes account for 30% and non-forwarding nodes may account for 40%, so there is no other nodes to be selected and the \(U_{r}\) is around 30%.

5.3 Performance Analysis under Failure Nodes with Different Transmitting Capacity

Fixed terminals (desktop computer, all-in-one machine, server, workstation, etc.) and mobile terminals (mobile phone, iPad, notebook, etc.) are regarded as nodes on Blockchain. Because of different nodes equipped with different hardware configuration, the nodes on Blockchain are heterogeneous in forwarding capacity. The failure and the invalidation of different nodes have different impact on the Blockchain validation. If powerful nodes are failed, that degrades communication performance, even aborts the communication. When weak nodes are failed, that may have a small impact on Blockchain performance. Next, we study how different nodes impose their failure influence on communication performance.

Figure 7 shows the average end to end delay \(f_{\text{AED}} (t)\) under failure nodes with different transmitting capacity. We assume that the proportion of failure nodes is 10%. We randomly select weak nodes (\(l(v_{i} ) \equiv 1\)), moderate nodes (\(l(v_{i} ) \equiv 2\)), powerful nodes (\(l(v_{i} ) \equiv 3\)) to get them failed respectively. Why don’t we select nodes whose \(l(v_{i} )\) is zero as the failure nodes? Because these nodes located as the leaf nodes in communication tree, don’t undertake forwarding tasks, even if they are failed, they only affect their own and don’t impact the overall communication performance.

Fig. 7
figure 7

\(f_{\text{AED}} (t)\) under failure nodes with different transmitting capacity

It is well known that there are no failure nodes on Blockchain, the communication performance is optimal and it is an ideal state. From Fig. 7, when all failure nodes whose \(l(v_{i} )\) is 1 are weak nodes, the \(f_{\text{AED}} (t)\) is close to or reach the optimal performance. When all failure nodes’ \(l(v_{i} )\) is 2, the \(f_{\text{AED}} (t)\) has a much smaller increase than the optimal performance, and the increasing proportion of average end to end delay \(f_{\text{AED}} (t)\) is unstable, ranging from 4 to 31%. While powerful nodes whose \(l(v_{i} )\) is 3 are not available, the \(f_{\text{AED}} (t)\) has a significant increase, and it is prolonged by about 30%. So whether powerful nodes are failed or not, they are vital to the overall communication performance.

6 Conclusions

Based on the multi-link concurrent communication tree model, a data transmission algorithm considering node failure is proposed for giving the routing scheme on Blockchain. The proposed algorithm can improve the communication efficiency through maximizing the nodes’ transmitting capacity. The performance of the proposed algorithm has been validated through theoretical proof and simulation experiments. The results show that the proposed algorithm still works effectively even though the failed nodes reach 15%. When the failed nodes account for 30%, the concurrent communication time, the average end to end delay and the average link stress have increased greatly. In addition, the failure of weak nodes has a relatively small effect on communication performance, and the failure of powerful nodes leads to poorer performance. In the actual communication, as long as individual powerful nodes must be safeguarded and well maintained and they are not failed, the performance of Blockchain-based communication will not degrade greatly. In future, how to select a backup node and how to replace the failure node is an issue worth of research. Moreover, according to this new approach, we will do a deeper research on the real-world implementations for Blockchain.