1 Introduction

With the development tendency of wireless networks, the research of heterogeneous wireless networks (HWNs) which meet the communication aspiration of the seamless connection has been a promising field [13]. As a revolutionary technology, multipath routing can be used in HWNs to improve the transmission efficiency, which outperforms the single-path routing technology in the performance of the transmission success rate, end-to-end delay, network lifetime, additional overhead and so on [4, 5].

Since HWNs are constituted by heterogeneous nodes, paths in multipath routing of HWNs are charactered by selfish rationality. Each path intends to pursue individual profit, which may cause unreasonable competition for limited wireless bandwidth resources and lead to the unreliable data transport problem. Therefore, transmission optimization under the consideration of limited bandwidth is a challenge issue in multipath routing of HWNs.

Discovering the optimal path is a considerable approach in multipath routing discovery phase [5]. The shortest-path algorithm is no longer the best choice in the multipath routing mechanism, though it serves for the single path routing mechanism for a long time. Other than the shortest hop counts, several new routing metrics have been proposed, such as Expected Transmission Time (ETT) and a series of its extensions, including Expected Transmission Count Time (ETX), Weighted Cumulative ETT (WCETT), Multicast ETX (METX), Expected Packet Advancement (EPA) and Expected One-hop Throughput (EOT) [610]. In all of these metrics, the successful packet delivery ratio is the design basis, which affects the throughput directly. Meanwhile, due to the wireless nature and the dynamic characteristic of HWNs [1], links of HWNs are not such stable or reliable like that of the tradition wired network [11], i.e., the path reliability \(p_i \le 1, i\in N=\{1,2,\ldots ,n\}\). Moreover, the routing mechanism has to repeat the route discovery process once the current path in the transition work loses validity [4]. Therefore, the path reliability should be taken into account for an efficient routing mechanism.

As an important branch of game theory, stochastic differential game theory [12] has been used widely in the communication field [1317] to solve the dynamic optimal problem among multiple players. Players in the communication network can be nodes, links, clusters, etc. Employing the tool of game theory, transmission optimization issue of the multipath routing in HWNs can be transferred to utility maximization with the fairness of traffic distribution strategy guarantee.

This paper devotes to get an optimal traffic distribution strategy for the multipath routing of HWNs from the perspective of game theory. To solve the unreliable data transport problem resulting from unreasonable wireless bandwidth resource competition, a stochastic differential game model is constructed to maximize the transmission utility with the constraint of limited bandwidth resources, in which the path reliability is taken as an important factor.

The rest of this paper is organized as follows. Section 2 reviews the related methods of effective resource utilization. Section 3 details the stochastic differential game model for the multipath routing and obtains its solution. In Sect. 4, related multipath routing algorithms are designed. Section 5 provides the numerical simulation results. Finally, Sect. 6 concludes this paper.

2 Related Works

According to the challenge of the wireless resource scarcity, effective resource utilization can be realized by scheduling mechanism [18], cross-layer optimization [19], tradeoff design [20] and utility maximization [21]. The utility maximization method is a significant resource allocation topic, which can be used to deal with multiple flow control, QoS routing optimization and optimal pricing process in multipath routing. Considering multiple participants (i.e., paths) of multipath routing mechanism and the dynamic feature of HWNs, utility maximization is the most proper method to response to the challenge of the multipath routing in HWNs.

The early research work of utility maximization mainly focused on single-path routing, in which traffic distribution was not taken into consideration. As the increment of the data traffic, especially with the limited channel capacity in the wireless multimedia network [22], an efficient routing algorithm should distribute traffic into multiple paths to lighten the load on the single path with QoS guarantee.

The method proposed in [23] tried to distribute the traffic into multiple routes evenly, but it is unreasonable and not efficient because the reliabilities of routes are not the same generally in multiple paths. This method may lead to the resource waste of high reliable path and the overburden of low reliable path, and result in the low successful packet delivery ratio. In this paper, the traffic distribution strategy is based on the reliability of paths in HWNs.

Grosu et al. [24] formulated a static cooperative game model in distributed systems, which consisted by several heterogeneous computers. And the corresponding algorithm was detailed for solving the load balancing game. The method present in [24] realized fair load balancing scheme efficiently. However, the state of the distributed systems is usually dynamic, and the static premise of [24] lead to its limitation of practical applications. The noncooperative load balancing problem was researched in [25] subsequently, and it exist the same drawback without the consideration of the state variety as time went on. With the inconstant state consideration, this paper employs the stochastic differential game theory to study the dynamic traffic distribution problem in multipath routing scheme.

Xu et al. [14] presented an automatic load balancing scheme with the consideration of the co-channel interference in the game model for LTE networks and concluded that the differential game theory is applicable in wireless networks. Different from the LTE research area of [14], where the channel interference is the key factor in the game model, this paper studies the multipath routing of HWNs where path reliability is crucial for the model construction. Besides, the constraint of this paper considers the limited bandwidth resource in HWNs which was not discussed in [14].

The contributions of this paper are twofold. Firstly, a stochastic differential game model is constructed with the goal of utility maximization for each player in the multipath routing of HWNs. Secondly, the feedback Nash equilibrium solution is obtained and the numerical simulation results reveal the dynamics of traffic distribution scheme in HWNs.

3 Stochastic Differential Game and Solution

3.1 Problem Statement

To facilitate the study, the simple path-disjoint case is discussed in this paper, in which there is no common node or common link in the multipath routing, for choosing the optimal path set is a NP-complete problem [26]. An example of the multipath routing in HWNs is shown in Fig. 1.

Fig. 1
figure 1

An example of the multipath routing mechanism

In Fig. 1, the source node (SN) broadcasts Route REQuest (RREQ) messages in multipath routing discovery phase, and finds \(n\) paths are available from SN to Destination Node (DN) with the reception of Route REPly (RREP) messages. Here, \(n=3\), wireless link \(r_{1,1} ,r_{1,2} ,r_{1,3} ,r_{1,4} \) constitute valid path \(l_1 , r_{2,1} ,r_{2,2} ,r_{2,3} \) constitute path \(l_2 \), and \(r_{3,1} ,r_{3,2} ,r_{3,3} \) constitute path \(l_3 \). Path \(l_3 \) will take more traffic than path \(l_2 \) since \(p_3 >p_2 \), although path \(l_2 \) has the minimal hop counts and is the shortest path from SN to DN. According to [6], the reliability \(p_i \) of path \(l_i \) is related to the successful packet delivery ratio, which can be measured and calculated by using dedicated link probe packets.

In parallel multipath routing mechanism, \(n\) valid paths share the common transmission mission. Because paths constituted by heterogeneous nodes in HWNs are characterized by selfish rationality, each path desires to pursue its own profit by competing limited wireless resources, which leads to the unreliable data transport problem. Thus, multipath routing optimization is a challenge issue in HWNs.

Then, the problem of multipath routing optimization discussed in this paper can be described as following:

After routing discovery phase, SN holds \(n\) paths to DN. The load distribution begins at time \(t_0 \) and ends at time \(T\). At time \(s\), the traffic distributed into path \(l_i \) is \(u_i (s),i\in N=\{1,2,\ldots ,n\}\). Let \(U^{i}\) be the set of admissible traffic load, and \(x(s)\) be the load at time \(s\). Taking the rational considering, we have \(U^{i}\in R^{+}\) for \(x>0\), and \(U^{i}=\{0\}\) for \(x=0\). Then, the multipath routing optimization problem is to find the optimal strategy set \(\{\phi _i^*\left( {t,x} \right) \in \hbox {U}^{i}{;}i\in N\}\) for each path \(l_i \) involving in the transport mission, which guarantees the maximal profit for each path in the game process. To guarantee the fairness of the game, the path reliability \(p_i \) is the key factor to evaluate the allocation.

3.2 Stochastic Differential Game Model

By maximizing the individual profit and providing optimal behavior strategy among multiple game participants, noncooperative stochastic differential game theory can be used to conduct dynamic optimization for resource-limited HWNs.

In the game theory, players and their strategies are basic elements for a game process. Then, the stochastic differential game for multipath routing in HWNs can be described as follows:

  • Players: In the multipath routing, each path constituted by wireless links and heterogeneous nodes which are characterized by selfish rationality is a player of the game. Each player (i.e., path) desires to pursue its own profit by competing limited wireless resources, which leads to the unreliable data transport problem. For example, in Fig. 1, path 1,2,3 are game players.

  • Strategy: In the multipath routing, paths are mutual independent, and the transmission rates of data adopted by paths are strategies. The goal of this paper is to obtain the optimal rate allocation scheme by solving the game model constructed subsequently, with which each path’s can be satisfied by Nash Equilibria.

  • Profit: Paths gain their profit by transporting data packets. However, since the bandwidth resource is limited in HWNs, each path intends to pursue individual profit for the nature of selfish rationality, which may cause unreasonable competition and lead to the unreliable data transport problem. The main contribution of this paper is to obtain the greatest profit for each path (i.e., utility maximization) under the limitation of bandwidth.

Inspired by the contribution of [12] solving the economy problem in the resource extraction, a stochastic differential game model will be present for the multipath routing in this section.

Table 1 Table of symbols

The major mathematical notations introduced in this paper are summarized in Table 1, and the parameters they are notated will be simulated in Sect. 4 to analyze the load balancing performance.

Table 2 The parameter settings in the numerical simulation

Assuming the paths in the multipath routing satisfy the constraint conditions of Shannon channel capacity theory [27], the relationship between bandwidth \(B\) and capacity \(C\) can be described as

$$\begin{aligned} B=\frac{C}{\log _2 \left( {1+S/N} \right) } \end{aligned}$$
(1)

where \(S/N\) is the signal-to-noise ratio of the Gaussian channel.

The traffic distribution cost for path \(l_i \) can be defined as

$$\begin{aligned} P_i (s)=\psi Bu_i (s) \end{aligned}$$
(2)

where \(\psi \) is the cost factor.

Fig. 2
figure 2

The impacts of the path reliability \(p_i \) to \(\phi (t)\)

Fig. 3
figure 3

The impacts of \(\lambda \) to \(\phi (t)\) for each path

Let

$$\begin{aligned} \xi =\frac{\psi }{\log _2 (1+S/N)} \end{aligned}$$
(3)

Then

$$\begin{aligned} P_i (s)=\xi \sum _{k=1}^n {u_k (s)} u_i (s) \end{aligned}$$
(4)

Let \(q_i \) denote the packet loss probability of path \(l_i \), and

$$\begin{aligned} q_i =1-p_i \end{aligned}$$
(5)

Then, the unreliability cost of path \(l_i\) can be defined as

$$\begin{aligned} Q_i (s)=\delta q_i u_i (s) \end{aligned}$$
(6)

where \(\delta \) is the penalty factor.

At time \(T\), the terminal payment can be defined as

$$\begin{aligned} R_i (T)=\mu [x(T)-\bar{{x}}_i ] \end{aligned}$$
(7)

where \(\mu \) is the reward factor, \(\bar{{x}}_i\) is the threshold, and when \(x(T)>\bar{{x}}_i \), path \(l_i\) will be reward for enough packet delivery.

Fig. 4
figure 4

The impacts of \(\mu \) to \(\phi (t)\) for each path

Fig. 5
figure 5

The impacts of \(\xi \) to \(\phi (t)\) for each path

Fig. 6
figure 6

The impacts of \(\delta \) to \(\phi (t)\) for each path

Let \(\lambda \) denote the unit traffic profit for each path, and \(r\) denote the game discount rate. Then path \(l_i \) seeks to maximize the expected payoff

$$\begin{aligned}&E_{\mathrm{t}_0 } \left\{ \int _{t_0 }^T {[\lambda _i u_i (s)-\xi \sum _{k=1}^n {u_k (s)} u_i (s)-\delta q_i u_i (s)]} \exp [-r(t-t_0 )]ds\right. \nonumber \\&\left. \quad + \ \mu \exp [-r(T-t_0 )][x(T)-\bar{{x}}_i ]\right\} , \quad for\quad {i\in N} \end{aligned}$$
(8)

where \(E_{\mathrm{t}_0}\) denotes the expectation operator performed at time \(t_0 \).

Equation (8) subject to the traffic load dynamics

$$\begin{aligned} dx(s)=\{ax(s){+}\sum _{i=1}^n {[p_i u_i (s)} ]\}ds+\sigma x(s)dz(s),\quad {\begin{array}{l} {x(t_0 )=x_0 \in X} \\ \end{array} } \end{aligned}$$
(9)

where \(a\) and \(\sigma \) are traffic load adjustment constants, \(z(s)\) is a Wiener process and the initial state \(x_0 \) is given.

3.3 Feedback Nash Equilibria

According to [12], Theorem 1 for the stochastic differential game (8)–9) can be obtained.

Theorem 1

An \(n\)-tuple of feedback strategies for \(n\)-path traffic distribution \(\{\phi _i^*\left( {t,x} \right) \in \hbox {U}^{i}{;}i\in N\}\) provides a Nash equilibrium solution to the game (8)–9), if there exist suitably smooth functions \(V^{i}:\left[ {t_0 ,T} \right] \times R\rightarrow R,i\in N\), satisfying the semilinear parabolic partial differential equations

$$\begin{aligned}&-V_t^i (t,x)-\frac{1}{2}\sigma ^{2}x^{2}V_{xx}^i (t,x)\nonumber \\&\quad = \ \mathop {\max }\limits _{u_i \in U^{i}} \left\{ {\left[ {\lambda u_i -\xi \left( {\sum _{j=1,j\ne i}^n {\phi _j^*\left( {t,x} \right) } +u_i } \right) u_i -\delta q_i u_i } \right] \exp [-r(t-t_0 )]} \right. \nonumber \\&\qquad \left. +\ {V_x^i \left[ {ax(s){+}p_i \left( {\sum _{j=1,j\ne i}^n {\phi _j^*\left( {t,x} \right) } +u_i } \right) } \right] } \right\} \end{aligned}$$
(10)
$$\begin{aligned}&V^{i}\left( {T,x} \right) =\mu \exp [-r(T-t_0 )][x(T)-\bar{{x}}_i ] \end{aligned}$$
(11)
Fig. 7
figure 7

The impacts of \(T\) to \(\phi (t)\) for each path

Corollary 1

The traffic distribution of the multipath routing realizes Nash equilibrium, if the traffic distributed into path \(l_i \) has the following solution

$$\begin{aligned} \phi _i^*\left( {t,x} \right)&= \left[ {\xi \left( {n+1} \right) } \right] ^{-1}\left\{ \lambda +\delta \left( {\sum _{j=1,j\ne i}^n {q_j } {-}nq_i } \right) \right. \nonumber \\&\left. -\exp \left[ {r\left( {t-t_0 } \right) } \right] \left[ {\sum _{j=1,j\ne i}^n {p_j V_x^j } {-}np_i V_x^i } \right] \right\} \end{aligned}$$
(12)
Fig. 8
figure 8

The impacts of \(a\) and \(r\) to \(\phi (t)\) with different \(T\)

Proof

See “Appendix 1”. \(\square \)

Simplifying (12), and rewriting it, we get

$$\begin{aligned} \phi _i^*\left( {t,x} \right) =f_i +\exp \left[ {r\left( {t-t_0 } \right) } \right] \sum _{i=1}^n {g_i V_x^i } \end{aligned}$$
(13)

where \(f_i\) and \(g_i\) can be calculated by the constants \(\xi , \lambda , \delta , \left\{ {p_1 ,p_2 ,\ldots ,p_n } \right\} \) and \(\left\{ q_1 ,q_2 ,\ldots , q_n \right\} \).

Corollary 2

The game (8)–9) have a solution

$$\begin{aligned} V^{i}(t,x)=\hbox {exp}\left[ {-r\left( {t-t_0 } \right) } \right] \left[ {\Gamma (t)x+\Lambda (t)} \right] \end{aligned}$$
(14)

where \(\Gamma (t)\) and \(\Lambda (t)\) satisfy

$$\begin{aligned} \dot{\Gamma }(t)&= \left( {r+a} \right) \Gamma (t), \Gamma (T)=\mu ,\\ \dot{\Lambda }(t)&= r\Lambda (t)+\Theta (t), \Lambda (T)=-\mu \bar{{x}}_i , \end{aligned}$$

and

$$\begin{aligned} \Theta (t)=\left( {\lambda {-}\delta q_i } \right) \left( {f_i {+}\Gamma (t)\sum _{i=1}^n {g_i } } \right) -\left[ {\xi \left( {f_i {+}\Gamma (t)\sum _{i=1}^n {g_i } } \right) {+}\Gamma (t)} \right] \sum _{i=1}^n {\left( {f_i {+}\Gamma (t)\sum _{i=1}^n {g_i } } \right) } . \end{aligned}$$
(15)

Proof

See “Appendix 2”. \(\square \)

Solving the linear differential equation (15), we get

$$\begin{aligned} \Gamma (t)&= \hbox {exp}\left[ {\left( {r+a} \right) \left( {t-t_0 } \right) } \right] \overline{{\Gamma }},\\ \overline{{\Gamma }}&= \mu \hbox {exp}\left[ {{-}\left( {r+a} \right) \left( {T-t_0 } \right) } \right] ,\\ \Lambda (t)&= \exp \left[ {r\left( {t-t_0 } \right) } \right] \left( {\int _{t_0 }^t {\Theta (s)\cdot \exp \left[ {-r\left( {s-t_0 } \right) } \right] ds+\bar{{\Lambda }}} } \right) , \end{aligned}$$

and

$$\begin{aligned} \bar{{\Lambda }} = -\mu \bar{{x}}_i \hbox {exp}\left[ {-r\left( {T-t_0 } \right) } \right] -\int _{t_0 }^T {\Theta (s)\hbox {exp}\left[ {-r\left( {s-t_0 } \right) } \right] ds} . \end{aligned}$$
(16)

Derivation See “Appendix 3”.

Substituting the optimal load-control strategy (13) into (9) produces, we get

Corollary 3

The game (8)–(9) have an optimal state trajectory

$$\begin{aligned} x^{*}(s)&= \exp \left[ {\int _{t_0 }^t {ads} +\int _{t_0 }^t {\sigma dz(s)} } \right] \nonumber \\&\times \left\{ {x_0 +\sum _{i=1}^n {p_i \left( {f_i {+}\Gamma (t)\sum _{i=1}^n {g_i } } \right) } \exp \left[ {-\left( {\int _{t_0 }^t {ads} +\int _{t_0 }^t {\sigma dz(s)} } \right) } \right] } \right\} \end{aligned}$$
(17)

Proof

See “Appendix 4”. \(\square \)

4 Multipath Routing Algorithm of HWNs

At the routing discovery and establishment phase, SN obtains the quantity of the valid paths to DN with RREQ sending and RREP receiving processes. Probe packets are sent to acquire the reliability information \(p_i \) for each valid path. Then, with the stochastic differential game model we established in Sect. 3, the traffic distributed to path \(l_i \) is \(\phi _i^*\left( {t,x} \right) \).

We design Algorithm 1 for the routing discovery and establishment of the source node, and Algorithm 2 for the traffic distribution scheme based on the feedback Nash equilibrium solution in Sect. 3.

figure c
figure d

5 Simulation Results and Analysis

The numerical simulation is started on MATLAB 7.0.1. Considering 4 valid paths exit in the network from SN to DN, that is, \(i = 1,2,3,4\). The reliability of these paths \(p_i = \left[ {0.3,0.5,0.7,0.9} \right] \).

To check the impact of \(p_i \) to the traffic distribution, the parameters are set as Table 2. And the numerical simulation results are shown as Fig. 2. Here, we assume \(\xi \) has been calculated by Eq. (3). The simulation duration \(t\in [0,4]\), and \(T=3\).

In Fig. 2, the traffic of path-1 with the lowest reliability decreases with the simulation time \(t\), and delivers less packets than the other three paths with high reliability. The traffic of path-4 with highest reliability rises more abruptly than that of path-2 and path-3 with \(t\). The results verify the dynamics and efficiency of the proposed traffic distribution scheme. The more reliable path is supposed to take more delivery mission, and the unreliable path will be eliminated in the game procedure.

The impacts of the other parameters introduced in Table 1 are shown as Figs. 3, 4, 5, 6 and 7 with \(\lambda =\left[ {790,795,800,805} \right] \) in Fig. 3, \(\mu =\left[ {0,1,2,3} \right] \) in Fig. 4, \(\delta =\left[ {0.1,0.2,0.3,0.4} \right] \) in Fig. 5, \(\xi =\left[ {0.48,0.49,0.5,0.51} \right] \) in Fig. 6, \(T=\left[ {2,3,4,5} \right] \) in Fig. 7, respectively.

From Figs. 3, 4, 5, 6 and 7, the greater value of the simulation parameter, the less traffic distributed into path-1, while the more traffic distributed into the other three paths. Moreover, the most reliable path, i.e. path-3 delivers the most traffic loads. All these results are according to the design rationality. The more transmission profit, packets delivery reward, unreliability cost penalty, traffic distribution cost and the earlier terminal payment, result the feedback Nash equilibrium inclines to the more reliable one. All of these results reveal the method proposed in this paper realized dynamics of traffic distribution scheme in HWNs.

Fig. 9
figure 9

The comparisons of MRA_Nash and MRA_WD

Moreover, to analysis the impacts of \(a\) and \(r\) to \(\phi (t)\) at the specified time with different terminal time \(T\), we set \(t=3\) and \(T=\left[ {1,2,3,4} \right] \), and simulate the variation tendency of \(\phi (t)\) for path-4. The simulation results are shown in Fig. 8. When \(t>T\), shown as Fig. 8a, b, the traffic distributed into path-4 keeps rising. Comparing the results of Fig. 8a to that of Fig. 8b, the earlier terminal time, the more traffic are distributed into path-4, which is accordance with the results of Fig. 7. When \(t=T\), shown as Fig. 8c, that is, the simulation time equals to the terminal time of the traffic distribution process, \(\phi (t)\) maintains a constant value. When \(t<T\), shown as Fig. 8d, the traffic distributed into path-4 keeps dropping until the stable state is achieved.

The simulation results indicate that the impacts of the traffic load adjustment constant and the game discount rate to the traffic distribution are coincident. The earlier terminal time payment, the more traffic is distributed into the reliable path. And when the terminal time is exceeded, the load distribution achieves a stable state.

As the network throughput is measured by the number of successfully delivered packets per unit time, we compare the performance of the throughput of the proposed multipath routing algorithm based on the feedback Nash equilibrium solution (MRA_Nash) with the one based on the traffic well-distributed (MRA_WD). The comparison is shown in Fig. 9. From Fig. 9, we can see MRA_Nash outputs MRA_WD in the throughput performance, and the longer simulation time, the better throughput performance. When \(t\in [0,4]\), the advantage of MRA_Nash is not obviously. With the increase of time, and when \(t=6\), the number of successfully delivered packets of MRA_Nash is more than 15 % that of MRA_WD. Therefore, the multipath routing algorithm proposed in this paper improves the network throughput performance efficiently.

6 Conclusions

Using the stochastic differential game theory is a new trend to solve the communication problem recently. Based on the path reliability, this paper studies the multipath routing from the perspective of game theory. With the obtained feedback Nash equilibrium solution of the stochastic differential game model, the utility maximization is realized with the limited bandwidth constraint. The numerical simulation results show that the method proposed in this paper realizes the dynamic traffic distribution scheme for multipath routing of HWNs and improves the throughput performance efficiently. The method proposed in this paper can be used in time-critical application [28], where reliability is required strongly.