Keywords

1 Introduction

In a multihop wireless ad hoc network a node communicating with a remote node, located outside of its radio range, needs help from other, transit, nodes to relay its packets to the destination. As there is no infrastructure and centralized network administration, the nodes self-organize their networking. In a public, open network every node can be managed by a different entity which is selfishly interested in maximizing its network performance, while minimizing its cost. Additionally, some optimization algorithms may result in a selfish nodal behavior. In this environment it is easy to imagine a network which is under-performing or collapses because its users do not wish to cooperate with each other. In fact, research shows that selfishness is favoured in current network standards as it often results in a better network performance [1] at a cost of standard-compliant nodes which become selfless “suckers” and experience downgraded network service. Selfish strategy offers better performance in comparison to a compliant strategy. However, if this strategy is widespread in a network, everyone observes severely limited network performance which, in extreme cases, may result in a network collapse as few, or none, nodes are willing to cooperate. End-to-end packet forwarding cooperation is of primary focus in this work.

A prominent solution to selfish nodes problem is an implementation of a reputation system. The goal of such a system is to reliably detect nodal behavior and assign a reputation metric which will inform other nodes about past behavior of a particular node. This metric enables nodes across the network to decide how they should behave in future interactions with this node. In a typical reputation system, at least two subsystems can be identified: firstly, a reputation deduction subsystem, which translates nodal behavior to a reputation value, and secondly, a reputation response subsystem, which determines nodal response to other nodes’ reputation—typically in terms of misbehavior avoidance (better service levels) and misbehavior deterrence (minimizing the number of uncooperative nodes in the network). Section 2 surveys different concepts of using reputation in multihop wireless ad hoc networks.

Recently, two reputation deduction subsystem algorithms were published [2, 3] capable of producing accurate reputation values for multihop wireless ad hoc networks transit nodes, robust to packet losses and changes of nodal behavior. Furthermore, our preliminary research promise satisfactory reputation rendering capabilities of these algorithms in a mobile ad hoc network environment. The intent is to use the reputation metrics produced by either of the algorithms as an input to the reputation response subsystem outlined in this work.

This work presents a novel concept for using reputation metric to promote cooperation between selfish nodes by providing incentives for cooperative nodes and punishing misbehaving ones—thus fulfilling both “carrot” and “stick” parts of the approach. The purpose of this approach is to discourage nodes from misbehavior by making it to appear costly in contrast to the cooperation. In addition, the solution addresses a common problem of a reputation-driven routing which results in overloading well-behaving nodes and offloading misbehaving nodes, hence further boosting incentives for cooperation.

The remainder of this work is structured as follows: Sect. 2 surveys state of the art, Sect. 3 details the model, Sect. 4 outlines the Stick and Carrot approach, Sect. 5 discusses preliminary evaluation results, Sect. 6 concludes this paper.

2 Related Works

Reputation systems are one of the prominent concepts addressing the problem of extensive network nodes’ autonomy giving rise to nodal selfishness, causing a negative impact on multihop wireless ad hoc networks. A number of areas particularly susceptible to this kind of impact were identified, such as Quality of Service (QoS) [4] and the packet forwarding service [1]. Some works on reputation systems focus solely on assigning an accurate reputation value to each and every network node. Knowing others’ inclination to cooperation is a fundamental step in building a successful reputation system. However, it cannot be the only one as having the reputation of a particular value is not a self-contained goal and does not cause any implications to the subject of this rating, nor for holders of this information [5].

One, vital use of the reputation values produced by a reputation deduction subsystem is mitigation of misbehaving nodes, thus improving the level of service for nodes within the network. One, notable example of such use is a mechanism called Pathrater [6, 7], which selects paths avoiding misbehaving nodes. The misbehaving node in Pathrater’s context is a node that provides the packet forwarding service below a certain, predefined threshold. Other authors [8, 9], point out that routing around misbehaving nodes in fact rewards them as they are free from costs associated with forwarding others’ packets and can still have their packets delivered without obstructions. By analogy, this approach in fact punishes nodes of a good reputation with more transit traffic to forward. Additionally, confining transit nodes’ selection to a subset of nodes performing above a predefined threshold may result in no paths available for packets transmission, while there might be some under-performing, more costly options available. A similar algorithm, mitigating the problem of the limited transit nodes’ pool was presented in [10], where the best available path was selected—without lower limit of acceptable reputation. Both [9] and [10] recognize a problem of favouring the best performing nodes which result in lack of possibilities for low reputation nodes to improve their rating. [9] addresses this problem with a traffic shedding approach, which routes more traffic via nodes of low reputation and reduces the number of packets sent via cooperative nodes. This approach, however, impacts well-behaving nodes with poor network performance and offers no incentives for nodes to use this approach in a real-life scenario.

Another, equally important, use of reputation is punishment for misbehavior, where nodes of poor reputation receive limited service or, in some cases, their forwarding requests are entirely rejected. It is realised either by forwarding packets with a probability equal to the source node’s reputation [11] or a binary decision based on a predefined reputation threshold to entirely drop or forward all packets from a particular node (permanently or temporarily) [12]. The concept of rewarding or punishing nodes for their past behavior, by any node in a network, is called indirect reciprocity and originates from evolutionary biology and game theory [13]. A typical approach is to respond to node’s reputation in kind, providing the level of service equal to node’s previously observed forwarding behavior, as it is conveyed by the reputation value. This strategy is called tit-for-tat. In [13] a set of evolutionary best performing strategies, modeled as infinitely repeated games, were developed. Apart from indirect reciprocity response, it was found that the best strategies involve a punishment for inadequate response to the reputation. Similar findings were reported in [12].

In some systems [12, 14] a game theoretic approach is explicitly defined to model the rise of the misbehavior as well as to incentivize the nodes to cooperate. This approach clearly explains the origins of misbehavior in multihop wireless ad hoc networks and shows that the situation, without additional incentivization mechanisms, is equivalent to Prissoner’s Dillema and Tragedy of the Commons [15], where rational nodes are bound to misbehave. A solution to this problem can be found in game theory, which proves that a thoughtful design of the reputation response subsystem may effectively foster cooperation between the network nodes.

3 Model

In the presented model we use bold face to denote sets, capital letters for nodes and lowercase letters for various metrics. The model concerns a network of nodes from a finite set N, which can change over time as nodes join and leave the network. Our model concentrates on a packet forwarding service in a mulitihop wireless ad hoc network. In such a network, when a source node (S) wants to send a packet to a remote destination node (D), located outside of S’s radio range, it must request other, transit nodes to forward it’s packets to D. Transit nodes form a path, \(\mathbf{P} _{SD}\), accordingly to a selected routing algorithm, in order to fulfil the packet forwarding request. Nodes within the path retain a full decision autonomy, in particular, how to respond to each S’s forwarding request: forward it or drop it. Arguably, the node’s decision is made based on some rational premises. In a longer perspective, each transit node can forward all requested packets, drop some or all the packets. It is assumed that the nodal forwarding ratio, \(g_X\), called further nodal behavior, is an internally predefined number and the node is making an effort to provide the service as close as possible to \(g_X\). \(g_X\) is a real number in [0,1], where 0 represents a node dropping all packet forwarding requests and 1 a node forwarding every request. The \(g_X\) is considered a quasi-static number; it is expected that, in a well-designed system, \(g_X\) seldom change as the nodes pick a number which secures their equilibrium point, resulting in the best network performance for them (e.g. bandwidth, latency) at an acceptable price (e.g. bandwidth used, battery life). A source node, S, in such a network expects a path delivery ratio (PDR), \(p_{SD}\), equal to the product of transit nodes’ behaviors (1):

$$\begin{aligned} p_{SD}={\left\{ \begin{array}{ll} {\prod \limits _{X\in \mathbf{P} _{SD}}} g_X,&{}\mathbf{P} _{SD}\ne \varnothing \\ 1,&{}\mathbf{P} _{SD}=\varnothing \end{array}\right. } \end{aligned}$$
(1)

This model intentionally omits possible transmission errors as this is an intricate phenomenon without clearly defined relationships and deserves an in-depth study. In this approach the reputation system simply treats any event influencing (1) as a measurement error. [2] and [3] prove this approach to be successful in accurately rendering actual nodal behavior, \(g_X\), despite adverse conditions such ass, transmission errors and behavior changes. Furthermore, in a well-designed network, various transmission errors affect relatively small percentage of packets and a successful reputation deduction subsystem is expected to deduce \(g_X\) correctly despite the errors. Hence, the reputation response subsystem, defined in a higher abstraction layer, relatively to the reputation deduction subsystem and transport protocols, can assume that the error level is negligibly low.

The reputation deduction subsystem monitors the network and produces a reputation metric for each node X. We assume following for the reputation value:

  1. 1.

    Is a real number within [0,1] range, where 0 represents a complete absence of cooperation and 1 a completely cooperative node.

  2. 2.

    Is known to any other node in the network.

  3. 3.

    Closely resembles actual node’s behavior. Some discrepancies are permissible, however, in principle, they should be minor or temporary. Evaluation of the Carrot and Stick approach in an error-prone environment is deferred to the future.

We assume that a node X working in such a network is a rational node, which observes the network, its reputation and makes adjustments to its behavior, routing and other means under its control. An intent of this behavior is to maximize X’s network performance while reducing the costs of networking. Both gain and cost functions can render to be multifactor and complicated relationships. Arguably, both metrics can be simplified to a root cause which is a number of X’s packets successfully delivered in relation to the total number of packets sent, \(o_X\), for the network gain and a number of packets forwarded on behalf of other nodes, \(c_X\), for the cost. In this perspective, primary optimization problem for the node X becomes (2):

(2)

where: \(\overline{p_{X.}}\) denotes an average observed PDR for all paths originating from X. Performance maximization has a priority over cost minimization in this optimization problem.

Analysis of (2) may lead to a conclusion that a node which is not generating any network traffic has no motivation to behave cooperatively. However, it is important to note that it is unreasonable for such a node to be connected to a network if they do not plan to communicate with anyone as a mere cost of having the receiver enabled along with the cost of periodical exchange of network management traffic cannot be justified and becomes an expendable expense. All other nodes should maintain a reputation level which ensures an optimal \(o_X\) at all times, even for infrequent communication needs, as an opportunity for rebuilding reputation might not be readily available.

4 Carrot and Stick Approach

This section outlines high-level means of the Carrot and Stick approach and its use of network nodes’ reputation. A primary goal of a reputation response subsystem is to improve well-behaving nodes’ network performance while discouraging nodal misbehavior. Carrot and Stick approach addresses both parts by providing the best possible performance for fully cooperative nodes and a gradually deteriorating service for the ones who choose misbehavior.

Figure 1 depicts a complete reputation system operations and Carrot and Stick place in the behavior-perception-reputation-reaction feedback loop. The Carrot and Stick approach relies on reputation metrics calculated by a reputation deduction subsystem from observed PDR and, indirectly, \(g_X\). Based on these metrics all Carrot and Stick calculations are made. The method directly influences nodal behavior toward source nodes as well as network routing. It is expected that Carrot and Stick’s operations will affect rational nodes’ decisions and their behavior toward other nodes, prompting the subsequent cycle of the system.

Fig. 1.
figure 1

The graph depicts a behavior-perception-reputation-reaction feedback loop of the entire proposed reputation system. Carrot and Stick is a part of this system, using reputation to respond to nodal behavior. The nodes, experiencing Carrot and Stick response, react by changing their behavior to optimize their gain and cost (2).

A core component of the Carrot and Stick subsystem is an algorithm determining reciprocal reaction of a transit node to the reputation of the source node. The approach is inspired by the indirect reciprocity and tit-for-tat strategies, however Carrot and Stick responds to misbehavior with an enhanced severity, thus the response mechanism is called enhanced indirect reciprocity. A behavior of a transit node, Y, towards the source node, S is determined accordingly to (3):

$$\begin{aligned} b_{YS}={\left\{ \begin{array}{ll} g_S-(g_Y-g_S),&{}g_S < g_Y\\ g_Y,&{}g_S \geqslant g_Y \end{array}\right. } \end{aligned}$$
(3)

For the sake of simplicity, in (3) we assume that the reputation of a node is equal to its behavior, which is not always true. In an actual implementation the reputation value should be used, as \(g_X\) can rarely be directly observed by other nodes. Furthermore, a node can directly control its \(g_X\); its reputation, although closely correlated, is not available for direct manipulation by a node.

Since Eq. (3) changes nodal behavior, it influences the PDR expected by the source node. Hence, the formula for PDR calculation (1) transforms to (4):

$$\begin{aligned} p_{SD}={\left\{ \begin{array}{ll} {\prod \limits _{Y\in \mathbf{P} _{SD}}} b_{YS},&{}\mathbf{P} _{SD}\ne \varnothing \\ 1,&{}\mathbf{P} _{SD}=\varnothing \end{array}\right. } \end{aligned}$$
(4)

The second component of the Carrot and Stick approach is a Reputation-based Balanced Routing (RBR) which uses both information about the reputation of all the nodes and enhanced indirect reciprocity PDR estimations (4). The routing mechanism selects a route which promises the best possible PDR, maximum \(p_{SD}\), for a given source node communicating with a destination node. The specific implementation of the RBR algorithm is not defined to provide flexibility to choose best-performing algorithm suitable for a specific network. The RBR in this view is a route cost metric rather than a full-fledged routing algorithm.

Fig. 2.
figure 2

The diagram depicts Carrot and Stick’s use of reputation. Transit nodes build their response using source nodes’ reputation and (3). The source nodes experience transit nodes response through their observed PDR (4). In parallel, RBR-based routing mechanism, using publicly known reputation values and (3), selects transit nodes that promise the best PDR possible for a given source node, which results in observed PDR (4).

Figure 2 summarizes how Carrot and Stick use reputation to influence individual nodes’ gain (2) by interfering with their observed PDR (4). (3), determining transit nodes’ response to source nodes’ reputation, is the fundamental component of the entire response system as it directly impacts source nodes’ gains by forwarding or dropping their packets accordingly to enhanced indirect reciprocity concept. The second component of the system is the RBR. Such a routing is a rational step, complimentary to the enhanced indirect reciprocity, as it allows nodes to maximise their PDR and gain when operating in a network implementing (3).

Enhanced indirect reciprocity and RBR tandem is designed to provide the best possible network performance for cooperative nodes and deteriorating performance for misbehaving ones. Additional, feature of this approach is deflecting misbehaving nodes’ traffic from cooperative nodes which further improves cooperative nodes’ performance and incentivizes misbehaving nodes to improve their service (2).

It is assumed that the reputation deduction subsystem is monitoring how nodes adhere to Carrot and Stick principles and punish those who offer different service level than expected (3). It is expected that the reputation system will punish nodes offering both lower and higher level of service as this strategy was proven to be the most effective in [13]. Some tolerance for minor discrepancies is desirable.

The rational nodes by means of the optimization algorithm (2) and their knowledge about the networkFootnote 1, are expected to react to the Stick and Carrot operations and adjust their behavior to yield the best possible network performance. An in-depth evaluation of this dynamic and a study of optimal optimization strategies is deferred to future work.

5 Evaluation

This section presents results of a preliminary evaluation of the Carrot and Stick approach in numerical simulations in a custom-developed environment. The goal of this work is to validate if the approach is actually capable to reward cooperative nodes and punish misbehaving ones and if it is possible to incentivize misbehaving nodes to improve their behavior by offering better service level for cooperating nodes only. Two basic simulations were performed: the first one evaluates PDR observed by a source node of varying \(g_X\) interacting with one and later with three transit nodes of varying \(g_Y\); the second simulation evaluates PDR observed by randomly selected source nodes, communicating with random destinations, placed within a 100-node network.

Figure 3 presents a simple one source vs. one transit node interaction where a transit node Y is requested to forward source node X’s packets. The whole spectrum of possible \(g_X\) and \(g_Y\) values (with 0.01 step) were examined. It is clear that the fully cooperative X (\(g_X=1\)) can always expect the best possible forwarding service irrespective from Y’s behavior. X risks receiving no service at all from the well behaving Y if it chooses \(g_X \le 0.5\) (triangular plateau in the lower part of the plot). The lower the X’s behavior is the narrower becomes the pool of nodes available for cooperation with them and the lower becomes \(o_X\). In all cases, X achieves highest PDR, when interacting with Y, which \(g_Y=g_X\) (diagonal, falling “ridge” in middle of the plot). Additionally, Fig. 3 illustrates \(c_X\) in an interaction with X. The higher \(g_Y\) is the higher \(c_X\) becomes. The optimization problem (2) in this simple case clearly states that X should choose \(g_X\) = 1 as maximizing \(o_X\) has priority over minimizing \(c_X\).

Fig. 3.
figure 3

Behavior (\(b_{YX}\)) of a transit node Y towards a source node X, relative to both X (\(g_X\)) and Y (\(g_Y\)) behaviors.

In networks without the reputation response subsystem, missing a misbehavior punishment mechanism, analogous chart renders a flat, tilted surface where \(b_{YS}=g_Y\) and \(b_{YS}\) value is irrespective to \(g_X\). In such a network \(o_X\) depends solely on a probability of cooperating with Y node of a given \(g_Y\) value. A rational node in such a network following (2) algorithm will chose not to cooperate at all as this way it will maximize its \(o_X\) and minimize \(c_X\).

A simple tit-for-tat (\(b_{YS}=g_Sg_Y\)) response strategy yields a similar result to Carrot and Stick. The major differences are: nonlinear trajectory of \(b_{YS}\) and absence of discriminating bonus (3) for higher reputation nodes forwarding packets of relatively lower reputation nodes. In this network, the best X’s routing strategy, irrespective of \(g_X\), is to always seek highest available \(g_Y\). In such a scenario high-reputation nodes may become overwhelmed with congestive traffic (see Fig. 6), which reduces their \(o_X\) and they may decide to lower their behavior to improve their \(o_X\) and lower \(c_X\). This might be even more distinct in other, more elaborate optimization strategies, which take other performance metrics into account.

The node X relaying its packets via multiple transit nodes sees the PDR as a product of transit nodes’ behaviors (4). Figure 4 presents X’s interaction with three transit nodes, all of which have the same behavior \(g_Y\). Keeping the highest possible behavior is consistently the best strategy, ensuring the best possible \(o_X\) in all settings. With decreasing \(g_X\) or \(g_Y\) the source node experiences a steeper PDR degradation and the best strategy for a misbehaving node is, invariably, to choose transit nodes of similar behavior to \(g_X\), yielding \(p_k={g_X}^3\).

Fig. 4.
figure 4

PDR (\(p_k\)) observed by a node X of behavior \(g_X\) relaying its packet via three transit nodes, all of which have the same behavior, equal to \(g_Y\)

Another numeric experiment involved a network, consisting of 100 nodes, randomly placed on a 500 \(\times \) 500 m plain. Each node in this network had a constant 150 m. communication range and randomlyFootnote 2 assigned \(g_X\) value. A simple, rudimentary routing protocol, based on the RBR concept, was implemented, returning a path with the best available expected PDR (4) and of maximal length of 10 transit nodes. The same routing algorithm, picking a path of highest PDR for a given source node, was used in all presented cases. In such a network a 1000 random source-destination pairs were drawn and the routing algorithm returned the best available path between them. To ensure that the results are independent from a specific topology and \(g_X\) placement, each experiment’s setup was run 10 times, each with different, random nodes’ placement and behavior.

Figure 5 presents results of the simulations on the simplified networks. Figures present PDRs observed by the source nodes in relation to the source nodes’ \(g_X\). Each dot represents an average PDR observed by a single node. Direct connections, i.e. without transit nodes, were not taken into account. Figure 5a) plots results for networks with Carrot and Stick concept implemented (PDR calculated with (3) and (4)). PDR results for the popular tit-for-tat algorithm (\(b_{YX}=g_Xg_Y\)) are presented in Fig. 5b). The shape of both plots is similar, however the plot for Carrot and Stick is visibly steeper and levels at PDR=0 for \(g_X\) lower than 0.5. The reason for this, apart from the differences between algorithms themselves, is the distribution of \(g_X\) among the nodes in the simulation (normal distribution, favouring well-behaving nodes) which results in a relatively small number of significantly misbehaving nodes (\(g_X<0.5\)) to interact with each other. In tit-for-tat networks the best strategy for a misbehaving source node is to send its traffic via the best behaving nodes available, whereas in a network implementing enhanced indirect reciprocity, the best bet is to use similarly behaving transit nodes. Figure 5c) depicts observed PDR in networks where nodes does not react to the source node reputation and their \(b_{YX}=g_Y\).

Fig. 5.
figure 5

PDR observed by source nodes in relation to their behavior \(g_X\). Each dot represents an average PDR observed by a single node through the simulation. Chart a) presents results from networks implementing Carrot and Stick, b) networks operating accordingly to tit-for-tat approach and c) results from networks without any punishment mechanism.

Figure 6 depicts a total number of sessions where a node acted as a transit node in relation to its \(g_X\). Figure 6a) plots results for Carrot and Stick algorithm, Fig. 6b) for tit-for-tat and Fig. 6c) for networks without reciprocity algorithm implemented. The total number of sessions in each simulation was equal to one thousand (each plot present results from 10 identical runs of the simulation) and direct connections are not plotted here. The rudimentary routing algorithm in all experiments was picking the best path available. The most important difference in these results is a balance between how fully cooperative nodes are loaded in comparison to the rest of the nodes. The most heavily used cooperative nodes in networks where Carrot and Stick is implemented carry at least two times less traffic then their counterparts in networks with tit-for-tat and without any response algorithm implemented, where some nodes carry roughly half of the total traffic in the network (400–500 sessions out of 1000 total). Meanwhile, less cooperative nodes in both Fig. 6 b) and c) are almost unused serving singular sessions at most. The same nodes, in Carrot and Stick networks are used relatively more frequently, providing them with more opportunities to improve their reputation and offloading cooperative nodes. Note, a specific \(g_X\) distribution in all presented networks where cooperative nodes are more frequent. In networks of more uniform \(g_X\) distribution, this results may look different, however general regularities, outlined above, should be present in all cases.

Fig. 6.
figure 6

Plots depict a total number of communication sessions (1000 sessions in a single simulation) for each transit node in the network relatively to its \(g_X\). Chart a) presents results from a network implementing Carrot and Stick, b) a network operating accordingly to tit-for-tat approach and c) results from a network without any punishment mechanism.

In addition to the presented setups, Pathrater algorithm was evaluated along with all three reputation reciprocity algorithms, i.e. enhanced indirect reciprocity, tit-for-tat and no algorithm at all. Slightly lower average observed PDRs were recorded in all setups, with even more unbalanced traffic distribution, where nodes below Pathrater’s thresholdFootnote 3 were not used at all. In both enhanced indirect reciprocity and tit-for-tat, nodes of \(g_X\) below the threshold observed virtually no data delivered to their destinationsFootnote 4.

It is clear that with Carrot and Stick approach implemented misbehaving nodes receive results distinctively worse than the cooperative nodes. A node in such a network observes PDR \(\leqslant g_X\). In a network where no response mechanism is implemented every node, irrespective of their behavior, has similar chances to achieve the best possible PDR. With any reputation response subsystem and any \(g_X\) there are situations where the PDR is far from the maximum possible one for a given behavior as its impossible to ensure that there are only “optimally” cooperative transit nodes on the path. Arguably, an uncooperative node experiencing suboptimal network performance will decide to increase their \(g_X\), accordingly to (2). This way every cooperative nodes’ performance will improve. The system is successful in discouraging rational nodes, pursuing better performance via misbehavior, from decreasing their \(g_X\) and hurting overall network performance.

6 Conclusion

A concept of a novel approach, called Carrot and Stick, to using reputation in multihop wireless ad hoc networks was outlined. The main goal of this framework is to incentivize selfish network nodes to cooperate by rewarding cooperation and punishing misbehavior. The core mean of achieving this is a new algorithm, called enhanced indirect reciprocity—responding with a service level degraded accordingly to the source and transit nodes’ reputations and the difference between them. This approach, via Reputation-based Balanced Routing (RBR), influences routing decisions made by rational nodes seeking to optimize their network output, hence contributing to a balanced network traffic and mitigating cooperative nodes’ overload, further promoting the cooperative behavior.

Future research includes Carrot and Stick evaluation in an error-prone, simulated network environment and working in tandem with an operating reputation deduction subsystem. Performance, correctness and ability to incentivize cooperation in rational nodes – dynamically adjusting their behavior – in a real network environment are of primary interest in this upcoming work.