1 Introduction

There has been substantial interest in crowdsourcing and human-computation systems. These systems are based on mobilizing and utilizing people’s work in order to quickly and efficiently achieve certain tasks. Commercial offerings such as Gigwalk or Amazon’s Mechanical Turk allow users to submit tasks and recruit people to complete those tasks. Crowdsourcing is increasingly being used as the method of choice to obtain large-scale user data, such as environmental data, application traces, or to generate indoor-localization maps, e.g. [14, 17]. One key challenge in successfully deploying any such system is the question of how to incentivize people to actually perform tasks and contribute meaningfully. In fact, the same challenge is found in many other systems that rely on user contributions. For example, systems such as social forums, file-sharing services, public computing projects (e.g. SETI@Home), collaborative reference work, etc. suffer from the well-known network-effect bootstrapping problem. These systems can become self-sustaining when the scale of the participation list exceeds a certain threshold, but below this threshold, they may not provide sufficient inherent benefit for users to participate in.

One common type of incentive mechanism for raising user participation in such systems is the Incentive Tree. Incentive Trees are referral-based mechanisms in which (i) each participant is rewarded for contributing to the system, and (ii) a participant that has already joined the system can make referrals, and thereby solicit new participants to also join the system and contribute to it. The mechanism incentivizes such solicitations by making a solicitor’s reward depend on the contributions made by its solicitees (and, recursively, on their further solicitations). Incentive Trees have been widely used in a variety of domains and under different names, e.g., referral trees, multi-level marketing schemes, affiliate marketing, or even the infamous illegal Pyramid Schemes. The question of how people can be incentivized using Incentive Trees to participate in crowdsourcing or network-effect systems has recently attracted significant interest from the research community, e.g. [3, 9], starting from the work on Lottery Trees in [7] and most prominently through the work by the MIT team on the Red Balloon Challenge [13].

In this work, we study the foundations of Incentive Trees. An Incentive Tree Mechanism takes as input a weighted tree, where each node’s weight denotes its contribution to the system, and the tree structure reflects the solicitation history. Based on this input, the mechanism then computes a reward for each node in the tree in such a way that the sum of rewards is linear in the sum of contributions. The question is: what should this reward function look like? Ideally, an Incentive Tree Mechanism is constructed such that every participant is optimally incentivized to both (i) contribute to the system as much as possible, and (ii) solicit as many new participants as possible who themselves contribute and solicit heavily. As we will see, simultaneously achieving both contribution and solicitation incentives is challenging, especially if the mechanism should satisfy additional properties, such as fairness or robustness to strategic behavior.

In this paper, we take an axiomatic approach. We define a set of basic, desirable properties that an Incentive Tree Mechanism should ideally satisfy. These include simple properties such as the continuing solicitation and continuing contribution incentive properties, as well as more sophisticated properties that relate to the mechanism’s resilience to strategic behavior. The latter are critically important. In web-based campaigns, for example, resilience to multi-identity (Sybil [6]) attacks is key, as it is often easy to forge identities by creating new free email accounts and then “referring oneself” in order to get extra reward.

Results We study 8 desirable properties of Incentive Trees that have also been studied in earlier work on Incentive Trees and multi-level marketing, and suitably generalize these properties to our new model. As it turns out, our new model necessitates a fundamentally different algorithmic approach: the previously proposed mechanisms do not achieve a maximal set of desirable properties. We present two novel families of Incentive Tree reward mechanisms, both of which are based on algorithmic techniques previously unused in the literature on multi-level marketing or Incentive Trees. The first family of mechanisms achieves all desirable properties, except that it fails to protect against a certain strong form of Sybil attack (technically, it satisfies all properties except Unprofitable Generalized Sybil Attack). The second family of mechanisms does protect against the strong form of Sybil attack, but fails to give participants the opportunity to achieve unbounded reward (technically, it satisfies all properties except Unbounded Reward Opportunity). Both mechanisms are resilient to the well-known multi-identity attacks discussed above. Finally, we show that under some mild assumptions, these two mechanisms are essentially the best we can hope for. Specifically, we give an impossibility result showing that no reward scheme can simultaneously achieve Unprofitable Generalized Sybil Attack and Unbounded Reward Opportunity while maintaining the other properties. Thus, our results imply that both of our mechanisms achieve a notion of optimality relative to the axiomatic properties we define in this paper: they achieve a maximal mutually satisfiable subset of properties.

1.1 Related work

The two most closely related works are by Douceur and Moscibroda on Lottery Trees [7], and by Emek et al. on multi-level marketing schemes [9]. The former work is aimed at motivating people to participate in networked systems and bootstrapping such systems by network effect. The paper addresses the following question: Assuming that some system organizer is willing to spend a fixed amount of money incentivizing people to do a specific type of work, how should the system be organized to maximize the resulting work? The authors propose Lottery Trees, formalize a set of desirable properties, prove impossibility results, and devise two non-trivial mechanisms, one of which achieves near-optimality in terms of achieved desirable properties. However, there is a fundamental difference between our Incentive Tree model and the one in [7]. In our model, the total amount of reward distributed to the participants grows linearly in the total contribution (thus, it is a multi-level marketing-type model), whereas in [7], the total reward is a fixed, constant value. This difference significantly changes the achievable properties as well as the algorithmic design of the incentive mechanisms. Indeed, the optimal algorithm in  [7] (Pachira) is no longer optimal in our setting and cannot be easily adjusted (see Sect. 4.2).

The work by Emek et al. [9] initiated the algorithmic study of multi-level marketing mechanisms. It proposes mechanisms for a model in which users can purchase items (specifically, each user can purchase one item of a fixed unit price). Participants join the system by buying a product, and can then refer friends to also buy this product. The paper proposes several properties of such unit-price multi-level marketing schemes and shows mechanisms that achieve a subset of these properties. The Incentive Tree model we study in this paper can be directly translated into the multi-level marketing context. When viewed in this context, our model is a substantial generalization of the model in [9]: participants correspond to buyers, and a participant’s contribution corresponds to the amount of goods purchased. The difference is that in [9] each buyer can only purchase a single item of unit price (i.e., each participant makes the same contribution to the system), whereas in our model participants can make arbitrary contributions, i.e., each buyer can buy goods at an arbitrary price. This generalized version of the problem yields a richer structure, and allows us to generalize the desirable properties in meaningful ways. The results in this paper directly apply to this generalized version of the multi-level marketing model. Moreover, as in the above case, the algorithmic structure of incentive mechanisms for the generalized model is substantially different from that in the simpler case with single items of unit price.

In addition to these two works, there has recently been much other work on incentive systems. For example, Cebrian et al. [3] study the Red Balloon Challenge [13] with split contracts and show that, in contrast to fixed-payment contracts, split contracts are robust to nodes’ selfishness. The work on Bitcoin by Babaioff et al. [2] studies a problem similar to multi-level marketing. It uses a game-theoretic solution concept to study a problem in which agents are incentivized to forward sensitive information in such a way that the overall system performance is maximized. The work of Drucker and Fleischer [8] considers a multi-level marketing model with multiple items and proves properties defined in [9]. Ghosh and McAfee [10] provide a game-theoretic model within which the design and performance of mechanisms for incentivizing high-quality user-generated content can be analyzed. Other related work, such as [4, 11] on query incentive networks and the corresponding efficient Sybil-proof incentive mechanisms, Domingos and Richardson [5] on finding influential users in a social network, Anderson et al. [1] on influencing and steering user behavior, or Tennenholtz [15] on the effects of social structure on behavior and norms, is only loosely related to our work. Finally, incentive mechanisms have also been used in mobile systems to recruit people [14, 18]. Besides incentive mechanisms, Sybil attack resilience has also been studied in other contexts, for example in voting [16].

2 Model

In our model, participants can join a system and contribute to it (e.g. by doing work such as finding weather balloons, uploading crowd-sourced data, solving tasks, etc). For a participant u, we denote its contribution by C(u), \(C(u)\ge 0\). Participants can also solicit new participants. Such referrals induce a referral forest F. Each participant is a node in F, and there is a directed edge (u, v) between two participants u and v if u has joined the system in response to a solicitation by v. In other words, if u joins the system via a referral by v, it becomes a child-node of v in F. A new participant u who joins the system independently of any solicitation joins F as an independent node. For simplicity, we consider the equivalent referral-tree T, in which there is an imaginary root node r with contribution \(C(r)=0\), and all root-nodes in F are children of r. T is a weighted tree in which the weight of a node u is its contribution to the system C(u). We denote by \(C(T)=\sum _{u\in T}{C(u)}\) the total contribution in the system.
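The construction of the referral tree T from the forest F can be illustrated with a minimal sketch. The class name `Node` and the helper names `forest_to_tree` and `total_contribution` are hypothetical and chosen only for this illustration; the contributions in the example are made-up numbers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A participant in the referral tree (hypothetical class name)."""
    contribution: float                                  # C(u) >= 0
    children: List["Node"] = field(default_factory=list)  # participants referred by u


def forest_to_tree(roots: List[Node]) -> Node:
    """Attach all root nodes of the forest F to an imaginary root r with C(r) = 0."""
    return Node(contribution=0.0, children=list(roots))


def total_contribution(u: Node) -> float:
    """Sum of C(v) over all nodes v in the subtree rooted at u."""
    return u.contribution + sum(total_contribution(c) for c in u.children)


# Two participants joined independently; the first one referred one more participant.
v = Node(3.0, [Node(1.5)])
w = Node(2.0)
T = forest_to_tree([v, w])
print(total_contribution(T))  # 6.5
```

Calling `total_contribution` on the imaginary root yields C(T), since the root itself contributes nothing.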

A reward mechanism is a function that takes as input the weighted referral tree T, and computes for each \(u\in T\) a non-negative real reward, denoted by R(u). Following  [9], we impose a budget constraint on this function: The system administrator is willing to spend no more than a certain fraction \(\varPhi \le 1\) of the total accumulated contribution on rewarding participants. That is, the upper bound of total reward \(R(T)=\sum _{u\in T}{R(u)}\) paid to participants grows linearly in the total contribution, i.e., \(R(T)\le \varPhi \cdot C(T)\). While in principle, any function satisfying these properties defines a possible reward mechanism, a well-functioning mechanism should maintain several desirable properties, which we define in Sect. 3.

Generalized multi-level marketing When viewed in the context of multi-level marketing, our model generalizes the model of Emek et al. [9], allowing buyers to purchase not just a single item of unit price or multiple items, but items at arbitrary prices. Buyers can purchase goods from a seller. For a buyer u, his contribution to the system C(u) is the total cost of the goods purchased. The seller is willing to return a certain fraction of his total income in the form of rewards R(u) to the buyers. Notice that in this context, the amount of money a buyer u effectively ends up paying for the goods is his payment, \(Pay(u)=C(u)-R(u)\). And since a buyer’s reward can potentially exceed his cost (if he accumulates many contributing descendants), we also consider the profit \(P(u)=R(u)-C(u)\).

Comparison to existing models The two main parameters in our model are contribution and reward. Many existing models have restrictions on either or both parameters. The Pachira mechanism in [7], the Geometric Mechanism in [9], as well as the winning strategy in the DARPA network challenge [13] require the total reward to be fixed. In [9, 13] the contribution of each node is the same, while in [7], contributions are allowed to be variable. In previous multi-level marketing models [8, 9], the total reward is linear in the total contribution, but the contribution (payment) of each node is fixed. We generalize and unify these models such that (i) each participant can make different contributions of arbitrary size, and (ii) the total reward paid to participants is a linear fraction of the total system contribution. As we will see, these generalizations have major implications for the structure of the resulting incentive mechanisms. None of the existing incentive mechanisms performs well in the new model, nor can any of them be easily adjusted. The new model requires novel algorithmic techniques.

Tree notation We use standard tree notation. \(T_u\) denotes the subtree rooted at node u. \(p_T(u)\) denotes the parent of a node u in T. Finally, \(dep_p(u)\) denotes the depth of u in a tree \(T_p\), i.e., the distance between u and p. To simplify notation, we define \(dep_p(u)=-\infty \) if \(u\notin T_p\).

3 Desirable properties

In this section, we define the set of desirable properties that an Incentive Tree mechanism should ideally satisfy. All these properties are inspired by related properties defined in [7] for Lottery Trees; or in [9] for multi-level marketing; and they are adjusted appropriately to our new generalized model with arbitrary contributions.

3.1 Basic properties

Continuing contribution incentive (CCI) [7] A reward mechanism satisfies CCI if it provides a participant u with increasing reward in response to an increase of u’s contribution. This encourages participants to continue contributing to the system (e.g., to continue purchasing goods from the seller). Formally, given a referral tree T, if a node \(u\in T\) increases its contribution, \(C'(u)>C(u)\), and the contribution of all other nodes \(v\in T{\setminus } \{u\}\) remains the same, \(C'(v)=C(v)\), then the reward of u increases: \(R'(u)>R(u)\).

Continuing solicitation incentive (CSI) [7] A reward mechanism satisfies CSI if every participant always has an incentive to solicit new participants. This encourages ongoing solicitation and ensures continuing growth of the system. Let \(T_u\) and \(T'_u\) be the subtree rooted at u before and after a new participant has joined the system in u’s subtree. Then, \(R'(u)>R(u)\).

Reward proportional to contribution (\(\phi \)-RPC) [7] This property suggests that a reward mechanism should maintain some basic notion of fairness among the participants, the degree of which is determined by the parameter \(\phi \). We say that a reward mechanism satisfies \(\phi \)-RPC for some \(0\le \phi \le 1\) if every participant u who contributes C(u) receives a reward of at least \(\phi C(u)\), i.e., \(R(u)\ge \phi C(u)\). In other words, every participant should receive at least a \(\phi \)-fraction of his contribution to the system. Note that we assume \(\phi \le \varPhi \) since otherwise no reward mechanism can satisfy the \(\phi \)-RPC property.

Unbounded reward opportunity (URO) [9] This property demands that there should be no limit to the reward a participant can potentially receive, even when his own contribution is fixed at a constant value. Formally, a reward mechanism satisfies URO if for every positive real R, C(u) and positive integer k, there exist k trees \(T_1,\ldots ,T_k\) attached to u in the referral tree such that \(R(u)\ge R\).

Profitable opportunity (PO) The PO property is a weaker version of URO. It suggests that a buyer with any positive contribution has the opportunity to get positive profit (reward minus contribution). Formally, a reward mechanism satisfies PO if for every positive real C(u) and positive integer k, there exist k trees \(T_1,\ldots ,T_k\) attached to u in the referral tree such that \(R(u)\ge C(u)\). A mechanism that satisfies URO satisfies PO.

Subtree locality (SL) [9] This property demands that the reward paid to a participant u is determined uniquely by its subtree \(T_u\), \(R(u)=f(T_u)\). The property ensures that each user is credited only for actions (contributions and solicitations) performed by itself, or its descendants. Violation of this property can have undesirable consequences. For example, the reward of a user could increase or decrease without him having taken any action (no new purchases or newly solicited buyers in his subtree). Note that as an important special case, the SL property subsumes the so-called Unprofitable Solicitor Bypassing (USB) property defined in [7]. This property demands that for a new participant, it should not matter where in the tree he joins, such that a new participant has no incentive to join the system as a child of someone other than his solicitor. Thus, the SL property prevents certain types of strategic behavior. Specifically, if a new participant has an incentive to join the system not as child of the participant that solicited him, then participants may altogether lose interest in soliciting new referrals.

3.2 Sybil-attack resilience properties

It is desirable that a reward mechanism is robust against strategic behavior by participants. In particular, we seek mechanisms that are resilient against multi-identity attacks, commonly known as Sybil attacks [6]. A participant who is able to forge multiple identities (which is typically simple in web-based applications) should not be able to use this ability to “cheat” the mechanism for his own benefit. Previous work has given two different definitions of Sybil resilience.

Unprofitable Sybil attack (USA) [7] This property is taken directly from [7], and it captures the classic notion of Sybil resilience. The USA property requires that no participant can increase his reward purely by pretending to have multiple identities: a mechanism satisfies USA if a participant with a given contribution cannot increase his reward by joining the system as a set of Sybil nodes instead of joining as a single node. In other words, a participant who makes a certain contribution to the system should never benefit from “splitting” himself and his contribution up and making this contribution under two or more identities, even if these “Sybil identities” join the tree as if referring one another.

Unprofitable generalized Sybil attack (UGSA) This property is strictly stronger than USA, and subsumes USA as a special case. It is a generalization of the so-called Profitable Sybil Attack or Split Proof property from [9], where it was defined for the restricted single-item multi-level marketing model. The property demands that a participant can never increase his profit by joining the tree as multiple identities, even if by doing so, he increases his contributions, i.e., purchases additional goods.

We can formally define USA and UGSA as follows. Given a tree \(T_0\), let u be a participant that joins the tree. Let \(T'_1\) be the tree that results when u joins \(T_0\) as a single node. Alternatively, u can join the tree as a set of Sybil nodes \(S_u=\{u_1,\ldots ,u_k\}\), which can be arbitrarily connected in the referral tree. Let \(T''_1\) be the tree that results when u joins \(T_0\) as the Sybil node set \(S_u\). Let \(J=v_1,v_2,\ldots \) be an arbitrary sequence of new participants joining the tree, and let \(T'_1, T'_2,\ldots \) and \(T''_1, T''_2,\ldots \) be the sequences of trees resulting from these joins. Notice that in the case where u joins as a set of Sybil nodes, there can be many different such sequences because any new child solicited by u can join as a child of any of the Sybil nodes \(u_1,\ldots ,u_k\). Finally, let \(R'_i(u), C'_i(u)\) be the reward and cost of u in \(T'_i\), and let \(R''_i(u)=\sum _{j=1,\ldots ,k}{R''_i(u_j)}, C''_i(u)=\sum _{j=1,\ldots ,k}{C''_i(u_j)}\) be the total reward and cost of u in \(T''_i\), respectively. We say that a reward mechanism satisfies USA if for any \(i>0\), \(R'_i(u) \ge R''_i(u)\) whenever \(C'_i(u) = C''_i(u)\). We say that a reward mechanism satisfies UGSA if for any \(i>0\), \(R'_i(u) - C'_i(u) \ge R''_i(u)- C''_i(u)\) whenever \(C'_i(u) \le C''_i(u)\). As mentioned, the UGSA property subsumes the USA property: taking \(C'_i(u) = C''_i(u)\) in the UGSA condition yields the USA condition.
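The two conditions can be written as simple predicates on the reward and cost of a participant under the single-node and the Sybil strategy at one step i. The sketch below is only an illustration of the inequalities; the function names and the numbers in the example are hypothetical and not taken from any mechanism in this paper.

```python
def violates_usa(r_single: float, c_single: float,
                 r_sybil: float, c_sybil: float) -> bool:
    """USA is violated if, at equal total contribution, the Sybil strategy earns more reward."""
    return c_single == c_sybil and r_sybil > r_single


def violates_ugsa(r_single: float, c_single: float,
                  r_sybil: float, c_sybil: float) -> bool:
    """UGSA is violated if the Sybil strategy contributes at least as much and earns more profit."""
    return c_single <= c_sybil and (r_sybil - c_sybil) > (r_single - c_single)


# Made-up values: same contribution, higher reward for the Sybil strategy.
print(violates_usa(1.0, 2.0, 1.25, 2.0))   # True
# Made-up values: the Sybil strategy contributes more and its profit rises from -0.5 to 0.0.
print(violates_ugsa(0.5, 1.0, 2.0, 2.0))   # True
print(violates_usa(0.5, 1.0, 2.0, 2.0))    # False (the contributions differ)
```

Any input that violates USA also violates UGSA, because with equal contributions a higher reward is a higher profit.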

The difference between USA and UGSA is illustrated in Fig. 1. USA requires that a participant p who contributes a certain amount \(C_p\) be unable to increase his reward by joining as multiple identities \(p_1,p_2,\ldots \). Therefore, participant p in the right figure must receive at least as much reward as participant p in the middle figure. Notice that the total contribution of p remains the same; only the number of identities in the tree changes. So, assume that the participant p starts out as shown in the right figure, with a contribution of 2 and a single identity. Then, he decides to split up into two identities of contribution 1 each. An incentive mechanism satisfies USA if this splitting up does not benefit p.

UGSA is a stronger property: it additionally demands that p’s profit (reward minus cost) in the middle figure cannot exceed his profit in the left figure. For example, assume that the participant p starts out as shown in the left figure, with a contribution of 1 and a single identity. Then, he decides to split up and increase his contribution to \(1+1=2\). An incentive mechanism satisfies UGSA if this identity splitting does not benefit p.

It is interesting to discuss the relative importance of these properties from the point of view of the system administrator or the seller in a multi-level marketing context. USA is clearly a desirable property from his point of view because if USA is violated, he will simply pay too much reward for no additional contribution. The case of UGSA may be less obvious, and its importance depends on the specific circumstances. In particular, it is possible that UGSA is violated even though the seller does not actually lose money (i.e., if the contribution exceeds the reward). This is possible if the Sybil buyer p increases his profit not at the cost of the system administrator, but at the cost of other participants in the system, for instance the parent of p. However, if this is the case (i.e., UGSA is violated not at the cost of the seller, but at the cost of some ancestors of p), then there is a more subtle breakdown of the incentive structure: if nodes may profit from Sybils at the expense of their referrers, the referrers are not incentivized to recruit more nodes. Thus, UGSA is a desirable property to maintain, even if a violation may not always be at the cost of the seller.

Fig. 1 Participant p joining (left) as a single node with cost 1; (middle) as two Sybil nodes that refer one another, each with cost 1; and (right) as a single node with cost 2

When discussing our TDRM mechanism (end of Sect. 5), we will give a concrete example of TDRM violating UGSA.

4 Existing Incentive Tree mechanisms and impossibility result

In this section, we briefly review existing (multi-level marketing and Incentive Tree) algorithms and analyze which desirable properties they achieve. Then, observing that each existing algorithm achieves only a subset of the desirable properties, we give an impossibility proof showing that there can be no reward mechanism that simultaneously satisfies SL, PO and UGSA.

4.1 Geometric mechanism

The simple geometric reward mechanism is commonly used, e.g. in [13]. The idea is that a certain fraction a of a node’s contribution “bubbles up” to its parent, a fraction \(a^2\) bubbles up to its grandparent, etc. Given two constants \(0<a<1\) and \(b\ge \phi \) such that \(b\le (1-a)\varPhi \), the reward of a participant u in the \((a,b)\)-Geometric Mechanism is defined as follows.

[figure a: algorithm box defining the \((a,b)\)-Geometric Mechanism]

The condition \(b\le (1-a)\varPhi \) ensures the budget constraint. Specifically, the total reward that a node u is responsible for is at most \(b\frac{1}{1-a}C(u)\), which must not exceed \(\varPhi C(u)\). The fairness property \(\phi \)-RPC is satisfied if we also set \(b\ge \phi \). It is easy to derive the following theorem.

Theorem 1

The \((a,b)\)-Geometric Mechanism with \(\phi \le b\le (1-a)\varPhi \) achieves all desirable properties, except USA and UGSA.

The reason why USA (and thus, UGSA) is violated is also easy to see. A node can increase his reward by splitting itself into multiple Sybil nodes that are linked to each other as a chain. Some of the “bubbled-up” reward is then handed to other Sybil nodes of u and the total sum of rewards accumulated by u is larger than if u joins as a single node.
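The chain attack can be checked numerically with a minimal sketch. Because the algorithm box defining the mechanism is not reproduced in this text, the code assumes the form \(R(u)=b\sum _{v\in T_u}a^{dep_u(v)}C(v)\), which matches the “bubbling-up” description and the budget argument above; this form, the names `Node` and `geometric_reward`, and the parameter values \(a=b=0.5\) (with \(\varPhi =1\)) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    contribution: float                                  # C(u)
    children: List["Node"] = field(default_factory=list)


def geometric_reward(u: Node, a: float, b: float) -> float:
    """Assumed form: R(u) = b * sum over v in T_u of a^dep_u(v) * C(v).

    Uses the recursion R(u) = b*C(u) + a * (sum of R(c) over the children c of u).
    """
    return b * u.contribution + a * sum(geometric_reward(c, a, b) for c in u.children)


a, b = 0.5, 0.5  # satisfies b <= (1 - a) * Phi for Phi = 1

# Honest strategy: a single node with contribution 2.
single = Node(2.0)
reward_single = geometric_reward(single, a, b)

# Sybil strategy: a chain u1 -> u2, each with contribution 1.
u2 = Node(1.0)
u1 = Node(1.0, [u2])
reward_split = geometric_reward(u1, a, b) + geometric_reward(u2, a, b)

print(reward_single)  # 1.0
print(reward_split)   # 1.25
```

Under these assumptions the chain earns more than the single node at the same total contribution, because the upper Sybil node collects the bubbled-up share of the lower one.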

4.2 Multi-level marketing mechanisms derived from Incentive Tree mechanisms

In [7], two Incentive Tree mechanisms are given (called Luxor and Pachira) for a model in which the total reward in the system is a fixed constant. Any such Incentive Tree Mechanism A for the fixed total reward model can be transformed into an Incentive Tree Mechanism L-A in our model by simply multiplying the reward paid to a user u by a factor of \(\varPhi C(T)\) (assuming that the total reward is normalized to 1). Applying this transformation to Luxor and Pachira yields two mechanisms L-Luxor and L-Pachira. As it turns out, L-Luxor is very similar to the \((a,b)\)-Geometric Mechanism, and achieves the same properties. On the other hand, L-Pachira is interesting. The main technique by which Pachira achieves USA is its use of the concave function \(\pi (x)\): by Jensen’s Inequality, splitting decreases a participant’s reward. For two parameters \(0\le \beta \le 1\) and \(\delta >0\), the \((\beta ,\delta )\)-L-Pachira Mechanism is defined as follows.

[figure b: algorithm box defining the \((\beta ,\delta )\)-L-Pachira Mechanism]

It was shown in [7] that Pachira achieves USA, and the same proof carries over to L-Pachira as well. Moreover, \(\phi \)-RPC can be satisfied by setting \(\beta \ge \phi /\varPhi \). Pachira does not satisfy the CSI property in the Incentive Tree model of [7]. When it is transformed into the multi-level marketing model, however, L-Pachira does achieve CSI, although this fact is not straightforward to show. On the other hand, it is easy to see that L-Pachira fails to satisfy the SL constraint, because of its dependency on the total system contribution C(T).
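The transformation from a fixed-reward mechanism A to L-A described in this subsection amounts to a single scaling step, sketched below. The sketch does not implement Luxor or Pachira; the function name `lift_fixed_reward_mechanism` is hypothetical, and the example shares come from a made-up proportional rule used purely as a stand-in for a mechanism whose rewards are normalized to sum to 1.

```python
from typing import Dict


def lift_fixed_reward_mechanism(shares: Dict[str, float],
                                contributions: Dict[str, float],
                                Phi: float) -> Dict[str, float]:
    """Scale normalized rewards (summing to at most 1) by Phi * C(T)."""
    total = sum(contributions.values())  # C(T), a global quantity of the tree
    return {u: Phi * total * share for u, share in shares.items()}


# Stand-in for a fixed-reward mechanism: shares proportional to contribution.
contributions = {"u": 1.0, "v": 3.0}
shares = {"u": 0.25, "v": 0.75}

rewards = lift_fixed_reward_mechanism(shares, contributions, Phi=0.5)
print(rewards)                 # {'u': 0.5, 'v': 1.5}
print(sum(rewards.values()))   # 2.0
```

The scaling factor depends on C(T), a quantity that is not determined by any single subtree, which is the dependency behind the SL failure noted above.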

Theorem 2

The \((\beta ,\delta )\)-L-Pachira Mechanism with \(\beta \ge \phi /\varPhi \) achieves all desirable properties, except SL and UGSA.

4.3 Split-proof mechanism

For the single-item multi-level marketing model studied in [9], Emek et al. give a mechanism that achieves several properties, including the single-item model equivalent of UGSA and URO. This algorithm is based on the idea of computing a deepest binary subtree of the referral tree and then computing the rewards based on that subtree. Unfortunately, this fails the basic CSI property because depending on the number of direct children it has, a node may no longer have an incentive to directly solicit additional children.

4.4 Impossibility result

The subsequent constructions of our two new mechanisms are motivated by the following impossibility result, which suggests that if a mechanism satisfies the SL property, then UGSA and PO (and thus URO) are mutually incompatible. Since SL is a fundamental property, this result motivates our search for (i) a mechanism that achieves all the properties except UGSA (Sect. 5) and (ii) a mechanism that achieves all the properties except PO/URO (Sect. 6).

Theorem 3

There is no Incentive Tree mechanism that can simultaneously achieve SL, PO and UGSA.

Fig. 2 Illustration of notation used in the proof

Proof

We prove the theorem by contradiction. Suppose a mechanism A can achieve SL, PO and UGSA. In the following proof, all reward computations are done using mechanism A.

Consider a node \(v^*\) with \(C(v^*)>0\). According to PO, there exists a case in which \(v^*\) has one child tree, and yet \(v^*\)’s profit is positive, \(P(v^*)=R(v^*)-C(v^*)>0\). We denote the child tree as \(T^*\) and its root as \(u^*\). Suppose the contribution of \(u^*\) is \(C(u^*)\) and \(T^*{\setminus }\{u^*\}\) forms a set of subtrees denoted as \(T_1,\ldots ,T_k\). According to SL, \(R(v^*)\) only depends on \(C(v^*)\) and \(T^*\). We compare two cases. The first case is exactly as described above (Fig. 2, left). The profit of \(u^*\) is \(P(u^*)=R(u^*)-C(u^*)\). In the second case (Fig. 2, right), node \(u^*\) launches a (generalized) Sybil attack by joining the referral tree as two nodes \(u_a\) and \(u_b\) with \(C(u_a)=C(v^*)\) and \(C(u_b)=C(u^*)\). Notice that the Sybil attack is generalized (i.e., of the UGSA-type), since the total contribution of \(u_a\) and \(u_b\) exceeds the contribution of \(u^*\). Further notice that in the second case, the root of \(v^*\)’s descendant tree is \(u_a\); \(u_a\) is \(u_b\)’s parent; and \(u_b\) is the parent of \(T_1,\ldots ,T_k\), i.e., we keep every node in \(T^*\) unchanged except \(u^*\).

According to SL, it must hold that \(u_a\) has the same reward as \(v^*\) (with \(T^*\) attached to it), and for the same reason, \(u_b\) must have the same reward as \(u^*\). Specifically, it holds that \(R(u_a)=R(v^*)\) and \(R(u_b)=R(u^*)\). The total profit of \(u^*\)’s two Sybil nodes \(u_a\) and \(u_b\) is thus \(P'(u^*)=R(u_a)+R(u_b)-C(u_a)-C(u_b)=(R(v^*)-C(v^*))+(R(u^*)-C(u^*))>P(u^*)\). This implies that \(u^*\) can get more profit by contributing more, which violates UGSA. \(\square \)
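The construction in the proof can be traced on a concrete instance. The sketch below uses a geometric-style reward of the assumed form \(R(u)=b\sum _{v\in T_u}a^{dep_u(v)}C(v)\) as an example of a rule that depends only on the subtree; the rule, the names, and the numbers (\(a=b=0.5\), \(C(v^*)=1\), \(C(u^*)=10\), no further descendants) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    contribution: float
    children: List["Node"] = field(default_factory=list)


def geometric_reward(u: Node, a: float, b: float) -> float:
    """Assumed subtree-local rule: R(u) = b * sum over v in T_u of a^dep_u(v) * C(v)."""
    return b * u.contribution + a * sum(geometric_reward(c, a, b) for c in u.children)


a, b = 0.5, 0.5

# PO witness: v* with contribution 1 and a single child u* with contribution 10.
u_star = Node(10.0)
v_star = Node(1.0, [u_star])
profit_v_star = geometric_reward(v_star, a, b) - v_star.contribution

# Case 1: u* joins as a single node.
profit_single = geometric_reward(u_star, a, b) - u_star.contribution

# Case 2: u* joins as Sybils u_a (copying v*) and u_b (copying u*), with u_a the parent of u_b.
u_b = Node(10.0)
u_a = Node(1.0, [u_b])
profit_sybil = (geometric_reward(u_a, a, b) + geometric_reward(u_b, a, b)
                - u_a.contribution - u_b.contribution)

print(profit_v_star)                 # 2.0
print(profit_single)                 # -5.0
print(profit_sybil)                  # -3.0
print(profit_sybil - profit_single)  # 2.0
```

In this instance the gain from the attack equals \(P(v^*)\), exactly as in the proof: \(u_a\) inherits the positive profit of \(v^*\) while \(u_b\) keeps the profit of \(u^*\).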

In the following, as the main technical contribution of this paper, we present two novel reward mechanisms, both of which achieve a maximal subset of mutually satisfiable properties. The mechanism in Sect. 5 achieves all properties except UGSA, and the mechanism in Sect. 6 achieves all properties except URO/PO.

5 Satisfying all but UGSA: topology-dependent reward mechanisms (TDRM)

We construct the mechanism in two steps. We first give an intermediate mechanism which manages to satisfy USA, but does not satisfy the budget constraint. This preliminary form of the mechanism could be turned into a feasible reward mechanism that satisfies the budget constraint, but doing so would violate subtree locality (SL). We then show how we can eliminate the shortcomings of this preliminary mechanism in such a way that both the budget constraint and SL are satisfied.

As we discussed in the previous section, the reason why the simple Geometric Mechanism fails the USA property is that it is beneficial for a node to split up and accumulate its own “bubbled-up” rewards. This can be avoided by changing the linear dependency of a node’s reward on its own and other nodes’ contributions to a dependency of a quadratic nature. Specifically, when computing the reward of a participant u, we multiply u’s contribution by the contribution of every node in u’s subtree, including itself. In this way, even though u could still accumulate “bubbled-up” rewards from its own Sybil nodes, we can show that it is always beneficial for u to concentrate its total contribution in a single node. The resulting mechanism works as follows.

figure c

While the structure of this quadratic geometric reward mechanism is such that it achieves USA, it is not a feasible mechanism: it fails the budget constraint. To see why it achieves USA, consider a node u. Suppose u splits itself into a set of Sybil nodes \(u_1,\ldots ,u_k\), such that \(C(u)=\sum _{i=1..k}{C(u_i)}\). We can rewrite the reward of u if it remains a single node as

$$\begin{aligned} R(u)= C(u)^2 + C(u) \sum _{v\in T_u{\setminus }{u}}{a^{dep_u(v)}\cdot b\cdot C(v)}. \end{aligned}$$

If it splits itself into Sybil nodes, its new reward is at most

$$\begin{aligned} R'(u)\le & {} [C(u_1)+\cdots + C(u_k)]\cdot \sum _{v\in T_u{\setminus }{u}}{a^{dep_u(v)}\cdot b\cdot C(v)}\\&+\,(C(u_1)+ \cdots + C(u_k))^2, \end{aligned}$$

because the distance from any descendant \(v\in T_u{\setminus } u\) to any of the Sybil nodes \(u_i\) is at least as large as the original distance between u and v in T. Since \(C(u_1)+\cdots +C(u_k)=C(u)\), the right-hand side of this bound equals R(u). Hence, splitting u into multiple nodes \(u_1,\ldots ,u_k\) increases neither the quadratic term nor the term contributed by u’s descendants.
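The comparison above can be checked numerically on a toy tree. The following is a minimal sketch, not the authors' implementation: it assumes that every node is rewarded by the single-node formula for R(u) given above, that \(dep_u(v)\) is 1 for a direct child of u, and that a Sybil split takes the form of a chain whose last node inherits u's children. The tree, the parameter values and all function names are hypothetical.

```python
def quadratic_reward(contrib, children, u, a, b):
    """R(u) = C(u)^2 + C(u) * sum over descendants v of a^dep_u(v) * b * C(v)."""
    bubbled_up = 0.0
    # Depth-first walk over u's descendants; direct children have depth 1 (assumption).
    stack = [(child, 1) for child in children.get(u, [])]
    while stack:
        v, depth = stack.pop()
        bubbled_up += (a ** depth) * b * contrib[v]
        stack.extend((w, depth + 1) for w in children.get(v, []))
    return contrib[u] ** 2 + contrib[u] * bubbled_up


A, B = 0.5, 0.4  # hypothetical parameters with a + b < 1

# Case 1: u joins as a single node with contribution 3 and two children.
single_contrib = {"u": 3.0, "v1": 1.0, "v2": 2.0}
single_children = {"u": ["v1", "v2"]}
single_reward = quadratic_reward(single_contrib, single_children, "u", A, B)

# Case 2: u joins as a Sybil chain u1 -> u2 with the same total contribution.
split_contrib = {"u1": 1.0, "u2": 2.0, "v1": 1.0, "v2": 2.0}
split_children = {"u1": ["u2"], "u2": ["v1", "v2"]}
split_reward = sum(
    quadratic_reward(split_contrib, split_children, sybil, A, B)
    for sybil in ("u1", "u2")
)

print(f"single node: {single_reward:.2f}, Sybil chain: {split_reward:.2f}")
```

In this instance the split loses twice: the quadratic term drops from \(3^2\) to \(1^2+2^2\) plus a discounted cross term, and the children's contributions reach \(u_1\) only with an extra factor a.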

The fundamental problem with this approach is that in order to stay within budget, we would need to scale down the rewards R(u) that are distributed to the participants. However, the amount by which we would need to scale would depend on a global property of the referral tree, for example C(T). Thus, such a scaling would fundamentally violate the SL property. In order to overcome this problem, we would like to constrain the reward a node can obtain. This will allow us to meet the budget constraint by scaling each node’s reward by a constant factor, independent of C(T). This could easily be achieved if there were a constant upper bound \(\mu \) on the contribution C(u) of every node \(u\in T\). However, since our model allows a participant to have an unlimited contribution, our mechanism simulates such an upper bound \(\mu \) by splitting each participant whose contribution exceeds \(\mu \) into a set of nodes, each with contribution at most \(\mu \). The mechanism then computes the rewards in the resulting Reward Computation Tree (RCT), which may differ from the referral tree. In fact, one user can correspond to multiple nodes in the RCT. A participant’s final reward is the sum of the rewards of his corresponding nodes in the RCT.

The effect of computing the rewards in the Reward Computation Tree in this way is that, for participants with very large contribution, the algorithm effectively linearizes the participant’s reward with regard to its contribution. In the process, we need to be careful not to violate the USA property. Specifically, in order to make sure that this linearization does not thwart the USA-achieving structure of the quadratic reward computation, the mechanism must be careful about the way it splits participants with large contribution. In particular, our mechanism ensures that any such split is the best possible split for the participant. In other words, even though the splitting effectively reduces the reward of very large contributors (compared to the preliminary quadratic TDRM mechanism), participants nevertheless cannot benefit from a Sybil attack, because they are already given the best possible split.

Fig. 3 Transformation of a referral tree T into a reward computation tree \(T'\) by TDRM

The TDRM mechanism works as follows. Given four parameters \(\lambda <\varPhi -\phi \), \(\mu >0\), a and b, such that \(a+b<1\), TDRM first transforms the referral tree T into a reward computation tree \(T'\), and then computes the rewards on \(T'\). We denote by C(u) and \(C'(u)\) the contributions of a node u in T and \(T'\), respectively. For a participant \(u\in T\), we define a chain \(CH_u\) of length \(N_u\) in \(T'\) as a sequence of nodes \(m^u_1,\ldots ,m^u_{N_u}\), such that \(m^u_i\) is the parent node of \(m^u_{i+1}\), for all \(i=1,\ldots ,N_u-1\). We call \(m^u_1\) and \(m^u_{N_u}\) the head and the tail of the chain, respectively.

figure d

Figure 3 gives an example of how the mechanism transforms the referral tree T (left) into a corresponding reward computation tree \(T'\) (right). After this transformation, TDRM first computes the rewards for each node in \(T'\) according to a function similar to the one given in the preliminary TDRM mechanism. Finally, the reward of a participant \(u\in T\) is computed as the sum of the rewards of all the nodes in the corresponding chain \(CH_u\) in \(T'\). It remains to show that the mechanism meets the budget constraint; we do this in the next section. With this, we can prove the following key theorem.

Theorem 4

The TDRM mechanism with parameters \(\lambda <\varPhi -\phi \), \(b<1-a\), and \(\mu >0\) achieves all desirable properties except UGSA.

Proof idea The full version of the proof is in the appendix; here we give some intuition. The heart of the proof is showing that TDRM satisfies USA. To this end, we define an \(\epsilon \hbox {-}chain\) as a chain in the reward computation tree of which only the head node can have contribution less than \(\mu \). Then consider the set of optimal partitions for u in the reward computation tree (partitions maximizing R(u)). We show that at least one optimal partition has the structure of a single \(\epsilon \hbox {-}chain\) in the RCT. In other words, we show that u’s best possible Sybil attack is to join in such a way that the resulting structure in the RCT is an \(\epsilon \hbox {-}chain\). However, since the TDRM mechanism transforms u into an \(\epsilon \hbox {-}chain\) in the RCT even if u joins as a single node, u gains nothing by joining the referral tree as multiple Sybil identities. The mechanism itself gives u the best possible split, and thus gives u no incentive to split itself.
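To make the \(\epsilon \hbox {-}chain\) structure concrete, the following minimal sketch splits a single contribution into chain contributions such that only the head may fall below \(\mu \). It is one reading of the description in the text, not the authors' pseudocode: the function name is hypothetical, and how the chain is attached to the rest of \(T'\) is deliberately left out.

```python
import math


def split_into_epsilon_chain(contribution, mu):
    """Return the contributions of a chain, head first.

    Every node except possibly the head receives exactly mu, so the result
    matches the epsilon-chain shape described in the proof idea (assumed reading).
    """
    if contribution <= 0 or mu <= 0:
        raise ValueError("contribution and mu must be positive")
    length = math.ceil(contribution / mu)        # N_u, the chain length
    head = contribution - (length - 1) * mu      # remainder, in (0, mu]
    return [head] + [mu] * (length - 1)


chain = split_into_epsilon_chain(2.5, 1.0)
print(chain, sum(chain))
```

A participant whose contribution is at most \(\mu \) yields a chain of length one, so the split leaves small contributors unchanged.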

Example

To show that TDRM does indeed violate UGSA, consider the following counter-example. Let u be a participant with \(C(u)=\frac{1}{2}\mu \) and let \(v_1,\ldots ,v_k\) be u’s children with \(C(v_1)=\cdots =C(v_k)=\mu \) (\(k>\frac{1}{ab\lambda }\)). The profit of u as computed by TDRM is \(P(u)=\frac{1}{2}((ak+1)\lambda \mu b+\phi \mu -\mu )\). If we increase u’s contribution to \(C'(u)=\mu \), then we can show that the new profit of u is \(P'(u)=R'(u)-C'(u)=(ak+1)\lambda \mu b+\phi \mu -\mu \), which is larger than P(u). That is, by increasing his contribution u can increase his profit, which violates UGSA.
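The comparison of the two profits can be spelled out in one step. The derivation below uses only the two expressions stated in the example and additionally assumes \(a,b,\lambda ,\mu >0\) and \(\phi \ge 0\), which the example appears to take for granted.

$$\begin{aligned} P'(u)-P(u)&=\tfrac{1}{2}\bigl ((ak+1)\lambda \mu b+\phi \mu -\mu \bigr )=P(u),\\ k>\tfrac{1}{ab\lambda }&\;\Rightarrow \; ak\lambda b>1\;\Rightarrow \;(ak+1)\lambda \mu b>\mu \;\Rightarrow \;P(u)>\tfrac{1}{2}\phi \mu \ge 0. \end{aligned}$$

Hence \(P'(u)=2P(u)>P(u)\) under these assumptions: the condition on k is exactly what makes the original profit positive, so doubling it is a strict gain.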

6 Satisfying all but URO: contribution-deterministic reward mechanisms

Given the impossibility result in Theorem 3, we cannot expect a mechanism that achieves all the desirable properties defined in this paper. In particular, we cannot hope to simultaneously achieve UGSA and URO. The TDRM mechanism in the previous section achieves all but UGSA. In this section, we show that we can instead relax the other property, URO, and satisfy all the remaining properties. For this, however, entirely different algorithmic techniques are required.

The key idea is that whereas the previously discussed mechanisms are topology-dependent (i.e., the reward is among other things a function of the structural property of a node’s descendant tree), we now consider mechanisms in which the reward of a participant u is independent of the topology of its subtree. In particular, we seek mechanisms in which the reward R(u) is purely a function of u’s own contribution and the sum \(\sum _{v\in T_u}{C(v)}\) of the contributions in \(T_u\). We show that this can yield a family of mechanisms that achieve UGSA, albeit at the cost of URO.

For ease of notation, define \(x_p=C(p)\) and \(y_p=C(T_p\backslash \{p\})\) for a participant \(p \in T\). Then, we want the reward function R(p) to be purely a function of \(x_p\) and \(y_p\). What properties should this function \(R(x_p,y_p)\) have in order to satisfy the desirable properties? The SL constraint is automatically satisfied by the definition of \(R(x_p,y_p)\). The CCI property demands that \(R(x_p,y_p)\) is increasing in \(x_p\), i.e. \(0<\frac{dR(x_p,y_p)}{dx_p}\). In order to satisfy CSI, it should hold that an increase in \(y_p\) increases p’s reward, hence \(0<\frac{dR(x_p,y_p)}{dy_p}\). If we want to globally ensure the budget constraint, one way to do this is to demand that \(R(x_p,y_p)<\varPhi x_p\), and similarly, the \(\phi \hbox {-}RPC\) property can be enforced by \(\phi x_p<R(x_p,y_p)\). It is important to point out that enforcing the budget constraint by means of \(R(x_p,y_p)<\varPhi x_p\) implies that we cannot achieve the unbounded reward property URO. The reason is that if URO were to be satisfied, \(R(x_p,y_p)\) would need to be able to grow without bound as \(y_p\) increases, which would eventually violate this constraint. In order to also achieve USA, we need the condition that for any \(x'_p\), \(x''_p\) such that \(x'_p+x''_p=x_p\), it holds that \(R(x_p,y_p)\ge R(x'_p,x''_p+y_p)+R(x''_p,y_p)\). Finally, in order to achieve UGSA (under the assumption that USA is already satisfied), we only need \(\frac{dR(x_p,y_p)}{dx_p}<1\).

Combining these observations, we can demand that a function \(R(x_p,y_p)\) satisfies four properties. If it satisfies all of them, we call the function successfully contribution-deterministic. The properties are, for any \(x_p>0\), \(y_p\):

$$\begin{aligned}&\mathrm{(i)}\; 0<\frac{dR(x_p,y_p)}{dx_p}<1, \quad \mathrm{(ii)}\; 0<\frac{dR(x_p,y_p)}{dy_p},\\&\mathrm{(iii)}\; \phi x_p<R(x_p,y_p)<\varPhi x_p,\\&\mathrm{(iv)}\; R(x_p,y_p)\ge R(x'_p,x''_p+y_p)+R(x''_p,y_p), \end{aligned}$$

for any \(x'_p\), \(x''_p\) such that \(x'_p+x''_p=x_p\).
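Properties (i) to (iv) lend themselves to a quick numerical sanity check. The sketch below tests them on a finite grid using forward differences; it is not a proof, and the candidate function \(R(x,y)=x\cdot (\varPhi -(\varPhi -\phi )/(1+x+y))\) is a hypothetical illustration chosen for this check, not one of the paper's CDRM functions. The values \(\phi =0.2\), \(\varPhi =0.8\) (with the assumption \(\varPhi \le 1\)), the grid and all names are likewise assumptions.

```python
from itertools import product


def candidate_reward(x, y, phi=0.2, Phi=0.8):
    """Hypothetical candidate R(x, y) = x * g(x + y), with g increasing in (phi, Phi)."""
    return x * (Phi - (Phi - phi) / (1.0 + x + y))


def satisfies_properties(R, phi, Phi, xs, ys, h=1e-6, tol=1e-12):
    """Grid check of properties (i)-(iv); a sanity check only, not a proof."""
    for x, y in product(xs, ys):
        d_dx = (R(x + h, y) - R(x, y)) / h   # forward difference in x
        d_dy = (R(x, y + h) - R(x, y)) / h   # forward difference in y
        if not (0 < d_dx < 1):               # property (i)
            return False
        if not d_dy > 0:                     # property (ii)
            return False
        if not (phi * x < R(x, y) < Phi * x):  # property (iii)
            return False
        for t in (0.1, 0.5, 0.9):            # property (iv) for a few splits of x
            x1, x2 = t * x, (1 - t) * x
            if R(x, y) + tol < R(x1, x2 + y) + R(x2, y):
                return False
    return True


XS = [0.1, 0.5, 1.0, 5.0, 20.0]   # x_p > 0
YS = [0.0, 0.5, 1.0, 5.0, 20.0]   # y_p >= 0
candidate_ok = satisfies_properties(candidate_reward, 0.2, 0.8, XS, YS)
linear_ok = satisfies_properties(lambda x, y: 0.5 * x, 0.2, 0.8, XS, YS)
print(candidate_ok, linear_ok)
```

The purely linear reward \(0.5\,x_p\) is included as a contrast: it ignores \(y_p\), so it fails property (ii) and the check rejects it.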

Theorem 5

If \(R(x_p,y_p)\) is a successfully contribution-deterministic function, then the reward mechanism that distributes rewards according to \(R(x_p,y_p)\) achieves all properties, except URO.

Proof

The proof closely follows the way the properties are defined. The SL constraint is satisfied by construction. CCI is satisfied because \(R(x_p,y_p)\) is increasing in \(x_p\) (Property i); CSI is satisfied because \(R(x_p,y_p)\) is increasing in \(y_p\) (Property ii); and both \(\phi \hbox {-}RPC\) and the budget constraint are satisfied because of Property iii.

We prove by contradiction that USA is satisfied. Suppose there is a participant p that can maximize his reward by joining the system as \(k\ge 2\) nodes, and assume that k is minimal among all such reward-maximizing splits. Consider two of these Sybil nodes \(p_1\) and \(p_2\), and define \(x_1=C(p_1)\), \(x_2=C(p_2)\), \(y_1=C(T_{p_1})-C(p_1)\) and \(y_2=C(T_{p_2})-C(p_2)\). There are two cases:

(a) \(p_1\) is an ancestor of \(p_2\) (or vice versa). Then we know that \(y_1\ge x_2+y_2\). Since \(0<\frac{dR(x_p,y_p)}{dy_p}\) for any \(x_p\) and \(y_p\), it follows that

$$\begin{aligned} R(x_1, y_1)+R(x_2,y_2)\le R(x_1,y_1)+R(x_2,y_1-x_2). \end{aligned}$$

According to Property iv defined above, we know that

$$\begin{aligned} R(x_1,y_1)+R(x_2,y_1-x_2)\le R(x_1+x_2,y_1-x_2). \end{aligned}$$

Combining these two expressions implies that the following inequality holds:

$$\begin{aligned} R(x_1, y_1)+R(x_2,y_2)\le R(x_1+x_2,y_1-x_2). \end{aligned}$$

This means that p can get at least the same reward by merging \(p_1\) and \(p_2\) into one node, which contradicts our assumption.

(b) Neither \(p_1\) nor \(p_2\) is an ancestor of the other. According to Property iv (for the first inequality) and Property ii (for the second), it holds that

$$\begin{aligned}&R(x_1+x_2, y_1+y_2)\ge R(x_1,y_1+y_2+x_2)\\&\quad +\,R(x_2,y_1+y_2)>R(x_1,y_1)+R(x_2,y_2). \end{aligned}$$

As in case (a), this implies that p can get at least the same reward by merging \(p_1\) and \(p_2\), which contradicts our assumption. This concludes the proof that USA is satisfied.

Finally, we prove that UGSA is satisfied. Consider some participant p. We need to compare two cases. In the first case, p joins the system as k nodes, \(p_1,\ldots ,p_k\). In the second case, p joins the system as a single node. In order to prove UGSA, we need to show that for any k and any total contribution \(\sum _{i=1}^{k}C(p_i)\) that is equal to or larger than C(p), p obtains at least as much profit in the second case, namely \(\sum _{i=1}^{k}(C(p_i)-R(C(p_i),C(T_{p_i}\backslash p_i)))\ge C(p)-R(C(p),C(T_{p}\backslash p))\). By the USA property, any participant p with a fixed cost obtains the highest reward by joining the system as a single node. Therefore, we may restrict attention to the case \(k=1\).

It remains to prove that for any \(\epsilon >0\), it holds \(x_p-R(x_p,y_p)<x_p+\epsilon -R(x_p+\epsilon ,y_p)\). According to Property i, we know that for any \(x_p\), \(y_p\),

$$\begin{aligned} \frac{dR(x_p,y_p)}{dx_p}<1. \end{aligned}$$

Therefore, it follows that for any \(\epsilon >0\),

$$\begin{aligned}&R(x_p+\epsilon ,y_p)-R(x_p,y_p)<\epsilon \\&\quad \Rightarrow x_p-R(x_p,y_p)<x_p+\epsilon -R(x_p+\epsilon ,y_p). \end{aligned}$$

Hence, increasing the contribution by any \(\epsilon >0\) decreases the total profit, which implies that UGSA is satisfied. \(\square \)

6.1 CDRM mechanisms

The properties derived in the previous section imply a family of reward mechanisms all of which achieve all properties except URO. It remains to find specific, practical functions that belong to this family. In this section, we give two examples. First, we set \(R(x_p,y_p)=f(x_p,y_p)x_p\), so that the reward function is proportional to \(x_p\).

figure e

In both cases, it is easy to verify that the reward function does satisfy all the properties stated in the theorem. Hence, both CDRM mechanisms satisfy all our desirable properties, except URO.

7 Conclusions

In this work, we have studied Incentive Tree mechanisms, thus formalizing and generalizing previous algorithmic work on Referral Trees, Lottery Trees [7, 13] and multi-level marketing mechanisms [8, 9]. We design two families of Incentive Tree mechanisms, both of which achieve all but one among the set of axiomatic properties. Furthermore, our impossibility result suggests that this is optimal. We are encouraged that both of these mechanisms achieve the slightly weaker notion of unprofitable Sybil attack (USA). This shows that mechanisms can be designed that are provably resilient against basic forms of multi-identity attacks.

Any axiomatic approach based on a choice of desirable properties is open to question, as different people may deem different properties to be more important. Indeed, as we point out, not all of the properties are equally relevant to the successful operation of an Incentive Tree scheme in practice. However, in ongoing work, we have been studying the effect of our mechanisms in practical deployments, and this experience has strengthened our belief that the properties defined in this paper are indeed of critical practical importance.