1 Introduction

Firms must regularly address the problem of providing incentives to employees when actions are not at all or hardly contractible. The standard approach in economics has been to focus on analyzing the optimal explicit incentive schemes: tying the level of the worker’s compensation to the amount of output produced, which serves as a (noisy) measure of employee effort.Footnote 1 However, the importance of fairness and social preferences—especially for the work relationship—has long been documented. Starting with Akerlof (1982), a literature has developed that considers gift-exchange as an alternative source of incentives in the workplace. According to this gift-exchange theory, firms can induce above-minimal effort from agents—even in the absence of explicit pay-for-performance incentives—since such-inclined agents may reciprocate generous wage payments with higher effort exertion: a positive wage-effort relationship takes effect as workers repay the initial “gift” of a generous wage with a “return gift” of above-minimum effort.

Recently there have been conflicting results on the significance of gift-exchange as a motivating force outside of the lab. While some papers have found a robust pattern of gift-exchange in field settings, others have failed to find the positive relationship of initial and return gift at all or have documented only effects that persist for very short periods; cf. DellaVigna et al. (2016). The varying effectiveness of gift-exchange in different settings suggests that the efficacy of gift-exchange incentives depends on details of the environment. We add to this research by providing results that suggest an avenue to reconcile those findings: In our field experiment we do not find evidence for an overall positive effort response merely from an initial wage gift. However, the gift’s efficacy is substantially improved if the manager benefits more strongly from a worker’s high effort. That is, gift-exchange can induce effort if workers are able to repay the gift to the manager. Moreover, we document that gift-exchange works more effectively with subjects that we classify as reciprocal via a personality test,Footnote 2 and that the efficacy of gift-exchange does not dissipate over the course of the experiment. We conclude from our results that while gift-exchange may not be effective as an incentive in all settings, it can be a powerful incentive device in the proper job context: such as in our setting when managers have performance-related incentives, and when it is directed to the right employees, who are most likely to be reciprocal.

For our field experiment we hired temporary workers for a data entry job.Footnote 3 The workers entered historical data from the 1849 Prussian Census. In total we had 59 workers entering data during a five hour shift in the Harvard Business School computer lab where the data entry took place. The job was advertised to the worker by a temp worker agency at the standard hourly wage of $13; however for 30 of the workers we increased the hourly wage to $18 upon their arrival. We explained to all of the workers that we were hired by two professors to organize the entry of these data. For 15 workers in each of the $13 and the $18 groups we emphasized the importance of them working hard for us by explaining that we would receive a bonus of 50% if the job was done ‘by the end of the week’ (Bonus treatment). For the ‘control’ groups in both of the $13 and the $18 conditions we did not inform the workers of this bonus (No Bonus treatment). We also asked the workers to fill out a short version of the “Big 5” personality test, to give us a measure of non-cognitive skills.Footnote 4

In line with previous field experiments, we do not find an effort increase in response to the higher wage in the No Bonus treatment. However, for the workers in the Bonus treatment, a higher wage leads to a significant increase in worker output. Hence, there is a strong complementarity between the wage gift and the resulting payoff for the manager.Footnote 5 Furthermore, when we separate workers based on their agreeableness (a personality trait associated with standard lab measures of reciprocity), we find that the positive effect of high wages on effort is driven entirely by highly agreeable (strong positive reciprocity) workers, with low reciprocity workers showing either a zero or a negative response to a higher wage. We consider the finding that the strength of gift-exchange is positively correlated with measures of reciprocal inclination as absolutely necessary to lend credibility to gift-exchange based explanations of motivation in the labor market. Though apparently obvious, to the best of our knowledge this study was, at the time of writing it in 2007/08, the first to document this fact in the field.

We also examine the strength of the gift-exchange response over time. In contrast to other studies, e.g. Gneezy and List (2006), we find no weakening of positive responses to wage gifts over time. To the contrary, any negative effects of the wage gift disappear in the later stages of the task, and we find an overall strongly positive effect of our treatment manipulations in the second half of the experiment.

We use two different measures of effort in our analysis: gross data entered, and an error-corrected measure of data entered. The estimates that were derived from these two measures are qualitatively and quantitatively very similar, which suggests that any effort responses work along the quantity margin only; the responses leave the quality margin of effort unaffected.Footnote 6

We can rationalize these results with an agency model that captures reciprocal preferences, where we show that there is complementarity between gift and ability to repay the gift. In the model a risk-neutral firm hires a risk-averse worker to exert non-contractible effort. The novel feature of the model is that the worker is reciprocal: the worker’s utility increases in the principal’s profit whenever the worker receives a rent in excess of his outside option. Thus, when the firm is generous to the worker by giving him additional compensation, the worker desires to provide in turn something of value to the firm. The worker’s reciprocal attitude can now be used by the firm to align the worker’s preferences with those of the firm, which thus generates intrinsic motivation. The comparative statics show that ceteris paribus the worker’s optimal effort choice increases in the initial wage gift and his ability to repay the gift. The corresponding cross derivative is also positive, which indicates the complementarity of the instruments.

An extensive body of evidence has developed that demonstrates reciprocal behavior and gift-exchange in laboratory experiments. Fehr and Gächter (2000) summarize results from earlier studies and highlight several key results: (i) Average wages in the experiments are above the minimal wages and leave workers with rents; (ii) There is a positive wage-effort relationship; and (iii) These results are robust to various institutions, to competition, and to high stakes.Footnote 7 The laboratory experiment by Hennig-Schmidt et al. (2010) is closely related to our paper. Hennig-Schmidt et al. (2010) present a real-effort laboratory experiment and show that a positive wage-effort relation as implied by gift-exchange prevails only if information on the manager’s surplus is provided to the experimental workers. This indicates—as is predicted by our model—that the manager’s surplus is an important determinant of the effectiveness of gift-exchange relations. Note, however, that Hennig-Schmidt et al. (2010) do not vary the surplus accruing to the manager nor do they collect the additional information necessary to test our hypotheses.

It is beyond the scope of this paper to give a comprehensive overview of the vast set of excellent papers on gift-exchange that have been written in the last decade. See DellaVigna et al. (2016) or Esteves-Sorenson (2018) for excellent overviews of this literature. These two papers also address the controversial discussion about the validity of the lab results for gift-exchange in the field. We take the mixed findings of these field studies as evidence that the efficacy of gift-exchange depends on subtle details of the field situation. While Falk (2007) finds strong evidence for gift-exchange in a field experiment with charitable donations, Gneezy and List (2006) argue that the effect of gift-exchange in the field is only minor, quickly disappearing, and overall not a viable employment strategy. In Gneezy and List, students are hired for a day job in a library, and half of them get a surprise raise in their hourly pay. Gneezy and List document that, unlike in our field experiment, there is only a short-lived effect of this gift on the students’ effort. Overall the ‘firm’ would have fared better hiring more students for the lower wage rate. Kube et al. (2013) replicate the Gneezy and List study and also find no effect of a wage gift but document a strong negative effect in response to a wage cut. In a comparable design, Kube et al. (2012) document a strong positive effect of non-monetary gifts, such as wrapped thermos bottles, on students’ effort. Note that in neither of these cases were the subjects given any indication that the manager who provided the higher wage would benefit directly from increased productivity, and the performed jobs were not ones where an employee would expect such a compensation structure. Becker et al. (2012) study a field experiment where a random sub-sample of participants in the Swiss Labor Force Survey received vouchers for training courses. 
The authors find evidence for long-term (six months) gift-exchange, as voucher recipients are more likely to participate in future survey waves. As the authors can track actual voucher redemption, they are able to document that this long-lasting gift-exchange relationship is most pronounced for the sub-group that had redeemed their vouchers.

In light of the above studies, the current study makes two contributions: On the one hand, as we also elicit measures of ability and personality traits, we were in 2007/2008 (to the best of our knowledge) the first to document the heterogeneity of the effect of gift-exchange across different types of employees. On the other hand, we add to the two “standard” features of a gift-exchange experiment—a surprise higher wage offer and the possibility for the worker to reciprocate by exerting more effort—the information that the task matters to the researchers and that the “managers” stand to benefit from a job that is well done by receiving a bonus. Hence, we believe, the fact that gift-exchange is supposed to be at work is very salient for the workers. This, in turn, offers a good reason for why the initial wage gift is given and what the appropriate reaction to the gift is. In some of the above studies this is clear to employees due to the circumstances, but sometimes it is not [as in Gneezy and List (2006) where the wage increase comes as a surprise]; hence it is not clear why subjects should be contextually aware of the fact that they are engaged in a gift-exchange relationship and that exerting more effort is in fact appropriate.

Bellemare and Shearer (2009) analyze gift-exchange within a real firm (where the value of output is clear to the workers). In their study, there is a surprise bonus for the workers in a tree planting firm in British Columbia. Their results indicate a 10% increase in worker productivity on average which slowly dwindles. Moreover the effect of the gift is more marked if the worker has been with the firm for longer. Hence, Bellemare and Shearer argue that spot-market field experiments only establish a lower bound of the effects of gift-exchange in real firms that are characterized by longstanding and ongoing relations that amplify the effects. Based on data from the same firm, Bellemare and Shearer (2011) develop a structural behavioral model to identify a worker’s optimal response to monetary wage gifts. They use data from two separate field experiments to estimate the model and simulate how workers would react to different wage gifts. They find that profit-maximizing gifts would increase profits under slack labor market conditions by up to 10% on average.

The key innovation of the current study was to take seriously an immediate implication of reciprocity-based gift-exchange models: the importance of the agent’s ability to repay the gift to the principal. According to the recent comprehensive study by DellaVigna et al. (2016), there are only three other studies, besides ours, that address this issue: Englmaier and Leider (2012b), Kessler (2013), and DellaVigna et al. (2016). Englmaier and Leider (2012b) also analyze the importance of the ability of the worker to “repay the gift” to the manager in a real-effort laboratory experiment where they vary the wage and the effect of the worker’s effort on the manager’s payoff. They report results that are consistent with the core findings of the current study, plus additional predictions about which worker (in terms of ability) is the marginal one affected by their experimental variation and how different types of individuals (selfish and reciprocal) react to it. Kessler (2013) uses a laboratory experiment and finds, consistent with our results, that gift-exchange is more prevalent when worker effort is more efficient. Finally, as part of a rich set of findings, DellaVigna et al. (2016) document in a large-scale natural field experiment that the return of effort to the employer, as is predicted by this study or Englmaier and Leider (2012a), positively affects the efficacy of gift-exchange.

The rest of the paper is organized as follows: The next section describes the design of the field experiment. Section 3 derives the theoretical predictions and Sections 4 and 5 present and discuss the results. Section 6 concludes. Appendices 1–3 contain illustrations, derivations and additional tables.

2 Experimental Design

The field experiment took place on the premises of the CLER lab at the Harvard Business School, where we provided subjects with computer work stations. We ran four sessions in September 2007. Though subjects were situated within one room, the layout was such that they could not monitor each other’s work progress. Moreover, though we did not formally forbid communication, we did not note any signs of more than casual communication. Hence we conclude that there is only very limited scope for peer effects to affect our results. In total we had 59 participants; each of them worked for approximately five hours on a single day. The participants were hired via a temp worker agency that regularly works with the Harvard Business School.Footnote 8 We told the temp workers that we had been hired to organize a data entry project. These workers frequently work on similar data entry projects; hence there is no reason to believe that they suspected they were participating in a field experiment. Their job was to enter data from the 1849 Prussian Census into an Excel template.Footnote 9

We created four treatment cells (Low Wage/No Bonus, High Wage/No Bonus, Low Wage/Bonus, High Wage/Bonus) based on the wage level and the bonus information. Workers were either paid the standard wage ($13/h) as advertised by the temp agency, or were “surprised” with a higher wage ($18/h).Footnote 10 In the baseline No Bonus treatments, we told the participants only that we were hired “by two professors to organize the entry of these data”. In the Bonus treatments we additionally informed the temps that we would receive a substantial “completion bonus” if enough work got done.Footnote 11 These Bonus treatments indicate whether the worker is willing to expend effort that benefits the manager. Assignment of participants into one of the treatment cells was random and executed by the temp agency.

From the temp agency we get demographic information: the gender, race, age, work experience, and student status of the workers, which we use as controls. Most important, we have a measure of the workers’ typing speeds (Typing Score), which we use to control for the temp workers’ differing typing abilities. This is a key determinant for productivity in this task. The final payment of the temp workers was done in cash directly at the end of each entry session.

Given that there are no previous papers that use this task or treatment variation, we were not able to calibrate the magnitude of the pay rise and the completion bonus in order to find the optimal combination to maximize gift-exchange; consequently our results should not be seen as an upper bound on the efficacy of reciprocal incentives. However, we are fairly certain that all participants considered the pay rise and completion bonus as “substantial”.

3 A Model of Reciprocal Motivation

Our experiment is not designed to differentiate exactly between different models of social preferences. Hence we do not interpret our findings as a strict test of our model, but rather consider the model to be a valuable frame within which to organize the data. We consider a simplified version of the model in Englmaier and Leider (2012a) where we solve the full moral hazard problem and derive the structure of the optimal contract in a standard principal agent problem with reciprocal agents. To lay out our model: We assume that there is a risk neutral manager who wants to maximize expected profits and one risk averse worker who cares about reciprocity. The worker can take an action (effort) \(a \ge 0\) with corresponding costs of effort \(c\left( a \right)\), with: \(c^{\prime }\left( a \right)>0, c^{\prime }\left( 0 \right) =0; \text { and } c^{\prime \prime }\left( a \right) >0.\)

The actions imply a respective expected return for the manager ER(a) with \(ER^{\prime }(a)>0, ER^{\prime \prime }(a)\le 0\). In order to capture our experimental variation we introduce the scalar M which reflects the monetary value of output: \(M \cdot ER(a)\) is the expected monetary gross return for the manager from action a.

A contract \((w,{\hat{a}})\) is a fixed wage payment w, as well as an unenforceable request for an action \({\hat{a}}\). In a real-world context we could think of \({\hat{a}}\) as an informal job description or a code of conduct. In the experiment we will interpret \({\hat{a}}\) as an exogenously given and commonly understood norm. Given our focus here on changes in behavior these details are not key to our results. While \({\hat{a}}\) is not binding, it serves to fix the worker’s beliefs about the manager’s intended generosity (since the expected utility of a contract depends on the worker’s action).

The worker’s inherent concern for reciprocity is measured by \(\eta \in [ 0, +\infty )\). The worker’s utility function—given that she takes action a, under the contract \(\left( {\tilde{w}}, {\hat{a}}\right)\)—is given by

$$\begin{aligned} U\left( {\tilde{w}}, a,{\hat{a}}\right) = u({\tilde{w}}) - c(a) + \eta \left( u({\tilde{w}}) - c({\hat{a}})- {\bar{u}}\right) \cdot M \cdot \left( ER(a) - {\tilde{w}}\right) , \end{aligned}$$
(1)

where \({\bar{u}}\) is the worker’s outside option in the labor market. The utility function captures, albeit in a simplistic form, the core idea of reciprocal motivation: If an individual has been treated kindly, she will want to reciprocate in kind.Footnote 12 The function consists of three parts: (i) utility from the monetary wage payment \(u({\tilde{w}});\) (ii) effort costs \(c\left( a\right) ;\) and (iii) reciprocal utility \(\eta \left( u({\tilde{w}}) - c({\hat{a}})- {\bar{u}}\right) M \cdot \left( ER(a) - {\tilde{w}}\right) ,\) where \(\eta\) measures the intensity of the reciprocal preferences.

A “generous” contract is one that provides a rent to the worker: an expected monetary utility in excess of the worker’s outside option. A more generous contract will induce the worker to feel more reciprocal, which here means that she will derive greater marginal and absolute utility from the manager’s profit. On the assumption that the contract is generous, the worker’s optimal effort choice \(a^*\) for a given contract is implicitly defined by the first order condition

$$\begin{aligned} \frac{\partial U\left( {\tilde{w}}, a,{\hat{a}}\right) }{\partial a}= - c^{\prime }(a^*) + \eta \left( u({\tilde{w}}) - c({\hat{a}})- {\bar{u}}\right) M \cdot ER^{\prime }(a^*)=0 . \end{aligned}$$
(2)

Applying the implicit function theorem we can derive the relevant comparative statics w.r.t. \({\tilde{w}}\) and M: They are positive, as is the cross partial w.r.t. \({\tilde{w}}\) and M, which indicates that they are complements.Footnote 13 Note that—since M and \(\eta\) (the concern for reciprocity) always appear together—the effect of varying \(\eta\) is the same as the effect of varying M. The following Lemma 1 summarizes these results.

Lemma 1

(Reciprocity) For a generous contract, \(\left( u({\tilde{w}}) - c({\hat{a}})- {\bar{u}}\right) > 0\), the worker’s optimal action \(a^*\) is implicitly defined by the first order condition

$$\begin{aligned} \frac{\partial U\left( {\tilde{w}}, a,{\hat{a}}\right) }{\partial a} = - c^{\prime }(a^*) + \eta \left( u({\tilde{w}}) - c({\hat{a}})- {\bar{u}}\right) \cdot M \cdot ER^{\prime }(a^*)=0. \end{aligned}$$
(3)

It is increasing in \({\tilde{w}}\): \(\nicefrac {\partial a^*}{\partial {\tilde{w}}} > 0\); increasing in M: \(\nicefrac {\partial a^*}{\partial M} > 0\); increasing in \(\eta\): \(\nicefrac {\partial a^*}{\partial \eta } > 0\); and \({\tilde{w}}\) and M are complements: \(\nicefrac {\partial ^2 {a^*}}{\partial {\tilde{w}}\partial M} > 0\).

The intuition for the complementarity is fairly straightforward from the utility function: Increasing the wage leaves a larger rent to the worker and increases the weight that she gives to the manager’s welfare. Due to the multiplicative structure, the worker finds it more attractive to work harder when she has a stronger impact on the manager’s surplus.
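The comparative statics of Lemma 1 can be illustrated numerically. The following is a minimal sketch under functional forms that are our own assumptions and not part of the model above: \(u(w)=\ln w\), \(c(a)=a^2/2\) and \(ER(a)=a\). With these forms the first order condition reduces to \(a^* = \eta \left( \ln {\tilde{w}} - {\hat{a}}^2/2 - {\bar{u}}\right) M\). The parameter values (including M = 1.5 as a stand-in for the manager bonus) are hypothetical and chosen only so that both wage levels yield a generous contract.

```python
import math


def optimal_effort(wage, M, eta, a_hat=1.0, u_bar=1.0):
    """Closed-form a* under the ASSUMED forms u(w)=ln(w), c(a)=a^2/2, ER(a)=a.

    a_hat and u_bar are hypothetical values, not estimates from the paper.
    """
    # Rent left to the worker: u(w) - c(a_hat) - u_bar
    rent = math.log(wage) - 0.5 * a_hat ** 2 - u_bar
    if rent <= 0:
        # Contract is not generous; with a >= 0 the worker chooses zero effort.
        return 0.0
    # First order condition: -a + eta * rent * M * 1 = 0
    return eta * rent * M


def wage_gift_response(M, eta, low_wage=13.0, high_wage=18.0):
    """Change in optimal effort when the wage rises from low_wage to high_wage."""
    return optimal_effort(high_wage, M, eta) - optimal_effort(low_wage, M, eta)


# Hypothetical illustration: M = 1.0 (no bonus) versus M = 1.5 (bonus).
for M in (1.0, 1.5):
    print("M =", M, "effort response to wage gift:", wage_gift_response(M, eta=0.5))
```

In this sketch the effort response to the wage gift is positive for any \(\eta > 0\), is larger when M is larger (the complementarity), and vanishes for \(\eta = 0\), which corresponds to the standard-preferences benchmark.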

In contrast, under the standard model of preferences (\(\eta =0\)) the first-order condition simplifies to

$$\begin{aligned} \frac{\partial U\left( {\tilde{w}}, a,{\hat{a}}\right) }{\partial a} = - c^{\prime }(a^*) < 0, \end{aligned}$$
(4)

which is always negative, since with flat wages the worker’s utility unambiguously decreases in her effort choice and her optimal action \(a_{standard}^*\) is trivially given by \(a_{standard}^* = 0.\) Increasing M or \({\tilde{w}}\) has no effect on \(a_{standard}^* = 0.\) This is summarized in Lemma 2:

Lemma 2

(Standard Preferences) \(a_{standard}^*=0\), and the corresponding comparative statics are trivially given by

$$\begin{aligned} \frac{\partial a_{standard}^*}{\partial M}=0, \quad \frac{\partial a_{standard}^*}{\partial {\tilde{w}}}=0, \quad \frac{\partial ^2 a_{standard}^*}{\partial M \partial {\tilde{w}}}=0. \end{aligned}$$

Summarizing the results from our model, we can formulate the following predictions that we try to validate in our experimental analysis. The first three follow directly from Lemma 1:

Prediction 1

Effort is increasing in wage \({\tilde{w}}\).

Prediction 2

Effort is increasing in managerial payoff M.

Prediction 3

Managerial payoff and wage are complementary in their effect on effort, \(\nicefrac {\partial ^2 {a^*}}{\partial {\tilde{w}}\partial M} > 0\).

The next prediction is straightforward: reciprocal incentives via gift-exchange work better for more reciprocally inclined subjects. In particular, low reciprocity individuals should be unlikely to exhibit a positive response to high wages, as their utility from providing effort for the principal is very low compared to their cost of effort. High-reciprocity individuals, however, are more likely to have a sufficiently strong utility benefit from returning the gift so as to induce effort.

Prediction 4

The positive effort response to a wage gift or a manager bonus is more pronounced for more reciprocally inclined (higher \(\eta\)) subjects.

4 Experimental Results

Our main performance measure is the subjects’ data entry rate: the number of characters of data entered per minute.Footnote 14 As a robustness check, in our regressions we also report the accuracy-corrected data entry rate.Footnote 15 In Table 1 we present the mean and median entry rates for the whole shift. Subjects’ overall productivity suggests that offering a high wage in the No Bonus treatments had a negative impact on effort, while it had a positive effect in the Bonus treatments. We test this effect statistically in the analysis that follows.

Table 1 Data-entry rate (Chars/min) by treatment

Figure 1 shows how productivity, measured in 10-minute intervals, evolves over time in the four treatment conditions. Though the levels differ, there seems to be no difference in the time trend. There is learning in all four treatments for the first hour, then productivity is fairly flat until the end, where there is a steep drop in all treatments. This drop is mostly ‘technical’: Subjects finished a line and did not start a new one that they were unlikely to finish before the end of their shift.

As can be seen in Table 2 there are considerable differences in underlying ability (the typing speed score) between treatments. Hence, a direct comparison of the overall productivity between treatments is somewhat misleading. We address this below by directly controlling for ability in our main regressions.

Table 2 Typing score
Fig. 1 Performance over time. The x-axis shows the 10-minute sub-periods and the y-axis shows the average data entry rates (characters/minute) by treatment. $13/Low refers to the $13 Wage/No Bonus treatment, $18/Low refers to the $18 Wage/No Bonus treatment, etc.

4.1 Regression Analysis of Treatment Effects

Our sample size in terms of participating subjects is only somewhat bigger than in preceding studies: e.g., Gneezy and List (2006) or Bellemare and Shearer (2009); but importantly, our design allows us to use detailed additional information to estimate more precisely the treatment effects. Most importantly we have a measure of data-entry ability from the temp agency. The agency has temp workers take a typing speed test upon hiring, and the agency made this information available to us. Since Table 2 indicates substantial heterogeneity in the underlying ability distribution across treatments we proceed by including controls for worker ability. Additionally, we have worker productivity in our task at 10-min and 30-min intervals. We will use the 10-min data as the units of observation in this paper. The results for the 30-min data are quantitatively similar but less precisely estimated and are available upon request.Footnote 16 Table 3 presents the results of a GLS estimate with a heteroskedastic panel structure and AR(1) errors. A Wooldridge test for serial correlation finds significant autocorrelation (\(p < 0.001\)) while a Likelihood Ratio test suggests panel heteroskedasticity (\(p < 0.001\)).

Table 3 Regression: performance (10-min. Periods)

The two alternate effort measures presented in specifications (1) and (2) in Table 3 give qualitatively and quantitatively very similar estimates, which suggests that the response to the treatment variations does not affect quality but only influences the quantity margin. Contrary to the fundamental gift-exchange intuition, we find a significant and negative effect of the wage gift on effort and no significant effect of the manager bonus. However, the interaction High Wage × Manager Bonus is significantly positive and large in size, which results in an overall positive gift-exchange in the Bonus treatment: The effect of a wage gift depends importantly on the characteristics of the job context, and there is a strong complementarity between the wage gift and the magnitude of the managerial payoff.Footnote 17
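To make the reading of the interaction term concrete, the sketch below shows how the wage effect in each bonus condition relates to the coefficients of a saturated specification with High Wage, Manager Bonus and their interaction. It deliberately ignores the ability controls and the GLS error structure used in Table 3, and the cell means are purely hypothetical placeholders, not values from Tables 1 or 3.

```python
def decompose_wage_effect(cell_means):
    """Decompose cell means into wage effects by bonus condition.

    cell_means maps (high_wage, bonus) in {0, 1} x {0, 1} to a mean entry rate.
    In a saturated model y = b0 + b1*HW + b2*B + b3*HW*B (no other controls):
      b1      = wage effect without the bonus,
      b1 + b3 = total wage effect with the bonus,
      b3      = interaction (complementarity) term.
    """
    effect_no_bonus = cell_means[(1, 0)] - cell_means[(0, 0)]   # b1
    effect_bonus = cell_means[(1, 1)] - cell_means[(0, 1)]      # b1 + b3
    interaction = effect_bonus - effect_no_bonus                # b3
    return effect_no_bonus, effect_bonus, interaction


# HYPOTHETICAL cell means (characters per minute), for illustration only.
example_means = {(0, 0): 100, (1, 0): 90, (0, 1): 100, (1, 1): 115}
print(decompose_wage_effect(example_means))
```

With a negative wage coefficient and a larger positive interaction, as in this made-up example, the total wage effect in the Bonus condition is positive even though the wage effect without the bonus is negative.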

4.2 Treatment Effects by Agreeableness

As we have information on workers’ scores in a “Big 5” personality test, we can analyze how effects vary along this dimension. The “Big 5” personality factors capture five distinct aspects of an individual’s personality: extraversion; agreeableness; conscientiousness; emotional stability (also referred to as its inverse, neuroticism); and imagination (or openness to experience). The five factors are typically measured by having subjects read a series of statements, such as “I pay attention to details”, and rate them on a scale from “strongly agree” to “strongly disagree”, where each statement is associated with one of the factors. Extraversion represents an individual’s sociability and engagement with the external world. Extraverts are commonly seen as energetic and action oriented. Extraversion is positively associated with scale items such as “I am the life of the party” and negatively associated with items such as “I don’t talk a lot.” Agreeableness represents an individual’s interest in social harmony and getting along with others. Agreeable individuals are seen as helpful, trusting and optimistic. Agreeableness is positively associated with items such as “I take time out for others” and negatively associated with items such as “I am not interested in other people’s problems.” Conscientiousness represents an individual’s self-discipline and inclination to follow rules and instructions. Conscientious people are seen as honest, hardworking, able to control their impulses, and inclined to plan. Conscientiousness is positively associated with scale items such as “I am always prepared” and negatively associated with items such as “I shirk my duties.” Emotional stability reflects an individual’s calmness and low propensity to feel negative emotions. Conversely, neurotic individuals are emotionally reactive to stress and frustration, inclined towards a bad mood and anxiety. 
Emotional stability is positively associated with scale items such as “I am relaxed most of the time” and negatively associated with items such as “I worry about things.” Finally, imagination represents an individual’s attraction to unusual ideas, adventure, artistic expression and a variety of experiences. Imaginative people are seen as intellectually curious, sensitive to beauty, and willing to try new things. Imagination is positively associated with scale items such as “I am full of ideas” and is negatively associated with scale items such as “I am not interested in abstractions.” We use a standard 50 question scale, and Table 4 presents summary statistics for our sample. Our results are within the typical range and are similar to the results for a standard laboratory subject pool in our companion paper; Englmaier and Leider (2012b). For comparability between factors, in all of the analyses below we use z-scores to standardize each factor.
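The z-score standardization mentioned above can be sketched as follows. The raw scores are hypothetical, and the use of the sample standard deviation (rather than the population one) is our assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def z_scores(raw_scores):
    """Standardize one personality factor: subtract the mean, divide by the SD.

    Uses the sample standard deviation (n - 1 denominator); this choice is an
    assumption, not something stated in the paper.
    """
    m = mean(raw_scores)
    s = stdev(raw_scores)
    return [(x - m) / s for x in raw_scores]


# HYPOTHETICAL raw agreeableness scores for five subjects.
agreeableness_raw = [30, 35, 40, 45, 50]
print(z_scores(agreeableness_raw))
```

After standardization each factor has mean zero and unit standard deviation, so coefficients on different factors are expressed in comparable units.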

Table 4 Summary of the personality traits of the subjects

In our study we identify subjects who score highly on the trait “agreeableness”, which had been shown experimentally to relate to standard lab measures of reciprocity; cf. Ben-Ner et al. (2004) or Ashton et al. (1998). While Ben-Ner et al. (2004) and Ashton et al. (1998) also find some evidence that “openness” and “emotional stability” may relate to reciprocity as well, the relationship between reciprocity and agreeableness was robust across specification and sample. The results in Englmaier and Leider (2012b) also show this correlation. Empirical evidence suggests that agreeable individuals not only reciprocate more, but are also more trustworthy and altruistic. However, our model of reciprocal workers is to be understood as an illustration of one particularly relevant motivational mechanism in our context.

As opposed to the trust game or other lab measures of reciprocity, personality tests such as the “Big 5” are quite common in the hiring practices of firms. In particular, high agreeableness corresponds with one of the criteria that Autor and Scarborough (2008) identify in the hiring practice of the firm that they study, where the firm gave hiring preference to applicants with positive z-scores for agreeableness, conscientiousness, and extraversion. The separation of our subjects along the agreeableness dimension, which proxies for a split by reciprocity, allows us to examine whether our effects are in fact driven by the reciprocal subjects, as is suggested by our model. Exploiting the population heterogeneity along the reciprocity dimension, and in doing so testing a key tenet of the reciprocity models, was generally not done in early studies on gift-exchange, mostly because data on these preferences were lacking. Examples of more recent work that pays attention to this heterogeneity and that finds support for heterogeneous treatment effects in line with reciprocity models are Englmaier and Leider (2012b) and Cohn et al. (2015).

Turning to the results of our study, first note that there is no correlation between our agreeableness measure and the Typing Speed Test Score (Spearman's \(\rho = 0.0482\), \(p = 0.7170\)). Table 5 shows our basic regression analysis for the sample split into high-agreeableness (strongly reciprocal; top third) and low-agreeableness (weakly reciprocal; bottom two thirds) workers.Footnote 18 Again, there are substantial differences in the response to the treatment variation across the agreeableness dimension. The negative response to the wage gift in the No Bonus treatment is driven entirely by the low-agreeableness workers. Additionally, the low-agreeableness workers show no overall response to the wage gift in the Bonus treatment. However, note that the complementarity effect between higher wage and manager bonus is significantly positive and substantial for the low-agreeableness group. While puzzling at first glance, this result is in line with theoretical arguments and empirical findings in Englmaier and Leider (2012b) and Englmaier et al. (2014): intuitively, low-agreeableness workers are not (necessarily) unresponsive to gift-exchange; however, they need a bigger boost to react, which is provided by combining the wage gift with the improved ability to reciprocate that the manager bonus creates. In contrast, high-agreeableness workers already react to either one separately, as indicated by the (admittedly noisily estimated) coefficients in the regressions. These high-agreeableness workers also have a significantly positive response in the Bonus treatment (indicated by the significant total effect reported at the bottom of Table 5). Therefore, high-agreeableness workers can be induced to exert significantly more effort than in the low-wage control treatment; but they have, in some sense, a more gradual reaction to the changes in the work setting than low-agreeableness workers, who react primarily to the combination of high wage and manager bonus.
Overall, our results are consistent with our model's prediction that a response to a wage gift should come primarily from reciprocal workers. What remains puzzling is the strong negative effect of the wage gift among the low-agreeableness subjects. We will return to this topic in Sect. 5.
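
To make the terms “complementarity effect” and “total effect” concrete, it may help to write the split-sample regressions in a stylized form. The following is only an illustrative sketch: the symbols \(y_{it}\), \(HW_i\), \(MB_i\), and \(\beta_0, \ldots, \beta_3\) are assumed notation, and the exact specification and controls are those reported with Table 5, not this simplified version. Let \(y_{it}\) denote the output of worker \(i\) in 10-minute period \(t\), and let \(HW_i\) and \(MB_i\) be dummies for the High Wage and Manager Bonus conditions:

\[
y_{it} = \beta_0 + \beta_1 \, HW_i + \beta_2 \, MB_i + \beta_3 \, (HW_i \times MB_i) + \varepsilon_{it}
\]

In this notation, the response to the wage gift is \(\beta_1\) in the No Bonus treatment and \(\beta_1 + \beta_3\) in the Bonus treatment, while \(\beta_3\) alone measures the complementarity between wage gift and manager bonus. Read this way, the pattern described for low-agreeableness workers corresponds to \(\beta_1 < 0\) and \(\beta_3 > 0\) with \(\beta_1 + \beta_3\) indistinguishable from zero, whereas for high-agreeableness workers the sum \(\beta_1 + \beta_3\) is significantly positive.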

Table 5 Regression: performance (10-min. Periods) by agreeableness score

4.3 Treatment Effects over Time

Finally, we consider how productivity evolves over time. Since Gneezy and List (2006) document a rapidly disappearing gift-exchange pattern, we are particularly interested in how the effects of the treatment variations evolve over the course of the experiment. We split our data into first- and second-half observations and report the base regressions separately in Table 6. Again, the qualitative results are robust to different divisions of the data.Footnote 19 In both treatments, the wage gift in fact has a more positive effect in the second half of the experiment. The negative response to the wage gift in the No Bonus treatment exists only in the first half; it disappears in the second half. Similarly, the positive effect in the Bonus treatment is present only in the second half of the experiment. However, the strong complementarity between the treatments (High Wage × Manager Bonus) is very stable over time. This suggests that the positive response to gift-exchange does not have to be a short-lived phenomenon.

Table 6 Regression: performance (10-min. Periods) by 1st/2nd half

5 Discussion of the Results

Previous experiments on gift-exchange in the field studied a variety of job situations (e.g., one-time jobs versus ongoing jobs); job tasks (e.g., data entry, fund raising, and tree planting); and worker pools (e.g., students versus full-time employees). Our results suggest that it may not be surprising that the observed effect of surprise wage increases varies across these studies.Footnote 20

We demonstrate that a positive gift-exchange can exist, but that its presence depends on the characteristics of the job. Therefore we should not expect gift-exchange to be present in all job settings. In particular, the results from our field study suggest that the extent to which the manager directly benefits when subordinates produce high output plays an important role for the efficacy of reciprocity and gift-exchange in the field. We therefore anticipate that gift-exchange should be most prominent in settings where managers have a strong direct benefit from employee effort, e.g., if they have performance-based monetary incentives. An analysis of the UK Workplace Employment Relations Study (WERS) by Englmaier et al. (2016) finds some support for this. With representative administrative data on HR practices becoming increasingly available, it would be interesting to further study the richer predictions of reciprocity-based models for organizational design; cf. Englmaier and Leider (2012a).

The negative effect on effort in our No Bonus treatment is puzzling. It is possible that, absent an indication that high effort is important to the manager, the surprise high wage appears wasteful. Subjects might interpret the out-of-context wage increase as a mistake on our side, which would cast doubt on our managerial aptitude. From this they might also have revised downward their beliefs about our ability or willingness to “monitor their work”. This suggests that a manager whose incentives are tied to output puts the decision to offer a higher wage into context and thereby avoids this negative updating on aptitude.

While we did not ex ante predict that the effect of a gift would get stronger over time, a few explanations are possible: First, we observe some level of learning in all treatments. It may take some time before workers learn their cost of effort, which determines how willing they are to exert effort to provide a return gift to the principal. Additionally, perhaps gift-exchange raises productivity by increasing persistence, rather than maximal effort. As previously mentioned, we see no treatment effect on quality, which suggests that workers need not improve on all dimensions of effort. Last, workers may have been waiting to see if another surprise was coming before completely determining their effort level.

As in previous experiments, e.g., DellaVigna et al. (2016), we do not find that the increase in worker wages “pays for itself” in increased productivity: the 40% increase in wage garnered only a 10% increase in productivity. However, we were using completely flat wages to pay the workers. The theoretical model in Englmaier and Leider (2012a) suggests that the level of reciprocity that is needed to induce effort under a flat wage is generally larger than the amount of reciprocity that can reduce agency costs when paired with explicit monetary incentives. A flat wage is generically far from the optimal contract, and so it is quite possible that any given level of reciprocity could be much more powerfully leveraged with a more sophisticated incentive scheme. Bellemare and Shearer (2009) find that a fixed gift of $80 (40% of earnings) on top of the standard piece rate increased productivity by an amount worth $40. As in our study, the size of the gift was chosen arbitrarily; hence fine-tuning the mix of gifts and explicit incentives is likely to be critical. Going forward, establishing that gift-exchange not only works conceptually but can also be profit-enhancing is one of the key challenges for this literature.
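
The “pays for itself” comparison can be made explicit with a back-of-the-envelope calculation. This is a sketch under assumptions that are not part of the study: let \(V\) denote the value to the firm of a worker's baseline output and \(W\) the baseline wage, and suppose that the value of output scales linearly with productivity. A 40% wage increase that raises productivity by 10% then covers its own cost only if

\[
0.10 \, V \ge 0.40 \, W \quad \Longleftrightarrow \quad \frac{V}{W} \ge 4 ,
\]

that is, only if baseline output is worth at least four times the baseline wage. By the same accounting, the gift in Bellemare and Shearer (2009) returned \(40/80 = 0.5\) dollars of additional output value per dollar of gift.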

6 Conclusion

The importance of fairness and social preferences, especially for the work relationship, has long been documented in the lab. However, a number of studies have highlighted that the empirical importance of reciprocity as a motivating force in the field crucially depends on details of the environment. The wage gift has to stand in the proper context to generate a motivating effect. Based on a principal-agent model of reciprocal motivation, we argue that a key determinant of the efficiency of reciprocal motivation is the ability of an agent to repay a gift, that is, the benefit that accrues to the principal from high effort.

Our results confirm the importance of the specific characteristics of the job-setting in generating gift-exchange-based incentives. If the manager stands to benefit much from additional effort, there is a strong positive effort response; while in the absence of the managerial benefit, we find no or even a negative response to the wage gift. This indicates a strong complementarity between the wage gift and the managerial payoff in generating incentives for employee effort.

Additionally, we find that the positive response to high wages comes primarily from high-reciprocity workers (as identified by a personality test), while the negative response comes from low-reciprocity workers. We find no weakening of positive responses to wage gifts over time. To the contrary, any negative effects of the wage gift disappear in the later stages of the task and we find an overall strongly positive effect of our treatment manipulations.

Our study indicates that employing agents’ reciprocity as a part of a firm’s personnel policy is a viable alternative that can be implemented successfully. However, as highlighted by, e.g., Ichniowski and Shaw (2003) or Bartling et al. (2012), it is important that various complementary parts of the firm’s compensation and HR policy are coordinated to maximize the effect of reciprocity. For example, firms that wish to employ reciprocal incentives may want to select for reciprocal motivations during hiring. An interesting implication of our study is that performance-related pay for middle management should be part of a remuneration policy even absent moral-hazard problems at that hierarchy level.

Our results suggest several avenues for future research: Further empirical work could identify the optimal magnitude of the gift and the proper mix between reciprocal and explicit motivation to maximize the profitability of gift-exchange.