1 Introduction

Delegation is an important management tool to increase productivity by exploiting comparative advantage in skills (Holmström, 1977). Unfortunately, asymmetric information about individuals’ productivity is a significant barrier to efficient delegation. Communication, as a potential remedy, plays an essential role in delegation processes. In theory, the effectiveness of cheap-talk communication crucially depends on the alignment of interests held by the parties involved (Crawford & Sobel, 1982). Information only flows in equilibrium and can only improve efficiency if the objectives of sender and receiver are not perfectly opposed. With perfectly opposed objectives, in equilibrium, delegators will disregard messages due to the inherent incentive of applicants to lie.

It is unclear if the incentive to lie prevents information transmission in reality. There is considerable evidence that humans often reveal private information through messages in individual decision making (e.g., Fischbacher & Föllmi-Heusi, 2013; Gneezy et al., 2018) and in games (e.g., Gneezy, 2005; Hurkens & Kartik, 2009; Sutter, 2009) even if they are incentivized not to do so. A recent meta-analysis of 90 experimental studies on patterns of reporting behaviour shows that subjects forgo, on average, about three-quarters of the potential gains from lying (Abeler et al., 2019). With this evidence in mind, we ask if cheap-talk communication can make delegation more efficient even if the agents have opposing interests. In order to answer this question, we analyze behaviour in an experimental cheap-talk delegation game with real effort, where only equilibria without information transmission exist. We are particularly interested in the effect of different message spaces on information transmission and efficiency. We allow for gender and ethnicity stereotypes to impact delegation decisions by using avatars that indicate our participants’ gender and ethnic backgrounds.

We employ a one-principal, two-agents setting that captures the prevalent competitiveness among candidates in real-world delegation processes. Delegation processes we have in mind are outsourcing, delegating a task to consultants, or hiring an individual for a specific task. Our setting is less characteristic of delegation within a firm where there is more verifiable information and delegation happens repeatedly. Our delegation game is played once and learning is not possible. In our experiments, a delegator has to decide if she wants to delegate the task of adding numbers that pays a piece rate to one of two applicants or to perform the task herself. She makes the decision after receiving cheap-talk messages from applicants about their abilities. Applicants are paid a bonus if given the task and therefore have an incentive to send messages that achieve delegation. Delegators profit from delegating if the chosen applicant’s performance in the addition task exceeds their own to an extent such that improved performance at least covers the delegation bonus. Our design eliminates the effects of moral hazard, wealth effects and the delegators’ uncertainty about their own performance. Standard theory in the arising game predicts that messages do not contain information and thus cannot help improve efficiency.

Our treatments are designed to investigate if information transmission occurs contrary to the theoretical prediction and if the message space available to the applicants impacts the amount of information transmitted. In the baseline treatment, senders submit a natural number to claim how many correct sums they calculated when they previously completed the task. This treatment provides us with a benchmark to quantify the size and frequency of misreports and the magnitude of efficiency gains due to delegation. We observe that a large share of senders (almost 50%) truthfully report their past performance, while those who lie exaggerate only modestly. The modest and systematic lying implies that messages contain information, which can increase social welfare if extracted and acted upon by delegators. Unfortunately, delegators are not able to use the information contained in the messages and act as if the messages were uninformative. The delegation option is still welfare increasing, as delegators who know that they are bad at calculating sums are more likely to delegate. We observe a significant efficiency gain of 13.6% compared to the welfare level that would have resulted without delegation. Applicants with avatars indicating an Asian ethnicity or a different gender than the delegator are more likely to gain delegation. In the absence of information extraction, stereotypes are behaviourally relevant.

In the Interval treatment, we introduce coarseness by partitioning the message space into intervals of a fixed length. We find that the amount of information contained in messages is not statistically different from that in the baseline (Number) treatment. However, compared to the Number treatment, delegators make better use of the information by conditioning their delegation decisions on the messages in a profitable way. As a result, we observe an 11.8% efficiency improvement over the baseline treatment.

In the Text treatment, we enrich the message space by allowing for natural language in messages. In contrast to theory, we observe that free-text messages affect both messaging and delegation behaviour. After analyzing the content of messages, we conclude that senders use different messaging modes depending on their actual past performance. Delegators can extract additional information based on the mode of messaging. This leads to an efficiency improvement of 15.8% compared to the baseline treatment.Footnote 1 Stereotypes do not influence delegation decisions in either the Text or the Interval treatment.

Our paper is related to the growing experimental literature that studies cheap-talk games with asymmetric information. One research program investigates information transmission with imperfectly aligned incentives. Papers typically focus on the class of simplified one-sender, one-receiver cheap-talk games (e.g., Cai & Wang, 2006; De Haan et al., 2015; Dickhaut et al., 1995; Peeters et al., 2013; Sánchez-Pagés & Vorsatz, 2007; Wang et al., 2010). The main finding in this literature is the persistence of non-equilibrium behaviour. Compared to equilibrium, the senders’ messages contain too much information, and receivers rely too much on the messages being truthful. Some recent papers (e.g., Bayindir et al., 2020; Goeree & Zhang, 2014; Lohse & McDonald, 2021; Vespa & Wilson, 2016; Minozzi, 2018) investigate how adding an additional sender impacts information transmission. The results heavily depend on the specific setting.

Several recent studies explore the role of communication in applied settings other than delegation. Lundquist et al. (2009) investigate how promises affect buyer and seller interactions in a bargaining game. They find that freely formulated messages lead to the fewest lies and the most efficient outcomes. Charness and Dufwenberg (2011) study if communication can alleviate hidden information problems embedded in principal-agent relationships and find that free-form communication is effective in promoting efficient contracting in some cases but not in others. Serra-Garcia et al. (2011) study information transmission in a public-good game with different levels of precision in language and find that leaders frequently use vagueness to hide inconvenient truths.Footnote 2

The remainder of the paper is organized as follows. The next section explains why standard theory predicts that no information transmission takes place. Section 3 explains the experimental design and lays out the hypotheses. The results and a brief conclusion follow in Sects. 4 and 5.

2 No information transmission in delegation with cheap talk

We begin by laying out the logic of why no efficiency-enhancing information transmission occurs in the equilibrium of our cheap-talk game. Suppose that there are a delegator and two potential applicants. All three privately know their productivity for a specific task since they have performed it before. The delegator needs the task performed and can either do it herself or delegate the task to one of the applicants. The higher the productivity of the person performing the task, the higher is the revenue for the delegator. Applicants prefer to be chosen for the task as they then receive a bonus from the delegator. The delegator asks the applicants to send a costless, non-verifiable message. Upon receiving the messages, the delegator decides to whom to delegate or to perform the task herself. In this setting, efficiency-enhancing information transmission is not possible, as low-productivity senders have an incentive to imitate the messages of high-productivity senders, which then implies that a rational delegator should disregard the messages. This result is independent of the message space applicants can use when sending their messages.

More formally, the argument goes as follows. Suppose there were two messages a and b, where the delegator is more likely to delegate to a person sending message a. Then nobody would ever want to send message b. This implies that in equilibrium no two messages that lead to different likelihoods of gaining delegation will both be sent with positive probability. In order to make sure that the delegator finds it optimal to make the same delegation decision regardless of the message sent, she must assign the same expected underlying productivity to all messages, which implies that messages cannot be informative. The interested reader can find a formal treatment of the problem in Appendix A.1.
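The deviation logic in this argument can be illustrated with a minimal sketch. The two sender types, the messages "a" and "b", and the delegation probabilities below are hypothetical values chosen for illustration only. They are not taken from the formal treatment in the Appendix.

```python
def best_messages(delegation_prob):
    """Return the set of messages that maximise the chance of gaining delegation."""
    top = max(delegation_prob.values())
    return {m for m, p in delegation_prob.items() if p == top}


def senders_best_respond(strategy, delegation_prob):
    """Check that every sender type sends a message with maximal delegation probability.

    strategy maps a sender type to the message it sends.
    delegation_prob maps a message to the (hypothetical) delegation probability.
    """
    best = best_messages(delegation_prob)
    return all(message in best for message in strategy.values())


# Hypothetical separating profile: low types send "b", high types send "a".
separating = {"low": "b", "high": "a"}
# A delegator who believed this profile would delegate more often after "a".
prob_if_believed = {"a": 0.6, "b": 0.1}

# Pooling profile: messages do not change the delegation probability.
pooling = {"low": "a", "high": "a"}
prob_pooling = {"a": 0.3, "b": 0.3}

print(senders_best_respond(separating, prob_if_believed))  # False: low types deviate to "a"
print(senders_best_respond(pooling, prob_pooling))         # True: nobody gains by deviating
```

The sketch only checks the senders' side of the argument. The delegator's side, that equal delegation probabilities require equal expected productivity across messages, is not modelled here.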

3 Experimental design

We present the design of an experiment by which we can test both the theoretical predictions and a variety of alternative hypotheses. We require a simple environment that allows us to investigate if and which kind of messages improve social welfare in a delegation environment. We decided to use a real-effort task that requires non-negligible mental effort and innate ability instead of a task with stated effort and induced effort cost. This choice was guided by the aim to capture the salient factors of the delegation of tasks in real-world environments. The real-effort task chosen is the repeated addition of five two-digit numbers (see Niederle & Vesterlund, 2007), which has been shown to allow for plenty of variation in individual performance. The use of a real-effort task has advantages and disadvantages. By using actual ability in a real-effort task instead of inducing types, we surrender control over the distribution of types and prior beliefs. In exchange, we gain salience, as the messages subjects send and the beliefs they hold are concerned with the task they have experienced first-hand.

The participants complete the seven-minute addition task twice. After choosing an avatar that best represents gender and ethnicity and answering some priming questions (details on this follow below), participants are asked to complete the task for the first time. Completing the task allows participants to learn their abilities and form beliefs about their abilities relative to others. Participants are paid a piece rate for the number of sums solved in seven minutes. At this stage, participants do not know what is to follow.

Once the addition task has been completed for the first time (Task 1), new instructions explaining the delegation stage are distributed. Participants are randomly matched into groups of three. One participant of the group is randomly assigned the role of a delegator. The other two will play the role of applicants. The applicants are asked to send a message to the delegator, which signals their ability in the addition task. Depending on the treatment, the message can either be a number (sums solved in Task 1), one of a range of predetermined intervals (of sums solved in Task 1) or free text up to 400 characters. The delegator sees the avatars and messages of the applicants and then decides whether to delegate the addition task to one of the two applicants. Delegation implies that the delegator’s performance in the already completed task is being replaced by the applicant’s performance in the future task.

Once the delegation decision has been made, all participants perform the addition task a second time (Task 2). The payment scheme is identical to that used the first time round. The number of solved sums is multiplied by a piece rate. All three participants, regardless of their role, receive the payment for their performance in Task 2. The payoffs for Task 1 depend on the delegator’s decision. In case no delegation has taken place, the payoff for Task 1 is the piece-rate payment for the participants’ own performance. If delegation has taken place, the chosen applicant receives a bonus paid by the delegator in addition to the own piece-rate pay. In return, the delegator’s performance in Task 1 is replaced by that of the chosen applicant in Task 2. Hence, a delegator increases overall efficiency if she delegates to an applicant who solves more sums in Task 2 than she did in Task 1. She improves her own payoff if the higher performance of the chosen applicant at least covers the bonus she has to pay. Table 1 summarizes the payoffs in case delegation takes place.

Table 1 Payoffs if the delegator delegates to applicant A

Our design has several features that rule out potential confounds. Paying a piece rate in Task 2 even if an applicant achieves delegation prevents moral hazard. Potential income effects are minimized by randomly choosing one task for payment. Moreover, we remove the potentially confounding factor of delegators’ uncertainty about their own ability by having delegation replace their past instead of their future performance.
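The payoff rules described above can be sketched in a few lines. The bonus of 4 Australian dollars is the value reported in the Results section. The piece rate of 1 Australian dollar per sum is an assumption, inferred from the statement that four sums are required to pay for the bonus. The role labels and the example performances are hypothetical.

```python
PIECE_RATE = 1.0  # assumption: AUD per correct sum, inferred rather than stated
BONUS = 4.0       # AUD, delegation bonus reported in the Results section


def task1_payoffs(perf1, perf2, chosen=None):
    """Payoffs for Task 1, which depend on the delegation decision.

    perf1, perf2: dicts mapping "delegator", "A", "B" to sums solved in Task 1 and Task 2.
    chosen: None for no delegation, otherwise "A" or "B".
    """
    # Without delegation everybody is paid for their own Task 1 performance.
    pay = {role: PIECE_RATE * sums for role, sums in perf1.items()}
    if chosen is not None:
        # The delegator's Task 1 performance is replaced by the chosen
        # applicant's Task 2 performance, and she pays the bonus.
        pay["delegator"] = PIECE_RATE * perf2[chosen] - BONUS
        # The chosen applicant keeps the own piece-rate pay and adds the bonus.
        pay[chosen] += BONUS
    return pay


def task2_payoffs(perf2):
    """Payoffs for Task 2: everybody is paid for their own performance."""
    return {role: PIECE_RATE * sums for role, sums in perf2.items()}


# Hypothetical performances for one group of three.
perf1 = {"delegator": 10, "A": 15, "B": 12}
perf2 = {"delegator": 12, "A": 17, "B": 13}

print(task1_payoffs(perf1, perf2))              # no delegation
print(task1_payoffs(perf1, perf2, chosen="A"))  # delegation to applicant A
```

In this hypothetical group, delegation to A raises the sum of Task 1 payoffs from 37 to 44, which equals the difference between A's Task 2 performance (17) and the delegator's Task 1 performance (10). The bonus is a pure transfer and does not affect the total.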

In addition to the direct financial incentives, we introduce a social dimension by using a set of priming questions and avatars to make ethnic and gender cues salient. The priming questions sensitize subjects’ perception of gender differences through activating natural social identities.Footnote 3 After priming, eight avatars representing males and females of four ethnic backgrounds are shown. Participants are asked to choose one that best represents them.Footnote 4 Groups of three are formed randomly with the restriction that they are not single-sex. The delegator is randomly chosen from the gender that occurs twice in the group. We obtain groups with either a female or male delegator who can always choose between a male and a female applicant. Salient gender and ethnicity cues from the avatars allow for gender and ethnicity stereotypes to potentially influence delegation decisions. Table 2 summarizes the design and the rationale for its components.

Table 2 The stages of the experiment

3.1 Treatments

Our study consists of three treatments. In the baseline treatment, players send a precise message of the form “The number of correct answers I had in the addition task was x.” We call this baseline the Number treatment. The message space in the Number treatment is a copy of the type space and consists of non-negative integers up to a maximum. In this treatment, the message space has exactly the same cardinality as the type space, which is the minimum size for full information transmission to be technically possible.

In our second treatment, we partition the message space into intervals to study situations where it is impossible to communicate precisely. In the Interval treatment, players send a message of the form “The number of correct answers I had in the addition task was in the range from a to b.” The intervals senders can choose are predetermined. There are different options for designing the intervals. The intervals could be fixed relative to the natural numbers (i.e. everybody faces the same intervals, say 0 to 3, 4 to 7, etc.) or fixed relative to the performance of the delegator. Fixing the intervals relative to the natural numbers has the disadvantage that the inference problem the delegators face differs according to their past performance. Hence, we decided to fix the intervals relative to the delegator’s past performance.

The delegators’ performance is located on the upper end of an interval, and the length of the interval equals the number of questions (i.e. four) required to pay for the bonus.Footnote 5 This design implies that delegation is profitable whenever the actual performance of the chosen applicant lies in an interval at least two above that of the delegator. This is the same for all delegators. An advantage of this design is that the location of the delegator’s performance within an interval can be eliminated as a cause for behavioural differences across delegators. In the Interval treatment, the message space is of lower cardinality than the type space, and full information transmission is technically impossible.
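The interval construction can be sketched as follows. The interval length of four and the placement of the delegator's performance at the upper end of an interval follow the description above. The truncation of the lowest interval at zero, the upper bound `max_perf`, and the function names are assumptions made for illustration.

```python
INTERVAL_LENGTH = 4  # number of sums required to pay for the bonus


def intervals_for(delegator_perf, max_perf):
    """Return inclusive (low, high) intervals anchored on the delegator's performance.

    The delegator's performance is the upper end of one interval. The lowest
    interval is cut off at zero (an assumption, not stated in the text).
    """
    high = delegator_perf % INTERVAL_LENGTH
    intervals = []
    while True:
        low = max(0, high - INTERVAL_LENGTH + 1)
        intervals.append((low, high))
        if high >= max_perf:
            break
        high += INTERVAL_LENGTH
    return intervals


def intervals_above(delegator_perf, applicant_perf):
    """Number of intervals by which the applicant lies above the delegator's interval."""
    # Ceiling of (applicant_perf - delegator_perf) / INTERVAL_LENGTH using integer arithmetic.
    return -((delegator_perf - applicant_perf) // INTERVAL_LENGTH)


# Hypothetical delegator who solved 13 sums in Task 1.
print(intervals_for(13, 25))
print(intervals_above(13, 17))  # 1: the extra sums at most just cover the bonus
print(intervals_above(13, 18))  # 2: the extra sums exceed the bonus
```

Under these assumptions, an applicant performance two or more intervals above the delegator's interval exceeds the delegator's performance by at least five sums, which is more than the four sums needed to pay for the bonus.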

In our final treatment, we go the other way. Here, our message space exceeds the size of the type space, as we allow for free-form text messages. In the Text treatment, applicants can write a text message of up to 400 characters. Using natural language allows for more than just sending a message that communicates a number or a range of numbers. In standard game theory, message spaces that are at least as large as the type space are sufficient for maximum information transmission if preferences are aligned. However, previous experimental studies (Charness & Dufwenberg, 2006; Chen & Houser, 2017) find that free-form communication in written messages can improve cooperation by fostering trust in games with hidden action. The Text treatment allows us to test if the theoretically superfluous size of the message space helps information transmission.Footnote 6

3.2 Hypotheses

Theory predicts no information transmission in all treatments, which is at the heart of our hypotheses.

Hypothesis 1:

Senders’ messages are not correlated with actual performance in any treatment.

Hypothesis 2:

Delegators in all treatments disregard messages and base their delegation decisions only on their prior beliefs.

Hypothesis 3:

Overall efficiency does not differ across treatments.

Hypothesis 4:

Avatars representing gender and ethnicity have no impact on delegation decisions.

In light of the recent experimental literature on lie aversion (see Abeler et al., 2019, for a metastudy) or bounded rationality in cheap-talk games (e.g., Kawagoe & Takizawa, 2009), we expect that messages will contain information. Messages contain information whenever they are correlated with actual past performance. Then a delegator who knows or guesses the link between messages and past performance can update her beliefs after receiving the messages. Following the literature on lying aversion (Gneezy, 2005; Gneezy et al., 2018), we expect systematic lies of limited size, which should result in messages containing information about performance.

Hypothesis A1:

Messages contain information about the senders’ productivity. The degree of information differs across treatments.

We do not have a clear prior on whether the coarser message space in the Interval treatment leads to more or less information in messages than the Number treatment. We conjecture that more information will be contained in messages in the Text treatment. The opportunity to send free text messages allows choosing a style of the message and not only the message itself. If applicants with different levels of past performance use different communication styles, then not only the message content but also the style contains information. Earlier studies show that humans avoid lying by using imprecise messages (Serra-Garcia et al., 2011; Wood, 2016), are able to express their social motivations (Cason & Mui, 2015), and make promises in free-form communication (Charness & Dufwenberg, 2006; Chen & Houser, 2017; Vanberg, 2008). Hence, we expect messages to contain the most information in the Text treatment.

Hypothesis A2:

Delegators make use of at least some information contained in the messages.

That messages contain information is not sufficient for information transmission. The delegators must also be able to extract the information. The degree of information extraction will depend on the delegator trusting that a relation between messages and performance exists and on how accurate her guess about the nature of that relation is. If lies are modest, then information extraction depends largely on how much trust delegators put in the applicants’ truthfulness.

Past research suggests that we should expect more trust in treatments with restricted message spaces (Cai & Wang, 2006; Wang et al., 2010). Bounded rationality might play an important role here. Wood (2016) finds that the increased complexity of a vague message space makes subjects less sophisticated in their reasoning and leads them to trust more in messages compared to a message space that only allows precise messages. Therefore, we expect more information transmission in the Interval treatment than in the Number treatment. We further conjecture that the uncertainty of the location of the actual performance in a messaged interval might be more salient than the uncertainty about its truthfulness. Evidence suggests that people do not take strategic uncertainty sufficiently into account when it is confounded with risk (Huberman & Rubinstein, 2001).

Finally, we conjecture that subjects will make good use of messages in the Text treatment by picking up cues contained in the chosen format of messages. Research on deception shows that people are able to read cues and spot a lie in informal written communication (Chen & Houser, 2017) and face-to-face communication (Belot et al., 2012; Konrad et al., 2014). In sum, we conjecture the ranking of the degree of information transmission across treatments to follow the order Number < Interval < Text. As efficiency depends predominantly on information extraction if the information content of the message is comparable, we formulate the following alternative hypothesis on efficiency:

Hypothesis A3:

Social welfare is greatest in the Text treatment followed by the Interval treatment and the Number treatment.

Research in psychology has shown that there are relatively stable ethnic and gender stereotypes (e.g., Garg et al., 2018), such as “Asians are good at maths and women are not.” Additionally, there is an underrepresentation of females (e.g., Ortiz-Ospina, 2018) and ethnic minorities in leadership positions (Eagly & Chin, 2010). It is possible that these two facts are causally linked. Hence, we hypothesize that stereotypes activated by avatars and priming questions impact delegation decisions and negatively impact efficiency in our environment.

Hypothesis A4:

The likelihood of achieving delegation is influenced by the avatar of the applicant.

4 Results

Treatments were programmed using z-Tree (Fischbacher, 2007) and carried out at AdLab, the Adelaide Laboratory for Experimental Economics. Participants were recruited with the help of the online system ORSEE (Greiner, 2015). Overall, 342 subjects, who were predominantly university students, participated in the experiments and earned on average 15.4 Australian dollars.

The bonus for gaining delegation was 4 Australian dollars.Footnote 7 Subjects were also asked two belief-eliciting questions after they completed Task 1. First, we asked participants for their beliefs about the percentage of people in the session who solved more sums correctly than they did. This elicitation was incentivized by 5 Australian dollars paid to the person with the guess closest to the truth. The second question elicited the participants’ beliefs about how well they would be able to perform the same task the second time.

We ran five sessions for each treatment with between 15 and 27 subjects per session. Overall, we had 117 individuals in the Number treatment, 117 in the Interval treatment, and 105 in the Text treatment. The majority of participants (324 out of 339) chose avatars that coincided with their real gender. On average, participants calculated 13 correct sums in Task 1 and 16 in Task 2 and correctly anticipated their performance increase. We also observed some aggregate overconfidence. On average, participants believed that only 38% of the population outperformed them. There is no statistical difference in performances and beliefs between male and female participants (two-sided rank-sum test, \(p>0.1\)).

4.1 Information content of messages

There are 71, 70, and 65 available sender-subjects in the Number, Interval, and Text treatments, respectively.Footnote 8 Figure 1 plots the applicants’ messaged performance against their actual performance. Note that numerical information in messages is represented as points in the Number treatment, intervals in the Interval treatment, and a selection of points in the Text treatment (only subjects who mentioned a number are represented). Hypothesis 1, drawn from standard game theory, implies that messages and actual performance should not be correlated. Rejecting Hypothesis 1, we observe a very regular pattern. The majority of the data points lie on or slightly above the 45-degree line, indicating truthful reports or moderate but systematic lies. In all three treatments, there is considerable information contained in the messages.

Fig. 1 Messaged performance vs. actual performance. Sample size: 71 in the Number treatment, 70 in the Interval treatment and 30 in the Text treatment

The intuition gained from a visual inspection is confirmed by regression models that control for subjects’ prior beliefs about their relative performance and individual characteristics. Table A.1 in the Appendix presents estimation results from OLS regressions in the Number and Text treatments and an interval regression in the Interval treatment. Beyond the insights from plotting, we learn that the coefficients on the prior belief variable for all treatments are close to zero and not significant. This suggests that subjects did not try to improve their delegation chances by increasing the size of their lies whenever they believed themselves to be relatively noncompetitive.Footnote 9

Next, we compare the quality of the information in messages for the two structured message spaces, i.e., Number vs Interval. The difficulty of comparison stems from the different measurement units between the two treatments. While the units in the treatments differ, the messages in both treatments can be ranked. Hence, we calculate the Spearman rank correlation coefficient. We obtain a correlation of 0.818 in the Number and 0.755 in the Interval treatment.Footnote 10 The correlation seems to be slightly larger in the Number treatment, but the difference is not statistically significant (p > 0.1, 1000 bootstraps). For robustness, we map numbers into corresponding intervals and calculate the size of a lie as the difference between the messaged performance band and the actual performance band. The two distributions for the two treatments are not statistically different from each other (Kolmogorov-Smirnov test, \(p>0.1\)). Figure A.1 in the Appendix shows histograms of the size of lies for the two treatments.
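The rank-correlation comparison can be sketched with the standard library alone. The exact bootstrap procedure behind the reported p-value is not described in this section, so the resampling scheme below (independent resampling of senders with replacement within each treatment) is an assumption. The data are hypothetical, and interval messages are assumed to be represented by a single sortable value such as the lower bound.

```python
import random


def average_ranks(values):
    """Ranks starting at 1, with ties receiving the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bootstrap_differences(sample_a, sample_b, reps=1000, seed=0):
    """Bootstrap the difference in Spearman correlations between two samples.

    sample_a, sample_b: lists of (actual, messaged) pairs, one pair per sender.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        resample_a = [rng.choice(sample_a) for _ in sample_a]
        resample_b = [rng.choice(sample_b) for _ in sample_b]
        try:
            diffs.append(spearman(*zip(*resample_a)) - spearman(*zip(*resample_b)))
        except ZeroDivisionError:
            continue  # skip degenerate resamples in which one variable is constant
    return diffs


# Hypothetical (actual, messaged) pairs for two treatments.
number = [(8, 10), (10, 10), (12, 15), (14, 14), (16, 18), (18, 20), (20, 20)]
interval = [(7, 8), (9, 12), (11, 12), (13, 16), (15, 16), (17, 20), (19, 20)]

diffs = sorted(bootstrap_differences(number, interval))
lower, upper = diffs[int(0.05 * len(diffs))], diffs[int(0.95 * len(diffs)) - 1]
print("90% percentile interval for the difference:", (lower, upper))
```

If the percentile interval contains zero, the two correlations are not statistically distinguishable at the corresponding level under this resampling scheme.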

Lastly, we analyze the information contained in the kind of message an applicant sends in the Text treatment. We observe the following regularities: 31 subjects mentioned their precise performance in Task 1; 25 subjects expressed their competence by mentioning their mathematical ability or their math-related background; 24 subjects promised either better future performance, more effort or more earnings to the delegator; 20 subjects tried to reduce the social distance by using words such as “we”, “trust”, “believe”, “help” and their synonyms. There are 11 messages that do not belong to any of the categories above and only contain babble that is unrelated to the situation. Accordingly, we use five dummies, i.e. Number, Ability, Promise, Trust, and Babble, to record if a particular way of communication was observed in a text message. Note that these categories are not exclusive. Many subjects used multiple kinds of messages in their communication. The only exclusive category is babbling. Table 3 reports the average performance conditional on a particular messaging type used. For later use, we also report the frequency of being chosen for delegation in each case.
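The five dummies can be illustrated with a simple keyword-based coder. This is only a sketch of the coding scheme, not the procedure used to classify the messages in the study. The Trust word list contains the words quoted above, while the Ability and Promise word lists and the rule that any digit counts as mentioning a number are hypothetical.

```python
import re

TRUST_WORDS = {"we", "trust", "believe", "help"}  # words quoted in the text
ABILITY_WORDS = {"math", "maths", "mathematics", "engineering"}  # hypothetical list
PROMISE_WORDS = {"promise", "guarantee"}  # hypothetical list


def code_message(text):
    """Return the five dummies for one free-text message."""
    tokens = re.findall(r"[a-z]+|\d+", text.lower())
    dummies = {
        "Number": any(t.isdigit() for t in tokens),  # crude proxy for a stated performance
        "Ability": any(t in ABILITY_WORDS for t in tokens),
        "Promise": any(t in PROMISE_WORDS for t in tokens),
        "Trust": any(t in TRUST_WORDS for t in tokens),
    }
    # Babble is the only exclusive category: none of the other cues is present.
    dummies["Babble"] = not any(dummies.values())
    return dummies


# Hypothetical messages.
print(code_message("I solved 18 sums, trust me"))
print(code_message("Nice weather today"))
```

The categories other than Babble are not exclusive, so a single message can set several dummies at once, as in the first hypothetical message.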

Table 3 Average performance conditional on verbal cues

It is noticeable that subjects who were babbling performed significantly worse than those who did not (rank-sum test, \(p=0.02\), two-sided). We also observe that subjects who mentioned their past performance performed significantly better than those who did not (rank-sum test, \(p=0.03\), two-sided). Subjects who made a promise performed better than those who did not, although the difference is only marginally significant (rank-sum test, \(p=0.07\), two-sided). However, verbal cues such as signalling trust or commenting on general ability were not informative with respect to average actual performance (rank-sum test, \(p>0.1\), two-sided).

Result 1:

Messages contain a substantial amount of information in all three treatments.

Result 2:

The amount of information contained in the messages is similar in the Number and Interval treatments.

Result 3:

Free-form text messages contain some additional information as low-performance senders tend to either exaggerate less or avoid outright lying by babbling. Observing if somebody sent a message mentioning past performance or a promise contains some further information.

4.2 Delegation decisions

Delegators knew their own past performance, had been asked to estimate the percentage of participants who outperformed them, and received at most two cheap-talk messages. We define a message as potentially “profitable” if the messaged performance is strictly greater than the delegator’s own performance plus the bonus. The variable “Message” counts the number of profitable messages. Table 4 reports the estimated coefficients and average marginal effects from probit models that predict delegation.
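The definition of a potentially profitable message can be written down directly for numeric messages. The bonus is expressed in sums (four, as stated in the design section). How interval and free-text messages are mapped to a messaged performance is not specified here, so the sketch covers numeric messages only, and the example values are hypothetical.

```python
BONUS_IN_SUMS = 4  # number of sums required to pay for the bonus


def is_profitable(messaged_perf, delegator_perf):
    """A message is potentially profitable if it strictly exceeds own performance plus bonus."""
    return messaged_perf > delegator_perf + BONUS_IN_SUMS


def message_count(messages, delegator_perf):
    """The variable "Message": number of potentially profitable messages received."""
    return sum(is_profitable(m, delegator_perf) for m in messages)


# Hypothetical delegator with 13 sums who receives the messages 18 and 15.
print(message_count([18, 15], delegator_perf=13))  # 1
```

With two applicants, the variable takes the values 0, 1, or 2.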

Table 4 Probit regressions on delegation decisions

The regressions show that messages and beliefs significantly impact delegation behaviour in some treatments but not in others.Footnote 11 In line with Hypothesis 2, subjects ignore messages and resort to their initial belief about their relative performance in the Number treatment. A ten percentage-point increase in the subjective belief regarding the percentage of people who can calculate more sums increases the delegation likelihood by about 6%. The fact that the number of potentially profitable messages is irrelevant once we control for beliefs implies that delegators are not able to extract information from the messages in the Number treatment. In contrast, subjects made good use of messages in the remaining two treatments, just as our alternative Hypothesis A2 conjectured. The average frequency of delegation significantly increases by 47% if subjects receive two profitable messages in the Interval treatment and by 68% if subjects receive one profitable message in the Text treatment.Footnote 12

Delegators even seem to rely on the messages a bit too much. Despite the fact that the messages contain information, there is still some noise. Therefore, rational updating implies that the prior belief should still have an impact on the decision. Our regressions show that it does not. Table 3, where we summarized the information contained in the communication style, shows that delegators could extract some of it. The third column contains the delegation fractions depending on whether a message contains a particular mode of communication. Delegators never delegated to people who babbled and more often delegated to applicants who mentioned a precise past performance than to those who did not. On average, this delegation behaviour is consistent with delegators extracting and responding to the information contained in the applicants’ message styles.

Result 4:

Information contained in messages is not transmitted in the Number treatment. Information is transmitted in the Interval treatment and even more so in the Text treatment.

Result 5:

Subjects in the Interval treatment are more likely to delegate if they receive two potentially profitable messages.

Result 6:

Subjects in the Text treatment never delegate if they receive babbling messages and typically delegate if they receive at least one profitable message.

4.3 Stereotypes

The eight avatars that we used span two dimensions: gender and ethnicity. There are four male and four female avatars, covering four ethnicities for each gender. As a base, there is an ethnicity-neutral avatar, which could be White, Hispanic, Mediterranean, Middle Eastern, or Indian. About 47% chose the neutral avatar. Then there are three strongly ethnic avatars for each gender, which contain stereotypical elements for Asian (25%), White (22%) and Black (6%). To investigate the impact of stereotypes, we run probit regressions where gaining delegation is the dependent variable. The independent variables of interest are gender and ethnicity dummies. We include a dummy that indicates if the applicant in question has sent the highest unique profitable message. Adding this dummy implies that we assume that a person who is not guided by stereotypes delegates only to the applicant whose message is higher than that of the other sender and profitable if true. Our results are robust to the use of other variables to control for messages.
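The message control described above can be sketched as follows. This is only an illustration under stated assumptions: the function and argument names are hypothetical, messages are assumed to be coded as single claimed performances, and the bonus is assumed to be measured in sums.

```python
def highest_unique_profitable_dummies(claimed_performances, own_performance, bonus):
    """Return one 0/1 dummy per applicant: 1 only for an applicant whose
    claim is strictly higher than every other claim and profitable if true."""
    threshold = own_performance + bonus  # assumes bonus is measured in sums
    top_claim = max(claimed_performances)
    top_is_unique = claimed_performances.count(top_claim) == 1
    return [
        int(claim == top_claim and top_is_unique and claim > threshold)
        for claim in claimed_performances
    ]


# Hypothetical group: delegator solved 10 sums, bonus equivalent to 3 sums.
dummies_clear_winner = highest_unique_profitable_dummies([15, 12], 10, 3)
dummies_tied_claims = highest_unique_profitable_dummies([15, 15], 10, 3)
dummies_unprofitable = highest_unique_profitable_dummies([12, 11], 10, 3)
```

With tied or unprofitable claims, neither applicant receives the dummy, so the control is silent in those groups.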

Table 5 Probit regressions of the likelihood of being delegated to

Table 5 reports average marginal effects for two estimated specifications for each treatment. We find that stereotypes do not play a large role overall. Messaging has a much larger impact in all treatments. Sending the uniquely highest profitable message increases the probability of obtaining delegation by around 0.21 in the Number and Interval treatments, and by up to 0.76 in the Text treatment. Stereotypes for gender or ethnicity only play a role in the Number treatment. This is consistent with the observation that delegators in the Number treatment were not able to make good use of the messages as they were too afraid of facing lies. This leaves room for using other available information for the delegation decision. In the Number treatment, participants with avatars indicating an Asian ethnicity have an increased probability of being delegated to. While we do not find outright gender discrimination, gender still plays an indirect role in that treatment. All else equal, delegators are significantly more likely to delegate to the opposite gender. While this seems counter-intuitive, there is a simple explanation. The participants who delegate are those who believe that their ability is low compared to the other participants. If these participants wrongly extrapolate from their own gender-performance relation to others, then they should delegate more often to the opposite gender. This effect is mainly driven by women delegating to men, as there is not a single case where a woman delegates to a woman. This is consistent with Bordalo et al. (2019), who find strong evidence that stereotypes play a role in females’ beliefs only (Footnote 13). We also find that females secure delegation more often than males in the Text treatment. This is not a stereotype effect and can be explained by females making more claims about their mathematical abilities, which yields higher delegation rates.

Result 7:

Stereotypes only influence the delegation decision in the Number treatment, where delegators are not able to make good use of the information contained in the messages.

4.4 Social welfare

It remains to be checked if the varying degree of information transmission translates to efficiency differences. The average observed delegation frequency is 0.282 in both the Number and Interval treatments and 0.371 in the Text treatment. Recall that we calibrated the value of the bonus to target approximately 50% of rational delegations if individual performances were known. Due to asymmetric information embedded in the process of delegation, the observed average delegation frequencies across treatments are significantly below 50% (two-sided binomial probability test, \(p<0.01\)). In what follows, we construct an efficiency ratio to measure the relative degree of efficiency in the treatments. Delegation can improve social welfare only if there is a player whose performance in Task 2 is better than the delegator’s performance in Task 1. If the delegator always allocates the task to the most capable person, delegation is fully efficient. We normalize the efficiency measure by dividing the performance implemented by the delegator by the best performance in the group. In this way, we can compare the average efficiency ratios across treatments. The ratio can be interpreted as the average fraction of the available surplus extracted. If the delegator’s decision results in the task being performed by the person solving the most sums, all surplus is extracted and the ratio is one. If the delegator implements the performance of somebody who only solves half as many sums as the best person in the group, the ratio becomes 0.5.
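The efficiency ratio can be illustrated with a short sketch. The function names and the performance numbers are hypothetical; the sketch only assumes that the best performance in a group is positive.

```python
from statistics import mean


def efficiency_ratio(implemented_performance, group_performances):
    """Performance implemented by the delegator divided by the best
    performance available in the group (assumed to be positive)."""
    return implemented_performance / max(group_performances)


def average_efficiency_ratio(groups):
    """Average the group-level ratios over (implemented, performances) pairs."""
    return mean(efficiency_ratio(impl, perfs) for impl, perfs in groups)


# Hypothetical group whose members solve 10, 20 and 14 sums.
ratio_best_person = efficiency_ratio(20, [10, 20, 14])   # task done by the best person
ratio_half_as_good = efficiency_ratio(10, [10, 20, 14])  # task done by someone half as good
ratio_average = average_efficiency_ratio(
    [(20, [10, 20, 14]), (10, [10, 20, 14])]
)
```

The two group-level values reproduce the cases described in the text: a ratio of one when the best person performs the task and 0.5 when the implemented performance is half of the best.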

We also compare the actual average efficiency ratio to two counterfactual scenarios to assess the efficiency gain through delegation based on communication. In the no delegation benchmark, we calculate the average efficiency ratio that would have occurred if delegation had not been possible. Our second benchmark calculates the constrained optimal efficiency ratio. Note that a delegator has to pay a delegation bonus to the chosen applicant whenever she delegates. Hence, we exclude cases where delegation would improve social welfare (i.e. there is an applicant with higher performance than the delegator), but the difference is not high enough such that it pays for the fully informed delegator to delegate. The constrained optimality benchmark represents the efficiency ratio that would have resulted from rational delegation under perfect information.
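The two benchmarks can be sketched for a single group as follows. All names and numbers are hypothetical. The sketch assumes that the bonus is measured in sums and that a fully informed delegator delegates only when the best applicant's performance strictly exceeds her own performance plus the bonus; the treatment of exact ties is an assumption.

```python
def no_delegation_ratio(delegator_performance, applicant_performances):
    """Efficiency ratio if the delegator always performs the task herself."""
    best = max([delegator_performance] + list(applicant_performances))
    return delegator_performance / best


def constrained_optimal_ratio(delegator_performance, applicant_performances, bonus):
    """Efficiency ratio under rational delegation with perfect information."""
    best_applicant = max(applicant_performances)
    best = max(delegator_performance, best_applicant)
    # Assumption: delegation pays only if the gain strictly exceeds the bonus.
    if best_applicant > delegator_performance + bonus:
        implemented = best_applicant
    else:
        implemented = delegator_performance
    return implemented / best


# Hypothetical groups: delegator solves 10 sums, bonus equivalent to 3 sums.
# Group A: the best applicant (12) beats the delegator but not by enough.
ratio_a_none = no_delegation_ratio(10, [12, 8])
ratio_a_opt = constrained_optimal_ratio(10, [12, 8], bonus=3)
# Group B: the best applicant (20) is worth the bonus.
ratio_b_none = no_delegation_ratio(10, [20, 8])
ratio_b_opt = constrained_optimal_ratio(10, [20, 8], bonus=3)
```

In hypothetical group A the constrained optimum coincides with no delegation even though an applicant outperforms the delegator, which is the case excluded from the benchmark in the text.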

Fig. 2 Average efficiency ratio in four scenarios by treatments. [1] Efficiency ratio is calculated at group level and averaged over all groups within each treatment. [2] Sample size is 39 for the Number treatment, 39 for the Interval treatment and 35 for the Text treatment. [3] We only consider 206 available players’ performance in Task 2 in the efficiency calculation for the three cases

Figure 2 shows the average efficiency ratios by treatment calculated for (1) the no-delegation scenario, (2) the actual experimental data, and (3) the constrained full information scenario. As expected, the actual efficiency levels in both the Interval and Text treatments are higher than in the Number treatment. The observed difference is significant for the Text treatment (\(p<0.024\), rank-sum, two-sided) and weakly significant for the Interval treatment (\(p<0.073\), rank-sum, two-sided). Compared to the no-delegation benchmark, all treatments achieve a significantly higher efficiency ratio (\(p<0.05\), sign-rank, all two-sided). Hence, we reject Hypothesis 3 and find support for alternative Hypothesis A3. However, we still observe a significant efficiency loss due to asymmetric information compared to the constrained full information scenario (\(p<0.01\), sign-rank, two-sided) in all treatments.

Result 8:

Allowing for delegation and one-way communication improves welfare in all treatments. Free-form communication achieves significantly higher efficiency than precise communication. Coarse communication attains a weakly significantly higher efficiency than precise communication. With cheap talk alone, the constrained full information efficiency level cannot be attained.

5 Conclusion

Our paper experimentally investigated if and how one-way cheap-talk communication can facilitate efficient delegation. We were particularly interested in testing whether the available message space impacts information transmission, where in theory, misaligned preferences do not allow for information to be transmitted in equilibrium. We found that information transmission occurred and that the message space applicants were allowed to use affected both the amount of information contained in applications and the efficiency gains from the delegation. Overall, allowing for free-text messages yields the best result. This is in contrast to the game-theoretical prediction that in cheap-talk and signalling games, nothing can be gained from extending the size of the message space beyond that of the type space. Applicants convey information not only by what they say but also by how they say it, and delegators can extract at least some of this information.

While our study shows that cheap talk can increase efficiency in a delegation scenario, the extracted surplus is still well short of what could be achieved without information frictions. We conclude that for many delegation situations, the delegator is likely better off requiring the applicant to send a costly signal (such as a lengthy proposal, response to selection criteria or the completion of training or education). The resulting signalling game will probably improve information transmission. However, this comes at a cost, as the signalling wastes resources and dissuades good applicants from applying. We leave the analysis of the circumstances under which signalling is more efficient for future research.