It is often of interest to elicit beliefs (or subjective probabilities) from populations that might include naïve participants. Unfortunately, belief elicitation mechanisms are typically assessed under the assumption that all people respond optimally to incentives. This paper relaxes that assumption, and evaluates the performance of two elicitation mechanisms using laboratory experiments with a subject population that potentially includes naïve participants. These mechanisms, proposed by Karni (2009), are valuable because their incentive compatibility does not require the strong assumptions that (i) utility functions are linear in money (i.e., risk-neutrality) or (ii) probability weighting functions are identity transformations (i.e., expected utility).

We denote Karni’s (2009) mechanisms as “declarative” and “clock.” In the declarative mechanism, the respondent directly declares her subjective probability. In the clock mechanism, she participates in an ascending clock, where the only other participant is a computerized player who exits stochastically. The clock stops when either the human or computer participant exits, or when the clock reaches its terminal value, whichever occurs first. The subject’s stopping point represents her belief. Both mechanisms provide incentives to encourage respondents to report beliefs truthfully.

An important advantage to comparing mechanisms using laboratory experiments is that subject populations typically include mixtures of “sophisticated” and “naïve” types, in the sense that some but not all subjects respond optimally to the incentives of the mechanism (see, e.g., Andreoni 1995; Houser and Kurzban 2002; Houser et al. 2004). In an environment with a mixture of types, we show that the clock can demonstrate an advantage in elicitation accuracy (under certain conditions) because it censors naïve responses. Indeed, our experiment’s results confirm that in our environment the clock mechanism does elicit beliefs significantly more accurately than the declarative mechanism. Moreover, our analysis reveals that the advantage of the clock mechanism is not solely due to censoring. Rather, we find respondents are more likely to adopt dominant strategies under the clock. These findings have substantial practical value to anyone interested in eliciting beliefs from representative populations, a goal of increasing importance, for example, when designing large-scale surveys.

The Karni (2009) mechanisms we study depart from the widely used proper scoring rules (e.g., Nyarko and Schotter 2002; Palfrey and Wang 2009).Footnote 1 Although popular, proper scoring rules are not incentive compatible when the respondent is not risk-neutral or when s/he deviates from expected utility maximization. In light of this, Offerman et al. (2009) proposed that one first estimate risk attitudes and then adjust elicited beliefs accordingly. Alternatively, Andersen et al. (2010) suggested that one jointly estimate subjective beliefs and risk attitudes using maximum likelihood methods.

Allen (1987) was first to generalize the quadratic scoring rule by utilizing binary lottery payoffs to induce risk-neutrality.Footnote 2 McKelvey and Page (1990) implemented a similar generalized quadratic scoring rule in an early laboratory study of belief elicitation. Recently, Schlag and van der Weele (2009) extended Allen’s (1987) mechanism to elicit parameters characterizing subjective probability distributions. Hossain and Okui (2011) provided empirical results showing that quadratic scoring rules with and without binary lottery payoffs perform equally well.

To the best of our knowledge, no previous empirical comparison of Karni’s (2009) mechanisms has appeared. Despite the potential equivalence of the two mechanisms,Footnote 3 substantial evidence suggests theoretically equivalent mechanisms can perform differently in practice (e.g., second-price sealed-bid and English clock auctions such as reported by Kagel et al. 1987; Kagel and Levin 1993, 2009; Rutström 1998; Harstad 2000). In light of this, it seems of significant practical importance to investigate whether and how elicitations from declarative and clock mechanisms might differ.Footnote 4

It is worthwhile to note that the clock and declarative mechanisms are not isomorphic in theory. The reason is that the clock mechanism fails to elicit an exact probability whenever the computerized bidder is the first to exit, because the participant did not have an opportunity to respond. At first glance, this data censoring may appear to be a source of inefficiency. While this is possible, it turns out that if there exist both naïve and sophisticated respondents, the clock can be structured so that it is more likely to censor naïve than sophisticated responses. As a result, the clock can be more accurate than the declarative mechanism.

The contributions of this paper are threefold. First, by focusing on novice respondents who are likely to submit naïve responses, we provide practical guidance for belief elicitation in contexts including large-scale survey environments. Indeed, in recent years large-scale belief elicitation (presumably from novice respondents) has become a particularly active area (e.g., Manski 2004; Bellemare et al. 2008). Our investigation sheds light on how to implement incentive-compatible belief elicitation efficiently in environments that include noisy responses.Footnote 5

Second, our results contribute to the applied mechanism design literature by demonstrating another environment where the (ascending) clock maintains its advantage in inducing truth-revealing dominant strategies. For example, it is well known that the equivalence of sealed-bid second-price auctions and English clock auctions quickly breaks down in practice, in the sense that bids in English clock auctions are much closer to the truth in both induced (Kagel et al. 1987; Kagel and Levin 1993, 2009; Harstad 2000) and home-grown value settings (Rutström 1998).

The third contribution of this research is that it studies belief elicitation when participants are endowed with objective beliefs.Footnote 6 This allows us to shed light on the performance of the mechanisms absent noise due to variations in subjects’ abilities to predict uncertain events. Garthwaite et al. (2005) point to the importance of doing this when they write, “it is important to distinguish between the quality of an expert’s knowledge and the accuracy with which that knowledge is translated into probabilistic form,” a distinction Winkler and Murphy (1968) refer to as “substantive goodness” and “normative goodness” respectively. Our subjects are perfectly informed and thus possess “substantive goodness,” so that our data imply the clock mechanism is better than the declarative at facilitating probabilistic formulations by novice respondents.

The paper proceeds as follows. Section 1 reviews Karni’s (2009) theory, Section 2 formulates our hypothesis, Sections 3 and 4 report experimental design and results, Section 5 discusses practical details regarding field implementation of the mechanisms, and Section 6 concludes.

1 Review of Karni’s (2009) theory

This section briefly reviews Karni’s (2009) mechanisms. In Savage’s (1954) framework, an individual holds the belief that an event E will occur with probability π(E). If E occurs, the individual receives the prize x; otherwise, she receives y (x > y). We call a mapping between the occurrence (and non-occurrence) of an event and monetary payoffs a “bet,” denoted by \( \beta := x_E y \).

Consider a lottery that pays x with probability r or y with probability 1−r; denote this lottery by L(r). The number r is randomly selected from a uniform distribution on [0, 1]. The individual knows the distribution, but she does not know r when she makes her decision.

1.1 Declarative mechanism

The individual submits a decision, μ∈[0, 1], which is compared with the random number r. If μ ≥ r, she plays the bet β; if μ<r, she plays the lottery L(r).

Dominant strategy

Karni (2009) demonstrates that the unique dominant strategy in this mechanism is to report truthfully: \( \mu = \pi (E) \). Doing so guarantees that the individual obtains either the bet β or the lottery L(r), whichever has the higher probability of winning the prize x. The individual has no incentive to report a number greater than the truth, because as soon as the random number r falls between the truth and her report, \( \pi (E) < r < \mu \), she receives the bet β, and forgoes the lottery L(r) that has a higher winning probability. The same logic applies when her report is smaller than the truth.
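The dominance argument can be checked numerically. The following is a minimal sketch, assuming (as in this section) that r is uniform on [0, 1] and that the individual cares only about her probability of winning x; the function names and the grid resolution are illustrative choices, not part of Karni’s (2009) presentation.

```python
def win_probability(mu: float, pi: float) -> float:
    """Chance of winning x when reporting mu, with true belief pi and r ~ U[0, 1].

    If r <= mu (probability mu) the individual plays the bet and wins with
    probability pi; if r > mu she plays L(r) and wins with probability r,
    which integrates to (1 - mu**2) / 2 over (mu, 1].
    """
    return mu * pi + (1.0 - mu ** 2) / 2.0


def best_report(pi: float, steps: int = 100) -> float:
    """Grid-search the report in {0, 1/steps, ..., 1} that maximizes the winning chance."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda mu: win_probability(mu, pi))


if __name__ == "__main__":
    for pi in (0.2, 0.3, 0.75):
        # The maximizing report coincides with the true belief pi.
        print(pi, best_report(pi))
```

Because the winning probability is strictly concave in μ with derivative π − μ, any over- or under-report lowers it, which is the logic described above.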

1.2 English clock mechanism

In the English clock auction mechanism, the individual competes with a dummy bidder and knows that the dummy bidder exits the auction at an unknown number r. The clock starts at 0 and rises continuously as long as both the individual and the truth-revealing dummy bidder are “in the auction.” The clock stops when at least one bidder drops out, or when the clock reaches 1, whichever occurs first. If the individual is the first to exit, she receives the lottery L(r); if the dummy bidder exits first, the individual receives the bet β.

Dominant strategy

Following Karni’s (2009) argument, the dominant strategy is to stay in the auction as long as the clock is below π(E), and exit exactly at π(E).

1.3 Assumptions

For Karni’s (2009) mechanisms, as well as proper scoring rules, a necessary condition for incentive compatibility is the no-stakes condition (Kadane and Winkler 1988). The no-stakes condition requires the wealth of an individual, excluding elicitation-related payoffs, to be independent of the occurrence (or nonoccurrence) of the event.Footnote 7

Incentive compatibility in Karni’s (2009) mechanisms requires also that an individual’s preferences exhibit dominance and probabilistic sophistication (Machina and Schmeidler 1992).Footnote 8 This condition is weaker than the requirement that preferences satisfy expected utility.

2 Comparison of the mechanisms

2.1 Censoring can lead to greater accuracy in elicitations

In this section we demonstrate that, under certain conditions, the clock’s censoring can lead to accuracy advantages over the declarative mechanism. We say that a mechanism is more accurate if its fraction of truthful elicitations is greater. Assuming that people use identical decision strategies with the two mechanisms,Footnote 9 the general conditions under which the clock displays greater accuracy are that (i) the population includes a sufficient fraction of naïve decision makers and (ii) the clock censors sufficiently few “optimal” decisions. We show below that these specifics can vary depending on the nature of the environment.

To develop this point, suppose there are two types of respondents: (i) truth-revealers who report π(E) and (ii) naïve agents whose responses are characterized by a distribution with c.d.f. \( F_n(\cdot) \). Let the proportions of the optimal and naïve types be α and 1−α respectively (\( 0 < \alpha < 1 \)).Footnote 10 Suppose also that the random number r (the probability of winning the lottery prize) has continuous and strictly increasing c.d.f. \( F_r(\cdot) \).

By definition, the accuracy of the declarative mechanism equals α, the fraction of optimal decisions in the population.

The accuracy of the clock mechanism is given by the following expression:

$$ \frac{\alpha \left( 1 - F_r(\pi) \right)}{\alpha \left( 1 - F_r(\pi) \right) + (1 - \alpha) \left( 1 - \int_0^1 F_r(u)\, dF_n(u) \right)} $$

The numerator is the fraction of optimal decisions that are not censored by the clock. The denominator is the total fraction of non-censored decisions. Thus, the value of the expression is the fraction of truthful elicitations among the decisions that the clock actually elicits.Footnote 11

It immediately follows that the clock is more accurate than the declarative mechanism when the following condition holds:

$$ \frac{\alpha \left( 1 - F_r(\pi) \right)}{\alpha \left( 1 - F_r(\pi) \right) + (1 - \alpha) \left( 1 - \int_0^1 F_r(u)\, dF_n(u) \right)} - \alpha > 0 \qquad (1) $$

Inequality (1) makes clear that whether the clock holds an accuracy advantage in relation to the declarative mechanism depends critically on the fraction of optimal decision makers α as well as the value of the true belief π in relation to the distribution of the random number r (which influences the fraction of censored optimal decisions).
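To make the comparison concrete, the accuracy expression can be evaluated numerically. The sketch below is illustrative only: the function name `clock_accuracy`, the representation of the naïve distribution by a list of sample responses, and the uniform example in the usage section are assumptions introduced here, not part of the formal analysis.

```python
from typing import Callable, Sequence


def clock_accuracy(
    alpha: float,
    pi: float,
    F_r: Callable[[float], float],
    naive_responses: Sequence[float],
) -> float:
    """Share of truthful responses among non-censored clock decisions.

    alpha            -- fraction of optimal respondents
    pi               -- true belief (the optimal exit point)
    F_r              -- c.d.f. of the random number r
    naive_responses  -- sample standing in for the naive distribution F_n
    """
    # Optimal respondents are observed only when r exceeds pi.
    kept_optimal = alpha * (1.0 - F_r(pi))
    # The integral of F_r(u) dF_n(u) is approximated by a sample mean.
    censored_naive = sum(F_r(u) for u in naive_responses) / len(naive_responses)
    kept_naive = (1.0 - alpha) * (1.0 - censored_naive)
    return kept_optimal / (kept_optimal + kept_naive)


if __name__ == "__main__":
    uniform_cdf = lambda u: u                               # assumed example: r ~ U(0, 1)
    naive_grid = [(k + 0.5) / 1000 for k in range(1000)]    # evenly spread naive responses
    for pi in (0.2, 0.8):
        acc = clock_accuracy(0.5, pi, uniform_cdf, naive_grid)
        # Inequality (1) holds when acc exceeds alpha, the declarative accuracy.
        print(pi, round(acc, 3), acc > 0.5)
```

In this assumed example with α = 0.5, the clock is more accurate than the declarative mechanism at π = 0.2 but less accurate at π = 0.8, illustrating the dependence on π noted above.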

2.2 Experiment hypothesis

Our interest is in testing the following hypothesis:

Hypothesis

Beliefs elicited using the clock mechanism are more likely to be accurate than beliefs elicited using the declarative mechanism, especially with novice participants.

To develop this, suppose now that \( F_r(\cdot) \) and \( F_n(\cdot) \) are both U(0,1).Footnote 12 Under these assumptions it is easy to show that inequality (1) simplifies to

$$ \frac{\alpha (1 - \pi)}{\alpha (1 - \pi) + (1 - \alpha)/2} - \alpha > 0 \qquad (2) $$

Given \( 0 < \alpha < 1 \), (2) holds if and only if

$$ 0 < \pi < 0.5 \qquad (3) $$

Thus, the clock has an advantage in this environment when true beliefs are less than 0.5, and not everybody makes optimal decisions. This result guides our experiment design, as we detail below.
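For completeness, the step from (2) to (3) can be written out as follows. This is a sketch of the algebra using only the symbols defined above; it relies on \( 0 < \alpha < 1 \), so that the denominator in (2) and the factor \( 1 - \alpha \) are strictly positive.

$$ \frac{\alpha (1 - \pi)}{\alpha (1 - \pi) + (1 - \alpha)/2} > \alpha \iff (1 - \pi) > \alpha (1 - \pi) + \frac{1 - \alpha}{2} \iff (1 - \alpha)(1 - \pi) > \frac{1 - \alpha}{2} \iff \pi < \frac{1}{2} $$

The first equivalence divides both sides by α and multiplies by the denominator; the last divides by \( 1 - \alpha \).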

3 Experiment design and procedures

A key feature of our design is that we endow subjects with objective beliefs. We made this choice for two reasons. First, as noted earlier, our goal is to assess the mechanisms’ ability to facilitate accurate translation from participants’ knowledge to probabilities, so we eliminate differences in quality of their knowledge. In addition, doing this better connects our research to second-price and English clock auctions, where participants typically make decisions using known (induced) values. A transcript of subjects’ instructions can be found in the Appendix.

3.1 Declarative mechanism

Endowed belief = 0.2

The subject is presented with two opaque bags (physical; made of cloth), bag A and bag B. She knows that bag A has 10 chips in total: 2 white chips and 8 black chips. She also knows that bag B has a total of 10 chips of the two colors, but the number of white chips (denoted by R) is equally likely to be any integer from 1 to 9.

The participant submits a number between 1 and 9 (inclusive; integer onlyFootnote 13) on a computer terminal.Footnote 14 This number is then compared with R, i.e., the number of white chips in Bag B. If the submitted number is greater than R, the subject draws a chip from bag A; otherwise, she draws a chip from bag B. In either case, the subject is paid $10 if she draws a white chip, and is paid $1 if she draws a black chip.Footnote 15

Endowed belief = 0.3

This proceeds exactly as the above procedure, except now there are 3 white chips (and 7 black chips) in bag A.

Dominant strategy

Take bag A as the default choice; the declarative mechanism is effectively asking the subject, “What is the minimum number of white chips in bag B so that you are willing to switch to bag B?” The dominant strategy is to declare either the number of the white chips in bag A, or one more than the number of white chips in bag A.Footnote 16 The presence of two equally advantageous actions stems from our using a discrete state space.
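The two equally advantageous actions can be verified by enumeration. The following sketch assumes the rules stated above (R uniform on the integers 1 to 9; bag A is drawn from when the submitted number is strictly greater than R) and that the subject simply wants to maximize her chance of drawing a white chip; the function names are illustrative.

```python
from fractions import Fraction


def white_chip_probability(report: int, white_in_a: int) -> Fraction:
    """Chance of drawing a white chip for a given report, with R uniform on 1..9."""
    total = Fraction(0)
    for r in range(1, 10):                      # r = number of white chips in bag B
        if report > r:
            total += Fraction(white_in_a, 10)   # report exceeds R: draw from bag A
        else:
            total += Fraction(r, 10)            # otherwise: draw from bag B
    return total / 9


def optimal_reports(white_in_a: int) -> list[int]:
    """All reports in 1..9 that maximize the chance of a white chip."""
    probs = {m: white_chip_probability(m, white_in_a) for m in range(1, 10)}
    best = max(probs.values())
    return [m for m, p in probs.items() if p == best]


if __name__ == "__main__":
    print(optimal_reports(2))   # endowed belief 0.2
    print(optimal_reports(3))   # endowed belief 0.3
```

The tie arises because, when R equals the number of white chips in bag A, the two bags are identical, so it does not matter which one the subject draws from.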

3.2 Clock mechanism

Endowed belief = 0.2

Bag A and bag B are exactly the same as in the declarative mechanism with the endowed belief of 0.2. Instead of declaring a number, the subject participates in a computerized clock auction. Similar to an English clock auction (e.g., Kagel et al. 1987), it starts as the computer screen displays the number 1 for 5 seconds, and then the number 2 for 5 seconds, and so on. The subject exits the auction by pressing the space key. The clock stops when the subject exits, or after reaching and displaying the number R for 5 seconds, whichever comes first.Footnote 17 If the clock stops due to reaching number R, the subject draws a chip from bag A; if the clock stops due to the subject’s exit, she draws a chip from bag B.Footnote 18 In either case, the subject is paid $10 if she draws a white chip and $1 if she draws a black chip.

Endowed belief = 0.3

This proceeds exactly as the above procedure, except now there are 3 white chips (and 7 black chips) in bag A.

Dominant strategy

Considering bag A as the default choice, the clock mechanism is effectively asking the subject, “The number displayed on the screen is the minimum number of white chips in bag B; do you want to switch to bag B now?” The dominant strategy is to indicate the willingness-to-switch by exiting as soon as the displayed number is the same as the number of the white chips in bag A, or one more than the number of white chips in bag A (see footnote 17).
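The logic of a single clock round can be sketched as below. This is an illustration under a stated assumption rather than the E-prime program used in the sessions: a subject who plans to exit at a given number is assumed to press the key while that number is on screen, so she exits successfully whenever her planned exit number does not exceed R. The function name `run_clock` is hypothetical.

```python
def run_clock(exit_at: int, r: int) -> tuple[str, bool]:
    """Return (bag drawn from, whether the response is censored).

    exit_at -- the displayed number at which the subject plans to exit
    r       -- the number of white chips in bag B (the clock's last number)
    """
    for shown in range(1, r + 1):      # the clock displays 1, 2, ..., R
        if shown == exit_at:
            return "B", False          # subject exits: draws from bag B, exit point observed
    return "A", True                   # clock stopped at R first: bag A, exit point unobserved


if __name__ == "__main__":
    print(run_clock(exit_at=2, r=5))   # subject exits before the clock stops
    print(run_clock(exit_at=4, r=2))   # clock stops first, so the plan is never observed
```

The second case shows the source of censoring: when the clock stops before the subject's planned exit, the investigator learns only that her exit point exceeds R.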

3.3 Treatment design

Each subject participated in two independent elicitation tasks, which occurred in round one and round two, respectively. The second round was a “surprise” as we announced it only after completing the first round.Footnote 19 Table 1 summarizes our two-by-two treatment design.

Table 1 Treatment design

Each session consisted of 4 to 8 participants and a heterogeneous belief environment: half were endowed with beliefs equal to 0.2 and the other half with beliefs equal to 0.3. Subjects were given new instructions in the second round.

3.4 Procedures

All sessions were conducted between April and October 2009 at the Interdisciplinary Center for Economic Science (ICES) laboratory of George Mason University in Fairfax, VA. Subjects were invited via emails sent to a large undergraduate subject pool, and a total of 130 participated. Subjects were paid a guaranteed $5 plus their earnings in the experiment. Average total earnings were $16, and sessions lasted between 30 and 60 minutes. The experiments were partially computerized: we used physical bags and chips to illustrate the lotteries and perform the random draws, whereas the ticking clock was programmed using E-prime.Footnote 20

To facilitate understanding of the mechanisms, subjects were first given abundant time to read the instructions. Following this, the experimenter read the instructions aloud to them. Each subject then took a quiz,Footnote 21 and their answers were recorded. The majority of the subjects correctly answered all questions. The experimenter then announced and explained the correct answers. Throughout the experiment, we did not use words such as “probability” or “percent chance,” and subjects made all decisions in whole numbers.Footnote 22

Also, we generated the random number R (number of white chips in bag B) using the following three steps. In step one, the experimenter showed subjects a deck of nine cards, numbered from 1 to 9. The experimenter then put each card into one of nine opaque envelopes and shuffled the envelopes thoroughly. In step two, each subject was asked to pick an envelope and immediately return it to the experimenter without opening it. At this point, the experimenter wrote the subject’s ID on the envelope. Finally, in step three, the experimenter publicly opened each envelope from a distance (so that no subject could read the numbers inside), transcribed the random number R for each subject ID, and then sealed the envelope. The reason for these steps was to demonstrate that the number R was determined prior to the subjects’ decisions, as well as to make clear that R was an integer between 1 and 9, each with equal probability. For the surprise second round, a new random number was generated for each subject from a new set of nine opaque envelopes, using exactly the same procedures as described above.

Finally, we implemented the payment procedure individually to ensure that subjects knew they were making independent decisions. After the bag relevant for payment was determined, the experimenter went to a subject with the appropriate bag and chips, which the subject examined. The experimenter put the 10 chips into the opaque cloth bag. The subject then drew one chip from the bag while keeping his or her head turned away. The subject earned $10 if he or she drew a white chip and $1 otherwise.

4 Results

We organize our results in two subsections. The first describes decisions from the first round, and our first result is that novice subjects are more likely to report their endowed beliefs in the clock mechanism. Our second result shows that subjects use different strategies in the two mechanisms, and that the distribution of clock data more accurately characterizes the distribution of endowed beliefs.

The second subsection presents decisions from the second round, and our results show that the clock and declarative mechanisms are equally likely to elicit endowed beliefs; however, the distribution of clock data continues to reflect more closely the distribution of endowed beliefs.

4.1 Responses from novice participants

With heterogeneous beliefs of 0.2 and 0.3, Table 2 describes individual decisions in the first round.Footnote 23

Table 2 Novice responses: descriptive statistics

Among the 53 and 77 observations in the declarative and clock mechanisms respectively, the proportions of optimal decisions are 47% versus 39%, and non-optimal decisions are 53% versus 22%. In the declarative mechanism, fewer than half of novice responses are optimal. This suggests that the dominant strategies in this environment are not trivial to subjects.Footnote 24 This is especially significant in light of our explicit effort for simplicity and transparency.Footnote 25

Overall, the clock mechanism censors 39% of decisions. In comparison, if the population consisted of only optimal decisions, the proportion would be between 25% and 35%. This suggests that not everybody in our experiment makes optimal decisions.

The mean deviations, as well as the mean absolute deviations from optimal strategies, are significantly different from zero in both mechanisms.Footnote 26 They are smaller in the clock mechanism than in the declarative mechanism, but the difference is insignificant.

We now take a closer look at the two mechanisms by excluding censored decisions. Note that we do not consider alternative approaches that make use of information in censored decisions (see footnote 12). Our first result is as follows:

  • Result 1. With novice respondents, beliefs elicited using the clock mechanism are more likely to be accurate than beliefs elicited using the declarative mechanism.

Evidence

Among elicited beliefs, the proportions of optimal decisions are 64% and 47% in the clock and declarative mechanisms respectively (Fig. 1). A two-sided Wilcoxon-Mann–Whitney test found the difference to be statistically significant at the 10% level (p = 0.096; binary data: 1 if a decision is optimal, 0 otherwise).Footnote 27

Fig. 1 Proportion of optimal decisions in first round

However, is the accuracy of the clock mechanism driven by data censoring? If respondents use identical pre-determined strategies in the two mechanisms, then beliefs elicited using the clock mechanism should be identical to beliefs elicited using the declarative mechanism after applying a filter that is equivalent to clock censoring. That is, we should obtain the same data if they are “naturally” censored during the experiment in the clock mechanism or “artificially” filtered after the experiment in the declarative mechanism.

In particular, for each decision in the declarative mechanism, we randomly select a number that is equally likely to be 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, or 0.9. The decision is filtered and discarded if it is strictly greater than the random number. The sample of 53 independent decisions from the declarative mechanism is filtered 10,000 times, and the frequency of each decision remaining is calculated. Doing this yields our second result,

  • Result 2. With novice respondents, filtered declarative data differ significantly from clock data.

Evidence

As shown in Fig. 2, the distribution of filtered declarative data is significantly different from the distribution of clock data (p = .014, Chi-squared test).Footnote 28
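The clock-equivalent filtering procedure described above can be sketched as follows. The code is an illustration under stated assumptions: decisions are coded in tenths as integers (1 for 0.1, ..., 9 for 0.9) to avoid floating-point comparisons, the function names are hypothetical, and the sample in the usage section is made up for demonstration and is not the experimental data.

```python
import random
from collections import Counter


def filter_once(decisions, rng):
    """Keep a decision unless it is strictly greater than its own random draw from 1..9."""
    return [d for d in decisions if d <= rng.randint(1, 9)]


def filtered_frequencies(decisions, repetitions=10_000, seed=0):
    """Average number of times each decision value survives the filter per repetition."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(repetitions):
        totals.update(filter_once(decisions, rng))
    return {value: totals[value] / repetitions for value in sorted(set(decisions))}


if __name__ == "__main__":
    hypothetical = [1, 1, 2, 2, 2, 3, 3, 5, 9]   # made-up decisions, NOT the paper's data
    print(filtered_frequencies(hypothetical))
```

Low decisions almost always survive the filter while high decisions are usually discarded, which mirrors how the clock censors late exits.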

Fig. 2 Distributions of first-round decisions

In addition, Fig. 2 plots the distribution that would emerge under only dominant strategies.Footnote 29 It has a single mode at 0.3. In comparison, beliefs elicited using the clock mechanism also have a single mode at 0.3, and seem to characterize the dominant strategy distribution reasonably well. In contrast, the mode of filtered declarative data is 0.1, far less accurate regarding the underlying beliefs.

4.2 Responses from one-time experienced participants

We next report an analysis of decisions made in the second round. Our third result is as follows.

  • Result 3. With one-time experienced participants, the declarative and clock mechanisms are equally accurate.

Evidence

Figure 3 shows that proportions of optimal decisions in the declarative and clock mechanisms are 57% and 60% respectively, and these values are not significantly different (p = .84, two-sided Wilcoxon-Mann–Whitney).Footnote 30

Fig. 3 Proportion of optimal decisions in second round

Similarly, we apply clock-equivalent filtering to second-round decisions from the declarative mechanism and compare them with the second-round clock data:

  • Result 4. With one-time experienced participants, filtered declarative data differ significantly from clock data.

Evidence

As shown in Fig. 4, the distribution of filtered declarative data is significantly different from the distribution of clock data (p = .052, Chi-squared test).Footnote 31

Fig. 4 Distributions of second-round decisions

Combining results 3 and 4, we observe that the proportions of truthful elicitations are not distinguishable in the second round, but the distribution of clock data continues to characterize underlying beliefs more accurately than does the distribution of filtered declarative data.

5 Implementing the mechanisms

The practical design of incentive-compatible belief elicitation mechanisms can be important, for example, to the rapidly emerging literature reporting experiments from large-scale surveys or field experiments (see, e.g., Andersen et al. 2008; Dohmen et al. 2009; Bellemare et al. 2008). Below we briefly discuss how one can use these incentive-compatible mechanisms to elicit beliefs in the framework of the Bellemare et al. (2008) experiment. We chose this experiment for three reasons. First, this experiment uses belief elicitation instrumentally, in the sense that it is part of an investigation of broader economic questions. The mechanisms discussed in this paper may hold particular value for such studies. Second, the Bellemare et al. experiment was conducted online, an environment where extensive training may be impractical. Third, subjects were a large representative sample,Footnote 32 a group which could in principle include mixtures of naïve and sophisticated respondents.

The goal of Bellemare et al. (2008) is to structurally estimate preferences for inequity aversion. Their approach is to combine choice data from ultimatum games (Güth et al. 1982) with beliefs elicited from proposers. In the ultimatum game, a proposer offers to a randomly matched responder a split of 10 euros. The responder either accepts or rejects the offer; in the case of rejection both players earn zero.

Bellemare et al. elicited proposers’ beliefs by asking, “how many out of 100 people do you think would accept this offer?” Subjects were not rewarded based on the accuracy of their answers to this question.Footnote 33 To pursue an approach based on the incentive-compatible mechanisms studied in this paper, one could instead provide the following straightforward instructions to participants.

You now can draw a ball to earn more money. If you draw a red ball you earn an additional 10 euros; otherwise you earn nothing. You will draw from either Bag A or Bag B. Each bag contains 100 balls, some red and some white. You do not know the exact number, but do know the following. The number of red balls in Bag A is equal to the number of people—out of 100 total people—who played this game and actually accepted the amount you offered to your responder. The number of red balls in Bag B is between 0 and 100, all equally likely.

Declarative Mechanism: Please write down a number between 0 and 100 (any number you like). If the number you write down is smaller than the number of red balls in Bag B, then you will draw a ball from Bag B. Otherwise, you will draw a ball from Bag A.

Clock Mechanism: You will see a number on your screen that starts at 0 and counts up by one each second. The counting will stop when the number reaches its maximum, which is the number of red balls in Bag B. At any point before the counting stops, you can hit the “switch” button on the screen, in which case you will draw a ball from Bag B. If you do not “switch” before the counting stops, you will draw a ball from Bag A.

In any practical application, one must of course first decide between the two mechanisms. Our results indicate the clock holds advantages when analyzing populations that include a mixture of sophisticated and naïve participants. One can be confident in inferences from the clock mechanism in the presence of novice participants, while confidence with the declarative mechanism is greater as the fraction of experienced (sophisticated) participants increases. The reason is that the clock censors noise from naïve respondents and thus improves elicitation accuracy, while the declarative procedure makes use of all of the data. This also suggests that the declarative approach could be advantageous when extensive subject training on mechanism incentives is feasible, or when individual responses are desired by the investigator, such as when eliciting experts’ opinions.

6 Conclusion

In a laboratory study using mixtures of naïve and sophisticated participants, we compared the declarative and clock belief elicitation mechanisms proposed by Karni (2009). These mechanisms are of interest because their incentive compatibility does not require strong assumptions such as risk neutrality or expected utility maximization. We found that, in relation to the declarative mechanism, with the clock mechanism (i) elicited beliefs are more likely to be accurate and (ii) the distribution of elicited beliefs more accurately characterizes the underlying (endowed) beliefs. Our findings complement an auction literature providing evidence that English clocks outperform second-price mechanisms in inducing truth-telling, and have implications for the practical design of incentive-compatible belief elicitation mechanisms.

Despite similarity in accuracy between the two mechanisms with experienced participants, we were surprised to find that differences in decision strategies were apparent. This finding resonates with the benefits of the clock reported in the auction literature (e.g., Kagel et al. 1987; Kagel and Levin 1993; Harstad 2000), raising the important fundamental questions of why and how a clock presentation improves decision making. Distilling the source of the clock’s advantage might allow one to implement procedures to improve decision making in a wide variety of environments, even those that do not easily admit a clock representation of decision alternatives.

A limitation of our study stems from our choice of parameters. We induced beliefs that are nearer to zero than one, and we explained that doing this provides a favorable environment for the clock mechanism. While the clock may perform less well when actual beliefs are closer to the relevant upper bound (e.g., when true beliefs are above 0.5 in our experiment), this is not necessarily a problem in practice. In particular, the investigator is free to choose the clock’s range and increments arbitrarily, and can always include extra ticks at larger values. In doing so, one can be more confident that actual beliefs are near the clock’s starting point and thus minimize data loss.

Future research might investigate whether the truth-inducing advantage of a clock procedure persists in environments with subjective beliefs, or where incentive-compatible mechanisms are difficult to implement. This includes cases where outcomes are impractical or impossible to verify. Of particular interest here are large-scale surveys of respondents’ beliefs regarding life-style choices and consequences related to, for example, changes in health or job status. In addition, recent work by Charness et al. (2007, 2010) suggests that the frequency of mistakes declines when subjects can consult with others. Hence, extending our investigation to environments where decisions are made by groups is a profitable next step to further our understanding of accurate belief elicitation.