Surveys of political knowledge frequently find large partisan differences in reported beliefs about policy-relevant facts. When Democrats control the White House, Democrats are likelier than Republicans to report that economic conditions are improving; the reverse holds under a Republican administration (Bartels 2002). Similar patterns appear in answers to factual questions about healthcare (e.g., Nyhan 2010), foreign policy (e.g., Jacobson 2010), and social services (e.g., Jerit and Barabas 2012), among other issues. In general, when surveyed about facts, people are more likely to report beliefs that are congenial to their existing attachments than beliefs that are uncongenial. While we expect values and preferences to differ in a pluralistic society, we do not expect disagreement over established facts. Factual disagreement fuels concerns about the public's ability to form preferences in line with their values or interests (e.g., Hochschild 2001), hold representatives accountable (e.g., Bartels 2002), and productively deliberate with one another (Shapiro and Bloch-Elkon 2008; Muirhead 2013).

We consider two distinct explanations for differences in survey reports of factual beliefs: motivated learning and motivated responding. According to the former, when people are presented with information, they are more likely to deduce that the information supports a congenial conclusion than an uncongenial one (Jerit and Barabas 2012; Kahan 2013; Kahan et al. 2017). This bias in learning is thought to stem from the psychological drive to hold internally consistent beliefs that reflect positively on one's core attachments, such as political party, ideology, or cultural identity (Kunda 1990). Motivated learning is especially concerning from a normative standpoint. It suggests that even when people come across the same information, they can form very different beliefs about the conclusions the information supports. (Note that learning what conclusions the information supports is different from believing that those conclusions are correct.) Such bias makes gaps in factual beliefs harder to close: merely providing accurate information is even less likely to work, and may even backfire (see, e.g., Nyhan and Reifler 2010).

However, differences in survey reports of factual beliefs do not always reflect differences in what people believe. Instead, they may be artifacts of the survey response process. Respondents sometimes give congenial but inaccurate answers in response to factual questions even when they have accurate but uncongenial facts at hand (Bullock et al. 2015; Prior et al. 2015). Other times, respondents are ignorant, having no relevant cognitions, and they offer a congenial answer as their best guess (Luskin et al. 2013). In both cases, the survey response process inflates estimates of bias in factual beliefs. We extend this line of thinking to estimates of learning, the process that produces such beliefs. Like studies of stored cognitions, studies of learning may overstate bias if they do not account for motivated responding.

We reassess the extent to which people learn in a motivated manner by simultaneously measuring motivated learning and motivated responding in three separate studies. Building on the design in Kahan et al. (2017), in which respondents are shown data from a social scientific study and asked to report the conclusion the data support, we incentivize a random set of respondents to honestly report the conclusion they actually think the data support. Incentives reduce congeniality bias in reporting the study's result, suggesting that a portion of what appears to be motivated learning is in fact motivated responding. Correspondingly, we find no evidence that respondents selectively recall the data they were presented. However, there is considerable heterogeneity in responsiveness to the incentive treatment, depending on respondents' initial position on the issue under study. Moreover, incentivizing respondents to report uncomfortable conclusions supported by a study comes at a price: respondents become more skeptical about the study's credibility. In all, our findings suggest that motivated learning is less common than previous work suggests, but that motivated evaluation of information remains common.

Motivated Learning, Reporting, and Interpretation

Psychological motivations powerfully shape how people process information. Broadly, people are motivated both to reach accurate conclusions and to reach conclusions congenial to their existing beliefs (see Kunda 1990). Most research in political psychology focuses on the latter, which Kunda calls "directional" motivation. A great deal of research shows that directional motivations, like partisanship and ideology, bias evaluations of policy arguments (e.g., Lord et al. 1979; Taber and Lodge 2006; Bolsen et al. 2014) and of leaders (e.g., Bartels 2002; Lebo and Cassino 2007; Kim et al. 2010).

Motivated Learning

While most research on motivated reasoning focuses on subjective attitudes, motivated reasoning may also influence factual learning. Suppose, for instance, that some data support an unequivocal conclusion relevant to public policy, and that we ask an individual with an existing opinion on that policy to learn the conclusion the data support. The more directional goals outweigh accuracy goals, the greater the probability that the individual learns that the data support a congenial conclusion, even if that conclusion is incorrect. We refer to this phenomenon as motivated learning.

Motivated learning has received recent attention. For instance, Nyhan and Reifler (2010) find that partisans ignore corrective information that contradicts their ideological worldview. Similarly, Jerit and Barabas (2012) find that partisans selectively learn party-relevant factual information. The authors present partisans with facts that reflect either positively or negatively on Democrats or Republicans and find that partisans are more likely to learn congenial than uncongenial facts; for example, Democrats are more likely to learn about the success of the Troubled Asset Relief Program than about the size of the trade deficit. A major challenge in such experiments is cleanly manipulating information congeniality while holding constant other attributes, such as topic and difficulty. We now consider a recent study that does just that.

Kahan et al. (2017) cleverly repurpose a "covariance detection task" (see Gilovich 1991) to test whether people engage in motivated reasoning when processing policy-relevant data. Respondents see a \(2 \times 2\) table with data on the relationship between banning concealed carry and rates of crime. The table's column headings are manipulated so that the data support either the conclusion that banning concealed carry reduces crime or the conclusion that banning concealed carry increases crime. When asked what result the data support, respondents are more likely to answer correctly when the data support a congenial claim than when the data support an uncongenial claim. That is, liberal Democrats are more likely to report the correct result when banning concealed carry reduces crime than when banning concealed carry increases crime, while the reverse is true among conservative Republicans. Kahan et al. use the terms motivated numeracy and motivated cognition to describe the phenomenon.

The finding is consistent with several other studies that show that people evaluate congenial and uncongenial claims differently, using different evidentiary standards and investing different amounts of effort in processing the information. When evaluating congenial claims, people tend not to be as thorough in searching for evidence (e.g., Kruglanski and Webster 1996; Nickerson 1998), and evaluate available evidence more superficially and less skeptically (e.g., Chaiken and Maheswaran 1994). On the other hand, when evaluating uncongenial claims, people are more skeptical and invest greater processing effort (e.g., Ditto and Lopez 1992; Ditto et al. 1998; Dawson et al. 2002a). For instance, in the covariance detection task, respondents are more likely to only partially consider the data if doing so leads them to think that the data support a congenial conclusion (Dawson et al. 2002b).

Building on this psychological literature, Kahan et al. (2017) argue that our natural tendency to learn from data heuristically (i.e., using mental shortcuts) results in bias. When presented with tabular data, people tend to consider only the most salient datum in the table, for instance the largest number, to deduce the conclusion supported by the data (Gilovich 1991, p. 31). If heuristic processing yields a congenial answer, people tend to stop processing, concluding that the data support their beliefs. If heuristic processing instead yields an uncongenial result, people tend to look at the data more carefully to make sure they are correct. For instance, activating directional goals can motivate people to overcome the common error of neglecting 'cell D' in a \(2 \times 2\) table (Mata et al. 2015a, b). This imbalance in scrutiny produces a congeniality effect: people learn congenial facts more readily than uncongenial facts. Kahan et al. argue that this effect increases with numeracy, because only respondents with sufficient numerical ability are capable of learning the correct result by considering all four cells.
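
This stopping rule can be made concrete with a short simulation. The sketch below is our stylized rendering of the asymmetric-scrutiny account, not code from Kahan et al. (2017); the two accuracy rates are hypothetical.

```python
# Stylized simulation of asymmetric scrutiny: a quick heuristic pass is
# accepted when it yields a congenial answer, but triggers a careful
# second pass when it yields an uncongenial one. Both accuracy rates
# are hypothetical, chosen only to illustrate the mechanism.
import random

P_HEURISTIC_CORRECT = 0.4  # hypothetical: the salient-cell shortcut often errs
P_EFFORTFUL_CORRECT = 0.9  # hypothetical: checking all four cells usually works

def reports_correctly(correct_is_congenial: bool, rng: random.Random) -> bool:
    heuristic_right = rng.random() < P_HEURISTIC_CORRECT
    answer_is_congenial = (heuristic_right == correct_is_congenial)
    if answer_is_congenial:       # congenial answer: accepted as-is
        return heuristic_right
    return rng.random() < P_EFFORTFUL_CORRECT  # uncongenial: re-examine the table

rng = random.Random(0)
for congenial in (True, False):
    acc = sum(reports_correctly(congenial, rng) for _ in range(100_000)) / 100_000
    print(f"correct answer congenial={congenial}: accuracy ~ {acc:.2f}")
# Prints roughly 0.94 vs. 0.36: unequal scrutiny alone yields a
# congeniality effect, with no misreporting at the response stage.
```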

Aside from asymmetric scrutiny, selective perception may also contribute to motivated learning. A long line of research suggests that expectation structures cognition (Hastorf and Cantril 1954; Bechlivanidis and Lagnado 2013; Kahneman 2013). In our hurry to learn from data, for example, we may misperceive the data in ways that are consistent with our prior beliefs. When trying to distill information from a contingency table, people may misread the column or row labels in a way that suggests a congenial result, or they may misread the numbers in the table. In all, whether due to selective perception or an imbalance in scrutiny, people are thought to be more likely to learn correctly when the data are congenial than when they are uncongenial.

Motivated Responding

Motivated responding concerns what respondents report on surveys, rather than what they have learned or know. In particular, it occurs when people with the same underlying beliefs give congenial answers more often than uncongenial answers when asked about their beliefs. In all, it holds that survey responses to factual questions reflect a mix of what people believe and what they wish to be true (Luskin et al. 2013; Prior et al. 2015).

People engage in motivated responding for a variety of reasons. Some deliberately misreport as a way to express their attitudes. For example, a survey respondent who vehemently opposes President Barack Obama may not admit knowing that the unemployment rate declined between 2008 and 2016. The respondent may instead report a rise in unemployment to express their opposition. This kind of expressive self-presentation has been described as “cheerleading” (e.g., Gerber and Huber 2009). Others may misreport just to be consistent within a survey, ensuring that later answers do not contradict their earlier ones (e.g., Sears and Lau 1983; Lau et al. 1990; Wilcox and Wlezien 1993; Palmer and Duch 2001). Yet others may engage in motivated responding to indicate their disbelief in information. For instance, in Kahan et al. (2017), a respondent may pick the congenial answer even after figuring out that the data support an uncongenial conclusion as a way to express their disbelief in the putative data.

In addition to actively misreporting what they believe, motivated responding may take more passive forms. For example, a respondent may withhold their beliefs by selecting "Don't Know" or skipping a question. Alternatively, respondents who do not know the correct answer may report a congenial answer as their best guess. People may also engage in motivated responding without being consciously aware of it. For example, when asked a factual question in a survey, respondents may search their memory longer for congenial beliefs than for uncongenial ones. In all, a variety of reasons explain why people may engage in motivated responding. Our study does not attempt to disentangle these reasons. Instead, our aim is to measure the bias due to motivated responding in estimates of motivated learning derived from ordinary survey instruments.

While both motivated learning and motivated responding fall under motivated reasoning, there are two major differences. First, motivated responding is about the survey response process, as opposed to learning. Motivated responding affects what people say they have learned, not what they actually learn. Second, and relatedly, we suspect that part of motivated responding is just cheap talk that people engage in to publicly protect their core attachments and beliefs. To the extent that these public pronouncements are shallow, based not on what people deeply believe but on what they are prepared to say publicly, these reported "beliefs" are unlikely to shape respondents' attitudes and behavior. On the other hand, uncongenial beliefs, which respondents are reluctant to express, may nonetheless influence attitudes and behavior. It is therefore important to distinguish between beliefs and instrumental or shallow responses.

A pair of recent studies finds evidence that partisans engage in motivated responding on factual questions with political implications. Bullock et al. (2015) and Prior et al. (2015) find that partisans incorrectly report congenial beliefs even when they know, or could have inferred, a more accurate answer. Both studies uncover motivated responding by boosting respondents' accuracy motivation. Bullock et al. (2015) do so by offering respondents bonus payments for correct answers and for admitting their ignorance (by marking "don't know"), while Prior et al. (2015) use a combination of bonus payments for correct answers and textual appeals for responding accurately. Importantly, neither treatment provides any additional information. Therefore, any change in responses can be attributed to a change in respondents' motivation, rather than a change in their knowledge. Both studies find that, consistent with motivated responding, accuracy incentives substantially reduce partisan bias, by as much as half.

Because studies of knowledge and learning rely on similar survey instruments, the discovery of motivated responding on questions about stored knowledge suggests that estimates of motivated learning may also be inflated. To uncover motivated responding, we borrow a design feature of Prior and Lupia (2008), Bullock et al. (2015), and Prior et al. (2015), offering a random set of respondents small monetary incentives to accurately report their beliefs. To minimize the possibility that incentives affect the processing of information, we reveal the incentives only after respondents have seen the data and can no longer revisit them. These incentives increase respondents' motivation to honestly report which conclusion they think the data support. Respondents who would ordinarily offer a congenial answer as their best guess or as an act of political expression may be nudged to think more carefully or be more truthful. We therefore hypothesize that this treatment will attenuate the congeniality effect that Kahan et al. (2017) and others observe.

More specifically, we hypothesize that incentives will increase the probability of answering correctly when the correct answer is uncongenial. If respondents learn the uncongenial result correctly and then knowingly report an incorrect answer, incentives should encourage them to reveal their true belief. If respondents offer an incorrect, congenial response as their best guess, incentives should lead them to guess more evenhandedly. In both scenarios, incentives should increase correctness. However, we do not expect incentives to significantly increase correctness in the congenial condition, because there the congenial answer is the correct answer. We present our hypothesized pattern of results graphically in Fig. 1. The vertical axis represents the probability of answering correctly, and the dashed lines indicate the size of the congeniality effect. Incentives reduce the congeniality effect by increasing the probability of answering correctly in the uncongenial condition.

Fig. 1 Hypothesized pattern of results. Bars indicate the probability of correctly reporting the study's result by experimental condition. Dashed vertical lines indicate the congeniality effect, i.e., the difference in the probability of answering correctly between the congenial and uncongenial conditions. We hypothesize that incentives will reduce the congeniality effect by increasing the probability of answering correctly in the uncongenial, but not the congenial, condition
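
To fix ideas, the short sketch below encodes the hypothesized pattern in Fig. 1 with made-up probabilities; only the ordering of the cells, not the numbers, reflects our hypotheses.

```python
# Illustrative cell probabilities consistent with Fig. 1 (the values are
# invented; only the hypothesized ordering matters).
p_correct = {
    ("congenial", "no_incentive"):   0.55,
    ("uncongenial", "no_incentive"): 0.40,  # congeniality effect absent incentives
    ("congenial", "incentive"):      0.55,  # incentives should not move this cell
    ("uncongenial", "incentive"):    0.50,  # incentives raise correctness here
}

effects = {}
for inc in ("no_incentive", "incentive"):
    effects[inc] = p_correct[("congenial", inc)] - p_correct[("uncongenial", inc)]
    print(f"congeniality effect ({inc}): {effects[inc]:+.2f}")

# A negative difference-in-differences means incentives shrink the effect.
print(f"difference-in-differences: {effects['incentive'] - effects['no_incentive']:+.2f}")
```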

Motivated Interpretation

Directional motivations affect not just what people learn and report but also how credible people think a study is. Previous work suggests that people are more likely to question a study's credibility when its results are uncongenial than when they are congenial (e.g., Lord et al. 1979; Kunda 1990; Ditto and Lopez 1992). This phenomenon likely stems from the more general tendency to spend greater time and effort scrutinizing and refuting uncongenial claims than congenial claims, known as disconfirmation bias (e.g., Edwards and Smith 1996; Taber and Lodge 2006). We therefore expect a congeniality effect on subjective study ratings: respondents will rate the same study as more convincing and well executed when its result is congenial than when it is not. And we expect this congeniality effect to be more pronounced when we incentivize respondents to report the correct study result, because respondents who admit an uncongenial result should be even more likely to report that the study behind the result is unconvincing.

Research Design

To distinguish between motivated learning and motivated responding, we conduct three experiments that build upon the original design of Kahan et al. (2017). In particular, we add an orthogonal manipulation, offering participants a small monetary incentive to accurately report what they have learned. The \(2 \times 2\) design enables us to test whether incentives reduce the congeniality effect, which has been interpreted as evidence for motivated learning. In addition to measuring the outcome used by Kahan et al., we also measure subjective ratings of the study to examine whether the study’s congeniality influences its perceived credibility. Below we elaborate on each of these extensions.

In Study 1, we asked respondents to read a summary of a hypothetical study on gun control. A preface to the study described its purpose: a city government is trying to decide whether or not to ban private citizens from carrying concealed weapons and wants to know if doing so would increase or decrease crime. After the preface, the study was briefly summarized: researchers compared changes in annual crime rates in cities that had banned concealed carry with changes in annual crime rates in cities that had not banned concealed carry. A \(2 \times 2\) contingency table with the study’s putative results came next.

Following Kahan et al. (2017), we manipulated the conclusion supported by the study by switching the column labels. In Table 1, cities that banned concealed carry were more likely to experience a crime decrease relative to cities that did not ban concealed carry. This result can be deduced by comparing the ratios of cities in the first row (75:223 or about 1:3) and the second row (21:107 or about 1:5). Flipping the column labels produces the opposite result—the data now indicate that cities that banned concealed carry saw increases in crime (see Table 2 below).

Table 1 Oppose concealed carry
Table 2 Support concealed carry
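
The deduction the task requires can be written out in a few lines. In the sketch below, the table layout is an assumption made for illustration: we treat the first row as cities that banned concealed carry and the first column as crime decreases, which matches the 75:223 and 21:107 row ratios described above.

```python
# Deducing the supported conclusion by comparing row-wise ratios.
# Layout assumed for illustration: first row = cities that banned
# concealed carry, first column = crime decreases (75:223 and 21:107).
ban    = {"decrease": 75, "increase": 223}   # first-row ratio, about 1:3
no_ban = {"decrease": 21, "increase": 107}   # second-row ratio, about 1:5

def decrease_share(row):
    return row["decrease"] / (row["decrease"] + row["increase"])

print(f"banned concealed carry: {decrease_share(ban):.0%} saw crime decrease")
print(f"did not ban:            {decrease_share(no_ban):.0%} saw crime decrease")
# 25% vs. 16%: under this layout the data support the conclusion that
# banning concealed carry reduces crime. Swapping the column labels
# (Table 2) flips the supported conclusion.
```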

After presenting the summary of the study, we asked respondents whether cities with a ban were more likely to experience an increase or a decrease in crime than cities without a ban. This question serves as our primary dependent variable. Note that it is strictly factual in nature: it simply asks which of two descriptions is consistent with the data. The question does not ask respondents to assess a causal claim, evaluate gun control, or indicate their faith in the study. Respondents could not access the study description and table when picking which conclusion was supported by the data. At the end of the survey, respondents were debriefed and informed that the data were not real.

To measure motivated responding, we independently manipulated respondents' motivation to give the answer they thought was correct. We offered a random set of respondents a small nudge: an additional $0.10 for the correct answer. Keeping the amount small has the virtue of not raising respondents' suspicions; some respondents may take a larger amount as a cue that the uncongenial answer is the right one. As to whether the small nudge is sufficient, we cannot strictly say, but we note that Prior et al. (2015) uncover about the same amount of motivated responding by emphasizing the importance of accuracy without any extra money as they do by offering another $1 for the correct answer. To ensure that incentives did not affect how respondents processed the contingency table in the treatment condition, we withheld any information about the incentive until after they had seen the table and could no longer return to it. The control group was not offered incentives.

After measuring the primary dependent variable, we asked respondents to rate how convincing and how well done they found the concealed carry study. Each rating was measured on a 0–10 scale. We also asked respondents to recall the numbers in the contingency table at the end of the survey, in order to test whether respondents are more likely to remember congenial data than uncongenial data. To minimize respondent disengagement, we offered an additional $0.05 for each number recalled correctly.

In Study 2, we readministered the concealed carry task and added another task following it. In the second task, respondents were presented with a study on the impact of raising the minimum wage. Again, respondents were asked to indicate its result based on tabular data. The minimum wage task was very similar to the concealed carry task in design, with two important differences (aside from the change in topic). First, with the intention of making it easier to learn the correct result, we replaced cell frequencies with percentages in the table. The data suggest that the change had the intended effect, as there was a large increase in the percentage of correct responses. Second, we manipulated the study’s congeniality by switching the row labels instead of the column labels in the table. We believe this is a cleaner manipulation as it holds constant the increase-to-decrease ratio in each row, and simply changes the policy associated with each ratio. While lowering the task’s difficulty might change the congeniality effect observed, we do not expect our changes to affect the degree of motivated responding. Lastly, randomization in the second task was conducted independently of the first, but the sequence of the two tasks was fixed.

In Study 3, we readministered both the concealed carry and minimum wage tasks on a more representative sample. Following our hypothesis that incentives influence responses in the uncongenial condition, we presented all respondents with an uncongenial version of the concealed carry task (based on their pre-treatment attitudes) and randomized incentives as in Studies 1 and 2. This simpler design allows us to conserve resources while testing our central theoretical claim that incentives increase correctness in the uncongenial condition. Additionally, we replicated the full \(2 \times 2\) minimum wage task. The purpose of Study 3 was to gather confirmatory evidence and probe the generalizability of the estimates in Studies 1 and 2.

In each study, in order to identify respondents who would find each study's result congenial or uncongenial, we measured attitudes toward banning concealed carry and raising the federal minimum wage before the tasks (experiments). We expect respondents who oppose concealed carry to find a decrease in crime associated with a concealed carry ban congenial, and an increase in crime uncongenial; we expect the opposite among respondents who support concealed carry. The same logic applies to the minimum wage task. We measured party identification, political ideology, and demographics prior to the experiments in each study. "Question Wording" in the Supplementary Information (SI) contains a complete description of the three studies and the complete wording of each task and question. In Studies 2 and 3, we omitted recall questions and ratings of the minimum wage study due to concerns about the length of the survey.

Data

We recruited respondents from Amazon's Mechanical Turk (MTurk) in Studies 1 and 2. We recruited workers for both studies by advertising a task of completing a short survey on "how people learn." To assess whether our findings generalize beyond samples recruited on MTurk, we recruited respondents via Qualtrics in Study 3. While it is not a "gold standard," the Qualtrics sample is more representative of the general population, though its respondents appear to be less attentive and detail-oriented than MTurk workers. Study 1 was fielded in December 2013–January 2014, Study 2 in March–April 2015, and Study 3 in August 2016. For details of the recruited samples and how they compare to established benchmarks, see SI Table 1 in "Sample Characteristics".

While none of the samples are nationally representative, we can still learn a great deal from them. Multiple studies find that MTurk samples yield high-quality data and are more heterogeneous and representative than other common convenience samples, such as student samples (Buhrmester et al. 2011; Berinsky et al. 2012; Paolacci and Chandler 2014). Mullinix et al. (2015) replicate a broad array of experiments and find that treatment effects are broadly similar across samples. More generally, treatment effects vary across samples when they are strongly conditioned by covariates that themselves vary heavily across samples. We do not expect partisans on MTurk to differ enormously from partisans more generally with respect to motivated reasoning. Multiple studies using MTurk samples find partisan bias in both stored knowledge (e.g., Chambers et al. 2014; Ahler and Sood 2016; Chambers et al. 2015; Bullock et al. 2015) and political judgments (e.g., Arceneaux and Vander Wielen 2013; Lyons and Jaeger 2014; Crawford and Xhambazi 2015; Crawford et al. 2015; Thibodeau et al. 2015). In all, we think it likely that our treatment effects would be similar in a nationally representative sample.

Given the theoretical expectation that we should only observe motivated learning among respondents with sufficient numerical ability to complete the covariance detection task, we screened for high-numeracy respondents using a numeracy quiz in Study 1. The numeracy quiz was composed of the five easiest questions in Weller et al. (2012) (see “Numeracy Screener” in SI for exact items). We invited respondents answering four or more items correctly to participate in the full study. We use a threshold of four because Kahan et al. (2017) find that the median respondent answers four items correctly on the full nine-item scale. In Studies 2 and 3, we invited all respondents to complete the main task, irrespective of numeracy, to ensure that our findings in Study 1 were not contingent on the relatively numerate MTurk sample. In “Respondents Enrolled, Screened, Allocated, and Analyzed” in the SI, we track the number of respondents in each phase of the concealed carry experiment. We find that low- and high-numeracy respondents are similar in terms of party identification, ideology, and demographics (see SI Table 1 in “Sample Characteristics”), but we also present their results separately in “Does Numeracy Condition Treatment Effects?” in the SI.

In Study 1, we recruited 1207 respondents and invited 785 (65% of sample) who passed the numeracy quiz to participate in the full survey. Our main analyses include 686 respondents (87% of screened sample) reporting a position on a concealed carry ban, which is necessary to code congeniality. Of them, 34% opposed concealed carry (i.e., favored a ban) and 66% supported concealed carry (i.e., opposed a ban). In Studies 2 and 3, we recruited another 947 and 1062 respondents, respectively. Of those indicating a position, similar percentages to those in Study 1 opposed concealed carry: 36% in Study 2 and 40% in Study 3. The vast majority of respondents were in favor of raising the federal minimum wage: 85% in Study 2 and 65% in Study 3.

Results

We begin by presenting results from the concealed carry task, first pooling data from Studies 1 and 2, followed by results from Study 3. We separate out Study 3 because its concealed carry task included only the uncongenial condition. We then turn to results from the minimum wage task and end by describing the impact of the treatments on respondents' subjective study ratings.

If people learn in a motivated manner, the percentage of respondents answering correctly when the study's result is congenial should be greater than when the study's result is uncongenial. Respondents who oppose concealed carry should be more likely to answer correctly if the data support the conclusion that crime is more likely to decrease in cities with concealed carry bans than in cities without such bans. Among respondents who support concealed carry, the reverse should be true. We thus examine whether the congeniality manipulation increases the probability of answering correctly.

Before analyzing data from the covariance detection tasks, we check to see if partisanship, ideology, and demographics are balanced across the experimental conditions. The average p-value of cross-condition comparisons is .42 in Study 1, .47 in Study 2, .56 in Study 3, and .48 overall (see SI Table 2 in “Covariate Balance”). We also confirm that the first experimental task did not affect behavior in the second (see SI Table 3 in “Assessing Spillover”). In all, the data suggest that randomization was successful, so we move on to analyzing data from the concealed carry experiment.
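
As a concrete illustration of this kind of check, the sketch below runs covariate-by-covariate tests across conditions and averages the p-values. It uses simulated data and hypothetical variable names; it is not the code behind SI Table 2.

```python
# Simulated balance check (not the code behind SI Table 2): t-test each
# pre-treatment covariate across conditions, then average the p-values.
# Variable names and distributions are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 600
incentive = rng.integers(0, 2, size=n)          # 0 = no incentives, 1 = incentives
covariates = {
    "party_id": rng.integers(1, 8, size=n),     # 7-point party identification
    "ideology": rng.integers(1, 8, size=n),     # 7-point ideology
    "age":      rng.normal(40, 12, size=n),
}

p_values = []
for name, x in covariates.items():
    _, p = stats.ttest_ind(x[incentive == 0], x[incentive == 1])
    p_values.append(p)
    print(f"{name}: p = {p:.2f}")
# Under successful randomization, p-values are uniform with mean ~.5.
print(f"average p-value: {np.mean(p_values):.2f}")
```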

Figure 2 plots the percentage of respondents answering correctly in the concealed carry task by experimental condition across Studies 1 and 2. We first consider the percentage correct among concealed carry supporters in the absence of incentives (plotted in a, left). When the result is uncongenial (i.e., pro-ban), only 42.6% of respondents mark the right answer. When the result is congenial (i.e., anti-ban), the percentage increases to 54.6%. Thus, simply changing the study's result from uncongenial to congenial (by swapping column headers) increases the probability of answering correctly by 12.0 percentage points (b). The pattern is similar among concealed carry opponents in the No Incentives condition (c, left). When the result is uncongenial (i.e., anti-ban), 41.0% answer correctly. When the result is congenial (i.e., pro-ban), correctness increases to 55.5%. The congeniality effect is 14.4 points (d). Thus, in the absence of incentives, the congeniality effects among both concealed carry supporters and opponents are statistically significant.

Fig. 2 Concealed carry task results (Studies 1 and 2 pooled). Panels on the left display the percentage of concealed carry supporters (a) and opponents (c) correctly indicating the study result by experimental condition. Panels on the right display the congeniality effect by incentive condition, as well as the difference-in-differences (DiD), among concealed carry supporters (b) and opponents (d). Vertical lines indicate 95% confidence intervals. Only respondents who passed the numeracy screener and indicated a position on concealed carry are included (686 in Study 1 and 604 in Study 2)

We next examine the extent to which incentives reduce these congeniality effects, which at first blush appear to be evidence of motivated learning. As we note earlier, the incentive treatment was administered in such a way that it could not affect how respondents initially processed the data: incentives were revealed only after respondents had seen the data and could no longer go back to them. If we administered the incentive treatment as intended, respondents should be as good at recalling the data in the No Incentives condition as in the Incentives condition. We tested this hypothesis with the recall questions at the end of the survey. The data suggest that incentives had no impact on the accuracy of recall (see SI Tables 5 and 6 in "No Evidence of Selective Recall"). Thus, it is unlikely that any treatment effects we see are explained by greater attention to the data when incentives are offered.

Examining Fig. 2b, we see that offering incentives to concealed carry supporters does not reduce bias. The congeniality effect is 13.9 points with incentives, which is almost indistinguishable from the congeniality effect in the absence of incentives (difference-in-differences is an insignificant 1.9 points). Since the congeniality effect remains substantial regardless of incentive condition, it appears that concealed carry supporters indeed learn in a motivated manner.

Data from opponents of concealed carry, however, tell quite a different story (Fig. 2d). Here, it appears that motivated responding masquerades as motivated learning. Incentives lower the congeniality effect from 14.4 points to an insignificant −3.9 points. The difference-in-differences is −18.3 points and statistically significant (s.e. = 9.3, \(p = .05\)). Incentives completely wipe out the bias in answering the question about the study's result. Moreover, consistent with our hypothesis, the reduction in bias is entirely due to an increase in correctness in the uncongenial condition (16.0 points, s.e. = 6.4, \(p < .05\)), rather than any change in the congenial condition.
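
For readers who want the arithmetic spelled out, the sketch below reproduces the difference-in-differences calculation for concealed carry opponents from the cell percentages reported above. The per-cell sample sizes are hypothetical (the paper's standard errors come from the pooled microdata), and the congenial-with-incentives percentage is backed out from the reported −3.9-point effect; small discrepancies with the text reflect rounding.

```python
# Difference-in-differences from four cell proportions (concealed carry
# opponents, Studies 1 and 2 pooled). Percentages are from the text;
# cell sizes (n = 115) are hypothetical, chosen only for illustration.
from math import sqrt

def var_prop(p, n):
    # Sampling variance of a sample proportion
    return p * (1 - p) / n

cells = {  # (percent correct, assumed cell n)
    ("congenial", "none"):     (0.555, 115),
    ("uncongenial", "none"):   (0.410, 115),
    ("congenial", "incent"):   (0.531, 115),  # implied by the -3.9-point effect
    ("uncongenial", "incent"): (0.570, 115),  # 41.0 + 16.0 points
}

effect = {inc: cells[("congenial", inc)][0] - cells[("uncongenial", inc)][0]
          for inc in ("none", "incent")}
did = effect["incent"] - effect["none"]
# The four cells are independent, so the variances add.
se = sqrt(sum(var_prop(p, n) for p, n in cells.values()))
print(f"congeniality effect: {effect['none']:+.3f} without incentives, "
      f"{effect['incent']:+.3f} with incentives")
print(f"difference-in-differences: {did:+.3f} (se ~ {se:.3f})")
```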

Results from Study 3 are similar to those from Studies 1 and 2 for both concealed carry supporters and opponents. Recall that Study 3 included only the uncongenial version of the concealed carry task. Figure 3 displays the percent correct among concealed carry supporters (a) and opponents (b) by incentive condition. Overall, Study 3 respondents had more trouble correctly identifying the study's result than respondents in Studies 1 and 2. For instance, only 33.6% of concealed carry supporters correctly identified the uncongenial study's result without incentives, compared with 42.6% in Studies 1 and 2. More relevant for our purposes, concealed carry supporters are again essentially immune to the incentive treatment: the percentage correct among this group is almost identical when we offer accuracy incentives (33.3%). There is no evidence of motivated responding here.

Concealed carry opponents, on the other hand, once again exhibit a pattern of motivated responding. While only 25.6% answer correctly without incentives, 32.5% answer correctly when offered incentives, resulting in a treatment effect of 6.9 points (s.e. = 4.4, \(p < .06\)). While the magnitude of the effect is smaller than in Studies 1 and 2, it is still non-trivial. In all, the evidence from the concealed carry experiment suggests that motivated responding introduces substantial amounts of bias in estimates of motivated learning.

Fig. 3 Concealed carry task results (Study 3). Panels display the percentage of concealed carry supporters (a) and opponents (b) correctly indicating the study result by experimental condition. Vertical lines indicate 95% confidence intervals

We now turn our attention to the minimum wage task, which we included in Studies 2 and 3, to probe the degree of motivated learning and responding on a different issue, using a slightly different experimental design. The overall percent correct was high in this task (87% in Study 2 and 77% in Study 3), which is unsurprising given our decision to make this task easier on respondents (we replaced frequencies with percentages). Figure 4 summarizes the results from this experiment, pooling respondents in Studies 2 and 3.

Fig. 4 Minimum wage task results (Studies 2 and 3). Panels on the left display the percentage of opponents (a) and supporters (c) of raising the federal minimum wage correctly indicating the study result by experimental condition. Panels on the right display the congeniality effect by incentive condition, as well as the difference-in-differences (DiD), among opponents (b) and supporters (d). Vertical lines indicate 95% confidence intervals

Our results from the minimum wage task are more consistent with motivated responding than motivated learning, on balance. Among opponents of raising the minimum wage, we see the familiar pattern that opponents of concealed carry display in the first task. The percentage of these respondents correctly identifying the result of the minimum wage study is 62.0% in the uncongenial condition and 89.3% in the congenial condition (a). This dramatic congeniality effect is significantly attenuated by incentives. Specifically, it decreases from 27.3 to 16.3 points, which is a 40% reduction (see b). Again, this reduction is due primarily to an increase in correctness in the uncongenial condition. Incentives increase the percent correct by 8.4 points in the uncongenial condition, but their effect is null in the congenial condition.

Supporters of raising the minimum wage do not behave in a manner consistent with motivated learning (c). In fact, the congeniality effect is a significant −10.7 points without incentives, indicating that respondents are actually less likely to correctly report a congenial result than an uncongenial one. In this case, the theoretical expectation for incentives is unclear, because the bias in the control condition is consistent with neither motivated learning nor motivated responding. We do not expect incentives to significantly alter behavior if there is no bias to reduce. Indeed, incentives do little to affect responses in either the uncongenial or the congenial condition, so the congeniality effect remains negative with incentives (d). Nevertheless, this finding suggests that congeniality bias in factual learning is even less common than the literature suggests. In sum, across the two experimental tasks in our three studies, we see some evidence of motivated learning but also quite a bit of evidence for motivated responding.

Biased Study Ratings

Does the congeniality of the hypothetical study’s result affect how positively respondents rate the study? Recall that after the concealed carry task in both Studies 1 and 2, we asked respondents to rate how “well done” and “convincing” they found the study on a 0–10 scale (see “Question Wording” in SI for exact wording). We find that the two ratings are strongly correlated, so we average them into a single rating (Cronbach’s \(\alpha\) = .82).
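
For concreteness, the sketch below shows the scale construction on simulated ratings: average the two items and compute Cronbach's alpha for the two-item scale. The data-generating numbers are invented; only the procedure mirrors the step described above.

```python
# Two-item scale construction on simulated 0-10 ratings: average the
# items and compute Cronbach's alpha. The data here are invented; the
# actual ratings yield alpha = .82.
import numpy as np

rng = np.random.default_rng(0)
quality = rng.uniform(0, 10, size=500)                        # latent evaluation
well_done  = np.clip(quality + rng.normal(0, 1.5, size=500), 0, 10)
convincing = np.clip(quality + rng.normal(0, 1.5, size=500), 0, 10)

items = np.column_stack([well_done, convincing])
k = items.shape[1]
# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of sum)
alpha = k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum()
                       / items.sum(axis=1).var(ddof=1))
rating = items.mean(axis=1)                                   # the averaged rating
print(f"Cronbach's alpha = {alpha:.2f}")
```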

In Fig. 5, we plot the effect of the congeniality manipulation on this average rating among concealed carry supporters (a) and opponents (b). We estimate the overall congeniality effect and then disaggregate it by incentive condition. Consistent with our expectations, congeniality affects ratings, but more so when we incentivize respondents. Among concealed carry supporters, the overall congeniality effect is a marginally significant .25 (\(t = 1.53\), \(p = .06\)). Drilling down, we find a null effect without incentives and a significant increase of .41 with incentives (\(t = 1.78\), \(p = .04\)). Similarly, among concealed carry opponents, the overall effect is .20, which is in the hypothesized direction but not significant. And again, the effect only appears in the Incentive condition (\(t = 1.41\), \(p = .08\)).

Fig. 5 Congeniality effect on study ratings. Figure displays the effect of the congeniality manipulation on average study ratings, measured on a 0–10 scale, among concealed carry supporters (a) and opponents (b). Data are pooled from Studies 1 and 2. Effects are calculated overall and disaggregated by incentive condition. Vertical lines indicate 90% confidence intervals

One possible explanation for this pattern is that when we incentivize respondents to admit having learned an uncongenial fact, they express displeasure via the study ratings. If so, we should see a greater congeniality effect among respondents who correctly report the study's result than among those who answer incorrectly. Indeed, we find that the congeniality effect in the Incentive condition occurs only among respondents answering correctly. Among concealed carry supporters answering correctly, the congeniality effect on study ratings in the Incentive condition is 1.05 (\(t=3.31\), \(p < .001\)). Among concealed carry opponents answering correctly, the effect in the Incentive condition is 1.39 (\(t=3.43\), \(p < .001\)). These findings are merely suggestive, because answering correctly is, of course, an endogenous variable.

Discussion

Increasing affective polarization (see, e.g., Iyengar et al. 2012; Iyengar and Westwood 2015) has brought long-standing concerns over motivated reasoning to the fore. These concerns were amply highlighted during the recent presidential election, when numerous stories made the rounds of people quickly latching on to congenial information, whatever its source. Motivated learning is related to such incidents, but it refers to something even more alarming: people coming across the same information supporting an unambiguous conclusion and yet walking away with different beliefs about what that information supports.

Motivated learning is particularly troubling because it has the potential to upend the benefits bestowed by the information age: easy access to reliable, trustworthy, objective information about a variety of politically relevant topics. Motivated learning means that people may possess different facts even after being exposed to the same information, and may reach preferences very different from those they would hold had they learned in an unbiased manner (Gilens 2001; Hochschild 2001). Given that motivated learning facilitates disagreement over what the facts are, it also implies that the possibility of democratic deliberation and compromise is slimmer still (Shapiro and Bloch-Elkon 2008; Muirhead 2013).

Our findings confirm that motivated learning occurs in some cases, but also suggest that estimates of motivated learning are upwardly biased. The concealed carry experiment results suggest that supporters of concealed carry likely learn in a motivated manner. Manipulating the congeniality of the data through a minor change has a significant impact on the probability of reporting the correct answer among these respondents. And the accuracy incentives fail to change this tendency. On the other hand, a portion of what is thought to be motivated learning is instead motivated responding. When other respondents are offered a mere $0.10 to report their beliefs accurately, estimates of motivated learning decline sharply. And given that incentives could not have affected how respondents initially processed the data, incentives very likely identify the artifactual component of the evidence for motivated learning.

However, there are other potential explanations for why incentives reduce estimates of motivated learning. The lure of making additional money may cause respondents to choose the answer that they believe the experimenter favors, rather than the one they think is right. Or respondents may take monetary incentives as a cue that the congenial answer is incorrect. In both cases, the decline would be artifactual. We explore both possibilities, finding little empirical support for either (see "Assessing Experimenter Demand Effects" in the SI, and in particular SI Tables 9 and 10). On balance, the data suggest that incentives reduced bias in estimates of motivated learning rather than increased it.

It is possible that the data, even accounting for incentives, still overstate the extent to which people learn (or mislearn) in a biased manner. A non-trivial proportion of respondents in the experimental tasks likely tune out, either because they find the wording too complex or because they are uninterested in the question. Such respondents may pick an answer by taking a blind guess or by relying on their priors, aware that they have not really learned the result supported by the study. Other respondents may use cheap heuristics to deduce the correct result, and some of them likely know the fallibility of doing so and prorate their certainty in their answer accordingly. A simple correct/incorrect scoring does not capture either of these possibilities, instead treating each answer as evidence of learning a particular result. To assess these concerns, in Study 3 we asked people how confident they were about their answers after they had selected them. Only 13% of respondents are certain of their answer in the concealed carry task without incentives. Even fewer, 10%, are both certain and incorrect. This suggests that the proportion of people who become confidently misinformed due to motivated learning, which is the gravest concern, is not very large.

Two other pieces of evidence suggest that motivated learning is less common than conventional estimates reveal. First, we find that respondents recall the data in a largely unbiased manner. Second, among those who support increasing the minimum wage, the congeniality effect is negative: respondents were more likely to report the correct result in the uncongenial condition than in the congenial condition. This is exactly the opposite of what the theory of motivated learning would predict, which suggests that motivated learning may not be a very common feature of learning more generally.

Lastly, data from the minimum wage task suggest that when the task is made easier, motivated learning all but disappears. This may happen because when the truth is transparent and easy to grasp, even people who are otherwise prone to motivated reasoning have trouble denying it (see also Bisgaard 2015, who shows that when economic conditions are unambiguous, as in the 2008 recession, partisans' factual beliefs about the economy converge). The finding is consistent with bounded rationality, a mechanism proposed for motivated learning: when little effort is required, motivated learning is perhaps less of an issue. It also suggests that treatments designed to teach people how to correctly infer results from a contingency table ought to prove efficacious, as should treatments that give people more time to learn or that incentivize attention.

More work is needed to understand why incentives reduced the congeniality effect among those who oppose concealed carry, but not among those who support it. One possibility is that supporters of concealed carry have a stronger affective reaction to the issue and experience a greater directional pull toward their preferred conclusion. Another possibility is that this group of respondents differs on certain traits, such as need for cognition, that likely affect the extent to which people learn in a biased manner (e.g., Arceneaux and Vander Wielen 2013).

The fact that incentives reduce estimates of motivated learning has implications for survey measurement. It suggests that accuracy incentives may be needed to obtain unbiased estimates of certain variables, such as factual beliefs with partisan implications. In extending this method to experiments on learning, our study contributes to the burgeoning literature on motivated responding, which combines insights from research on survey satisficing with research on group-based reasoning and affect (Bullock et al. 2015; Prior et al. 2015).

To put the results in perspective, however, note that estimates of motivated learning and motivated responding come from a unique task. Interpreting a contingency table is unusual for many people. The data in the table are also arranged so that common heuristics (e.g., focusing on the upper-left cell) always lead to the wrong answer. While news media occasionally present studies in tabular or graphical formats, it is not the norm. And in such cases, it is likely that training people to interpret tables, or incentivizing them to process the data attentively, will significantly reduce bias.

Finally, the results suggest that incentivizing accurate reporting comes at a price. Respondents rate the study as less well done and less convincing when incentivized to report an uncongenial fact. Thus, while it is possible to increase accuracy in reported factual beliefs, doing so may increase bias in interpretations of those facts. Our finding is consistent with other research showing that even when people agree on factual information, they nevertheless tend to interpret the information in a motivated manner (e.g., Gaines et al. 2007; Bisgaard 2015). While the observed reduction in motivated learning is welcome news for the prospect of democratic accountability, reducing differences between groups in one place may increase differences between them in another.