Introduction

The problem of free will is one of the oldest in philosophy. Our intuitive sense that we have the capacity to decide between two or more alternative courses of action has provoked a diversity of interpretations. This is unsurprising because of the connections between our intuitive sense of free will and concepts of moral responsibility, just reward, individual autonomy and dignity, genuine love or friendship (O’Connor 2016), often in conjunction with social identity or specific religious beliefs (Baumeister et al. 2010). Enlightenment-era Scottish philosopher David Hume is largely responsible for crystallizing the issue of free will into two primary positions: compatibilism, which posits that a deterministic universe is consistent with freedom of choice, and incompatibilism, which holds that determinism and free will are inherently at odds (Hume 1975). Within this framing, the question of free will is concentrated on whether we have a particular kind of causal control of our decisions and actions. Hume’s conviction that free will and determinism are ultimately compatible has been interpreted in different ways, but one of the most straightforward interpretations hinges on Hume’s definitions of “liberty” and “necessity”. If free and responsible action must necessarily be caused by an agent, then it must be compatible with a deterministic basis for cause and effect (Russell 2016; Hume 1975).

The psychologist William James, by contrast, famously called compatibilism “a quagmire of evasion” and struggled for much of his life to reconcile his understanding of a law-governed, deterministic universe with the subjective experience of having free will (James 1884). Despite developments in our understanding of determinism (Earman 1986; Ismael 2016) and a rich set of distinctions that reveal a complex set of phenomena associated with our intuitive sense of free choice (Kane 2002; O’Connor 2016), a number of contemporary researchers have questioned whether we have free will at all (compatibilist or incompatibilist). Empirical studies in neuroscience and cognitive psychology have been central to this line of questioning. Experiments designed to ascertain the relationship between our awareness of a conscious decision and the brain activity involved in carrying out an associated physical action seem to suggest that we do not have conscious control of the initiation of our actions and our impression that we do is illusory (Libet 1985). However, there is little consensus on how to interpret these experiments (Mele 2009).

“Experimental philosophy” has emerged recently as a field of inquiry that seeks, among other things, to understand whether and to what extent the intuitions of everyday people align with those used by philosophers to formulate key distinctions, such as compatibilism versus incompatibilism. The field makes use of empirical data and experimental methods to investigate regularities and variation in the attitudes and behaviors of everyday people to shed light on the psychological processes and experiential circumstances that predict beliefs and interpretations about key concepts. Some of its findings have been surprising. On a number of the issues that have been investigated, lay intuitions conform poorly to the space of possibilities formulated within traditional philosophy. Several studies have found that individual differences in personality predict philosophical disagreement to a surprisingly large extent, prompting some researchers to speculate that the heritability of these traits may help explain the persistence of unresolved philosophical disputes (Feltz and Cokely 2012).

Free will has been a salient topic in these endeavors. For example, experimental philosophers have asked questions about the relationship between moral responsibility and free will by surveying the general public on what they actually believe (Nahmias et al. 2005), or by seeing whether these beliefs show significant differences in culture (Weinberg et al. 2001) or even gender (Adleberg et al. 2014). Although these studies have illuminated important patterns in how people conceptualize free will under certain conditions and change their conception in light of specific factors, no unifying perspective has been identified. The empirical evidence does not provide unqualified endorsement of whether most people adopt compatibilist or incompatibilist orientations to the concept of free will (Chan et al. 2016), or even whether this distinction is relevant to what most people believe about free will and determinism.

While psychology and philosophy have been wrestling with the concept of free will for centuries and millennia (respectively), behavior genetics provides a novel approach that can address questions about how people perceive free will and what factors contribute to their perceptions. Implicit in traditional debates about “nature vs. nurture” is the empirical finding that features of our biological constitution and features of our social and developmental experience make differential contributions to human behavior (Tabery 2014). Studies of monozygotic twins—a standard methodology utilized in behavior genetics—have shown for decades that twins can exhibit uncanny similarities as a direct consequence of genetic similarity. These studies also tend to engender a fascination with free will and determinism, especially as misunderstandings about the nature and meaning of heritability are thought to sometimes foster maladaptive social attitudes (Gericke et al. 2017). Intriguingly, the relationship between our intuitions about free will and knowledge about the heritability of human traits has remained largely unstudied. For example, on the standard framing of compatibilism versus incompatibilism, a deterministic universe would have the same logical implications for free will regardless of whether genes or environment are more responsible for determining our behavior. And yet it seems plausible that people might have different intuitions about how these types of causal factors are related to the concept of free will.

Perhaps just as interesting is the fact that the small number of existing studies on attitudes about genetic determinism have generally considered only magnitude rather than accuracy of these estimations. When examined through the lens of modern behavior genetic findings, multiple meta-analyses and other large-scale twin studies present the opportunity for comparing lay estimates of genetic contributions to different human traits with their empirical heritability estimates. Various public surveys have been conducted to assess lay knowledge about genetics (e.g., Carver et al. 2017), but have not examined individual differences in accuracy in the context of causes and consequences of genetic determinism or empirical heritability estimates. Conversely, Keller’s Genetic Determinism Scale (2005) characterizes high scores on genetic determinism as being associated with “prejudice and in-group bias”, but does not consider whether different estimates could be considered a more or less accurate way of viewing reality. The existing literature has also left open the question of other potential causes and consequences of more or less accurate perceptions of the genetic contribution to different traits in everyday life. While existing research indicates that laypeople do incorporate knowledge about genetics into their understanding of human behaviors and motives (Condit et al. 2006), little is known about the factors that influence the accuracy of an individual’s perception. For example, is the tendency to ascribe an accurate balance of genetic and environmental influence to various behaviors something that can be learned over time, or is this capacity itself innate? Is general accuracy associated with variables other than education about genetics? 
In a recent study of primary school teachers in the United Kingdom and their beliefs about the influence of genetic and environmental factors on educationally relevant traits, more accurate beliefs were found to be modestly associated with having taught older children (Crosswaite and Asbury 2018), providing a tantalizing but limited suggestion that experience may in part guide accuracy about perceptions of the role of genetics in human behavior.

These unexplored questions suggest that an approach informed by findings from behavior genetics is uniquely equipped to answer two important questions. First, how do attitudes about free will and determinism relate to what people believe about the genetic and environmental contributions to human behavior, and how well do these beliefs align with empirical findings from published heritability estimates? Second, what influences do genetic and environmental factors actually have on beliefs about free will and determinism?

The present study explores answers to the first question by surveying two independent samples of participants via Amazon Mechanical Turk on several existing measures of belief in free will and determinism, political orientation, religiosity, and a suite of demographic questions including years of education, age, number of children, and marital status. Of particular interest is whether and to what extent beliefs about free will and determinism, including on novel subscales that discriminate between genetic and environmental determinism, relate to lay estimates of the genetic and environmental contributions to 21 human traits. These include both physical and psychological traits in both abnormal and normal dimensions, such as height, schizophrenia, intelligence, and sexual orientation. The intention of surveying these judgments is to ascertain how individuals with little or no genetics education perceive the relative contribution of genes and environment to individual differences in these traits, and how these judgments relate to other measured variables. This positions us to probe questions about the formation and consequences of these beliefs, as well as to ascertain how accurately lay estimates of genetic influence on these traits reflect published estimates from meta-analyses and large-scale twin studies. For example, do people tend to ascribe consistently different proportions of genetic and environmental influence to different types of traits; e.g., behavioral versus physical, or normal versus abnormal? Some of the answers to these questions are predicted to be informative in addressing the second question, which is currently under investigation in a large sample of adopted siblings from the Minnesota Center for Twin and Family Research.

Methods

Sample and demographics

Amazon Mechanical Turk was used for recruiting subjects to complete an online questionnaire for monetary compensation in two studies across several weeks. Mechanical Turk (MTurk) is an open online marketplace where participants (“workers”) can choose to complete a “human intelligence task” (HIT), which has been created by businesses or researchers (“requesters”), using an online platform for data collection and compensation. MTurk has been found to produce social science data that are at least as reliable as those obtained through traditional methods (Buhrmester et al. 2011). Data quality can be further enhanced by restricting and filtering on certain criteria, as more rigorous exclusion methods for MTurk samples have been found to bolster statistical power (Thomas and Clifford 2017). In the present study, response validity was optimized by requiring participants to have completed at least 100 previous HITs on MTurk, at least 90% of which must have been approved as valid by the HIT’s requester. This helps to ensure that all participants have a requisite amount of valid experience using MTurk. Participants had to be at least 18 years old and located in the United States.

Data collection was initiated originally as a pilot study to help clarify the utility of different free will and determinism scales, including both novel measures of attitudes about genetic contributions to behavioral traits and the relationships of these to demographic and personality criteria (see “Measures”), in order to validate these measures for possible use in a subsequent adoption study. This initial sample recruited approximately 300 participants via MTurk and these participants were surveyed on all measures reported in the current study. A larger second sample of approximately 800 participants was recruited subsequently through MTurk, and this sample was surveyed only on the “lay estimates of genetic influence on traits” (see “LEGIT” under “Measures”) and the demographic items. The rationale for the larger but truncated second sample is twofold: first, a larger number of participants was thought to be useful for examining the internal patterns and relationships of the novel LEGIT items at higher resolution while requiring each participant to spend relatively little time answering the items. Second, the recruitment of an additional sample enabled us to make one change in a demographic question to ask the number of children, rather than just whether or not each participant has children.

The recruitment method and required criteria were identical across both samples. The randomly assigned identifier number matched across both samples for a total of 68 participants, and these individuals’ responses on the second survey were removed from the final analyses, netting a total of 1041 unique participants across both samples. The initial sample (Sample 1) consisted of 301 unique participants (42.9% female) with 50.8% falling within the “25–34” age range. The second sample (Sample 2) consisted of 740 unique participants (48.2% female) with 46.2% falling within the “25–34” age range. Both samples responded to a demographic section that assessed age range, gender, marital status, educational degree, years of education, working status, approximate number of hours worked in a week, and political views. The demographic portion of the survey administered to Sample 2 included additional questions on the range of household income and the number of children for each participant. Demographic distributions were similar for most variables across the two samples. Frequency data for each demographic criterion are summarized for both samples in Table 1.
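The removal of repeat participants can be illustrated with a minimal Python sketch. It is not the procedure actually used (the analyses were run in SPSS and R); the record layout, the `worker_id` field name, and the toy identifiers are all hypothetical.

```python
def drop_repeat_participants(sample1, sample2, key="worker_id"):
    """Return Sample 2 without rows whose identifier already appears in Sample 1."""
    seen = {row[key] for row in sample1}
    return [row for row in sample2 if row[key] not in seen]


# Toy records with made-up identifiers (not study data).
sample1 = [{"worker_id": "A1"}, {"worker_id": "B2"}, {"worker_id": "C3"}]
sample2 = [{"worker_id": "B2"}, {"worker_id": "D4"}, {"worker_id": "E5"}]

sample2_unique = drop_repeat_participants(sample1, sample2)
combined = sample1 + sample2_unique
print(len(sample2_unique), len(combined))
```

With the reported counts, the same logic leaves 740 unique Sample 2 participants once the 68 repeat responses are dropped, and 301 + 740 = 1041 unique participants overall.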

Table 1 Percentages for both samples on a selection of demographic variables

A chi-square test of independence was performed across samples for each demographic criterion listed in Table 1 that was shared across samples. In cases where bins contained very few participant responses (e.g., “other” for gender), the chi-square test was repeated without inclusion of the small bin(s). The response frequencies for each criterion did not differ significantly across the two samples, though “Do you have children? (yes/no)” was marginal (χ²(1) = 6.42, p = .011), reflecting the greater percentage of participants who reported having children in Sample 2 (44.3%) than in Sample 1 (35.3%).
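For readers unfamiliar with the test, the following Python sketch shows a Pearson chi-square test of independence for a 2 × 2 table. It is an illustration under stated assumptions rather than the analysis code: the counts in `toy_table` are hypothetical, no continuity correction is applied, and the p-value helper is valid only for one degree of freedom.

```python
import math


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, p_value_df1(stat)


# Hypothetical counts (rows: samples; columns: yes / no), not the study data.
toy_table = [[40, 60], [90, 110]]
stat, p = chi_square_2x2(toy_table)
print(round(stat, 3), round(p, 3))

# Converting the statistic reported for the "children" item to a p-value.
print(round(p_value_df1(6.42), 3))
```

The last line converts the reported χ²(1) = 6.42 to a p-value of roughly .011, which falls just above the p < .01 threshold used in this study.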

Measures

In addition to the demographic questions described above, participants in Sample 1 (N = 301) answered a total of 184 questions assessing beliefs about the causes of human behavior, personality, religion, knowledge of basic genetics, attitudes toward current affairs, and beliefs about the genetic and environmental contributions to variation in 21 human traits. Participants in Sample 2 (N = 740) were surveyed only on beliefs about contributions to variation in the same 21 traits, in addition to the demographic questions. All scales used are described in the following sections.

Free will and determinism scales

Participants in Sample 1 answered a series of 69 questions on their opinions about the causes of human behavior. Questions were extracted from four well-known measures of human agency, described below. Descriptive statistics for all measures of free will and determinism used in Sample 1 (N = 301) are shown in Table 2, along with comparisons of alpha reliability between published scales and the altered or truncated scales in the current sample. Alpha reliabilities for all scales and subscales were robust, generally meeting or exceeding full-length, published alphas where available. Mean scores for each scale are constructed from the same 5-point Likert scale.

Table 2 Descriptive statistics and alpha reliabilities for each scale and subscale used for measuring attitudes on the causes of human behavior

The Free Will and Determinism Scale-Plus (FAD+; Paulhus and Carey 2011) is one of the most widely used self-report measures of free will/determinism. It consists of 27 items on 4 subscales: (1) Free Will (FW; 7 items, alpha = 0.70), (2) Scientific Determinism (SD; 7 items, alpha = 0.69), (3) Fatalistic Determinism (FD; 5 items, alpha = 0.82), and (4) Unpredictability (UNP; 8 items, alpha = 0.72). The scales are minimally intercorrelated (all r < .20 in absolute value) and only weakly related to the Big Five (largest r is between Neuroticism and Fatalistic Determinism, r = .22, p < .01; see Table 4 of Paulhus and Carey 2011). Paulhus and Carey’s Scientific Determinism scale includes two items concerning genetic/biological determinism (SD-BIO), two items concerning environmental/psychosocial determinism (SD-ENV), and three items concerning general determinism (SD-GEN). We have added eight new items (three for SD-BIO, three for SD-ENV, and two for SD-GEN) to create three 5-item subscales. In generating the items, we have attempted to make the Genetic and Environmental subscales as parallel as possible by mirroring the wording of the questions closely. For example, a new item added for SD-BIO, “The genes you inherit will determine your success as an adult”, is formulated as a direct parallel to the preexisting Paulhus and Carey (2011) SD-ENV item: “Childhood environment will determine your success as an adult”. The current study represents the first pilot testing of these new items. Alpha reliabilities for the new 5-item subscales in the pilot sample are shown in Table 2. All items are keyed positively, and responses are given on a 5-point scale with each point anchored (Strongly disagree = 1, Disagree = 2, Neither agree nor disagree = 3, Agree = 4, Strongly Agree = 5).
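As a rough illustration of how a subscale mean and an alpha reliability are obtained from positively keyed 1–5 responses, consider the Python sketch below. It uses the standard formula for Cronbach's alpha, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The response matrix `toy` is hypothetical and the function names are ours; the reported alphas were computed in SPSS and R, not with this code.

```python
from statistics import mean, variance


def subscale_mean(responses):
    """Mean of one participant's 1-5 item responses on a subscale."""
    return mean(responses)


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of participants, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across participants.
    item_vars = [variance([person[i] for person in item_scores]) for i in range(k)]
    # Sample variance of the participants' total scores.
    total_var = variance([sum(person) for person in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses from six participants to a 5-item subscale (not study data).
toy = [
    [4, 4, 5, 4, 4],
    [2, 3, 2, 2, 3],
    [5, 5, 4, 5, 5],
    [3, 3, 3, 4, 3],
    [1, 2, 1, 2, 2],
    [4, 3, 4, 4, 3],
]
print([subscale_mean(person) for person in toy])
print(round(cronbach_alpha(toy), 2))
```

Because every item is keyed positively, no reverse-scoring step is needed before averaging.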

The Free Will Inventory (FWI; Nadelhoffer 2014) consists of three 5-item scales: (1) Free Will (FW), (2) Determinism (DET), and (3) Dualism/Anti-Reductionism, which was not used in the final survey. Our sample yielded alpha reliabilities of 0.87 and 0.82 for FW and DET, respectively. All items are keyed and anchored identically to the FAD+ items.

The Belief in Genetic Determinism scale (BGD; Keller 2005) consists of 18 items in its full scale; we chose to include the 10 with the highest factor loadings in order to make it as parallel as possible to the Belief in Social Determinism scale so that one view of determinism (environmental/social versus biological/genetic) would not be over- or underrepresented in the number of items included in the survey. Our 10-item truncated scale produced an alpha reliability of 0.85 (Table 2). All items are scored on a 5-point “agreement/disagreement” scale.

The Belief in Social Determinism scale (BSD; Rangel and Keller 2011) consists of 12 items in its full scale; again, we have chosen to include the 10 with the highest factor loadings to make it parallel to the BGD scale. Our 10-item truncated scale produced an alpha reliability of 0.80 (Table 2). All items are scored on a 5-point “agreement/disagreement” scale.

Current affairs and religion

In addition to the single demographic question asking political orientation, three scales were used as measures of political, religious and social attitudes:

Authoritarianism (SL-A; Duckitt et al. 2010) consists of 18 items that represent three different facets; this is the “short form” of the 30-item measure. As presented, items 1–6 represent Authoritarian Submission; 7–12 represent Conventionalism; 13–18 represent Authoritarian Aggression.

Egalitarianism (SL-EG; Feldman and Steenbergen 2001; Feldman 1988) consists of a combination of eight items from two similar scales. Five of these items were used in a Minnesota Twin Registry survey from 2008. The additional three items were included to augment coverage of “equality of opportunity” rather than “equality of outcome,” which the 2008 measure has been criticized for lacking.

Religiousness consists of nine items drawn from assessments used for the Minnesota Center for Twin and Family Research. The items are straightforward questions that ask about the frequency and importance of a variety of behaviors related to observance of religious holidays, reading religious texts, and salience of religious activity in family and everyday life.

Genetics literacy and lay estimates of genetic influence on traits

Genetics literacy was assessed from a total of 10 items, eight selected from the Public Understanding and Attitudes towards Genetics and Genomics (PUGGS; Carver et al. 2017) questionnaire and two items used to assess public knowledge on genetics and genetic testing from Haga et al. (2013). The full-scale PUGGS questionnaire was constructed to assess the knowledge of college students about genetics and genomics, and consists of 45 items developed and reviewed by international experts from genetics, education, and other fields.

Lay estimates of genetic influence on traits (LEGIT) were assessed from an adaptation of a section of the PUGGS questionnaire called the “table of traits”. We chose to revise the original PUGGS section in order to give more balanced coverage of physical, medical, and behavioral traits, as well as to limit the surveyed traits to those with some significant coverage in the behavior genetic literature. Our revised survey section is presented as a table of 21 human traits, including representatives of normal and abnormal physical traits and normal and abnormal psychological traits, with the following instructions:

People vary in traits (physical features, behaviors, diseases and disorders) such as those shown in the table below. Both genetic factors and environmental factors contribute to differences among people. Environmental factors can for example include culture, upbringing, eating habits and exposure to pollution. For each of the characteristics below, indicate to what extent you think genetic and environmental factors contribute to differences among people.

The original PUGGS table included 17 items plus one example item (eye color); we dropped four of the original items (interest in fashion, addiction to gambling, asthma, and religious beliefs) and added seven new items (obesity, personality, blood pressure, athleticism, heart disease, musical talent, and sexual orientation). The wording of two PUGGS items was changed (“intelligence in adults” became “intelligence”, and “severe depression” became “depression”, in the current survey), and the remaining 12 were kept as-is (eye color, blood group, color blindness, height, bipolar disorder, schizophrenia, attention deficit/hyperactivity disorder (ADHD), breast cancer, diabetes, alcoholism, violent behavior, and political beliefs) for a total of 21 items. Responses are keyed on a 1 to 5 scale (Only environmental factors = 1, mainly environmental factors = 2, Genetic and environmental factors contribute roughly the same = 3, mainly genetic factors = 4, Only genetic factors = 5).
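The response keying can be made concrete with a short Python sketch that maps the five response labels to their 1–5 scores and summarizes one trait by its mean and standard deviation. The answers in `toy_answers` are hypothetical, and the names `LEGIT_KEY` and `score_trait` are ours, introduced only for illustration.

```python
from statistics import mean, stdev

# Response labels and their 1-5 keys, as described in the text.
LEGIT_KEY = {
    "Only environmental factors": 1,
    "mainly environmental factors": 2,
    "Genetic and environmental factors contribute roughly the same": 3,
    "mainly genetic factors": 4,
    "Only genetic factors": 5,
}


def score_trait(labels):
    """Convert response labels for one trait to scores; return (mean, SD)."""
    scores = [LEGIT_KEY[label] for label in labels]
    return mean(scores), stdev(scores)


# Hypothetical answers from five respondents for a single trait (not study data).
toy_answers = [
    "Only genetic factors",
    "mainly genetic factors",
    "Only genetic factors",
    "Genetic and environmental factors contribute roughly the same",
    "mainly genetic factors",
]
m, sd = score_trait(toy_answers)
print(round(m, 2), round(sd, 2))
```

Higher scores therefore indicate a larger perceived genetic contribution, with 3 marking an even split between genetic and environmental factors.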

Analysis

All statistical analyses were conducted in SPSS and R. Reported subscale measures are all mean values of scores unless otherwise indicated. Due to the large number of variables and generally small effect sizes, significance is established at p < .01 unless otherwise indicated.

In evaluating LEGIT responses for explanatory clustering, varimax-rotated principal component analysis was used to highlight any possible similarities among responses on 21 traits. This use of principal components analysis is analogous to its use in population genetics to cluster individuals into populations (Patterson et al. 2006). Some prefer to use factor analysis for this purpose of clustering items, but the choice between principal components and factor analysis is unlikely to alter substantive conclusions about the data (Velicer and Jackson 1990).
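To make the rotation step concrete, the sketch below implements raw varimax rotation in plain Python using Kaiser's pairwise planar rotations. It is a simplified stand-in, not the SPSS or R routine used for the analyses: it omits the extraction of principal components and any Kaiser normalization of the rows, and the loading matrix `unrotated` is hypothetical.

```python
import math


def varimax_criterion(loadings):
    """Raw varimax criterion: summed variance of squared loadings per component."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        sq = [row[j] ** 2 for row in loadings]
        total += sum(s * s for s in sq) / p - (sum(sq) / p) ** 2
    return total


def varimax(loadings, max_sweeps=100, tol=1e-10):
    """Rotate a p x k loading matrix with Kaiser's pairwise planar rotations."""
    L = [list(row) for row in loadings]
    p, k = len(L), len(L[0])
    for _ in range(max_sweeps):
        largest = 0.0
        for a in range(k - 1):
            for b in range(a + 1, k):
                u = [row[a] ** 2 - row[b] ** 2 for row in L]
                v = [2.0 * row[a] * row[b] for row in L]
                A, B = sum(u), sum(v)
                C = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                D = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
                # Angle that maximizes the criterion for this pair of columns.
                phi = 0.25 * math.atan2(D - 2.0 * A * B / p, C - (A * A - B * B) / p)
                cos_p, sin_p = math.cos(phi), math.sin(phi)
                for row in L:
                    x, y = row[a], row[b]
                    row[a] = x * cos_p + y * sin_p
                    row[b] = -x * sin_p + y * cos_p
                largest = max(largest, abs(phi))
        if largest < tol:  # stop once a full sweep no longer rotates anything
            break
    return L


# Hypothetical unrotated loadings for four items on two components (not study data).
unrotated = [[0.6, 0.5], [0.7, 0.4], [0.5, -0.6], [0.6, -0.5]]
rotated = varimax(unrotated)
for row in rotated:
    print([round(value, 3) for value in row])
```

Because the rotation is orthogonal, each item's communality (the sum of its squared loadings) is unchanged; only the distribution of loadings across components shifts, which is what makes clusters of items easier to read off.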

Results

Lay estimates of genetic influence on traits form distinct clusters

Lay estimates of genetic influence on traits (LEGIT) were similar across both samples (Table 3). A chi-square test of independence was performed across the two samples for each trait to assess any significant differences in the frequency distribution of responses. Response frequencies did not differ significantly between the two samples for any trait.

Table 3 Mean and standard deviation for the lay estimate of genetic influence on 21 human traits

Participants in both samples rated differences in height, eye color, blood group, and colorblindness as the traits most strongly influenced by genetic factors, with means falling above 4.0 and the highest for eye color (combined sample M = 4.65, SD = 0.71). Political beliefs were estimated to have the smallest genetic contribution in both samples, and were the only surveyed trait whose mean fell below 2.0 (combined sample M = 1.70, SD = 0.86). Sexual orientation and political beliefs had the largest spread in both samples, with standard deviations approaching or exceeding 1.0.

Due to the similarity of means and variance across both samples, the combined sample (N = 1041) was used for evaluating LEGIT intercorrelation. Responses across traits were moderately intercorrelated, and exploratory principal components analysis (PCA) was used to evaluate whether the data suggest the existence of distinct clusters of traits, where members of the same cluster tend to be rated similarly. A heat-map correlation matrix of each trait included in the LEGIT items is shown in Fig. 1.

Fig. 1 Heat-map correlation matrix showing the patterns of intercorrelation that emerged among lay estimates of genetic influence of 21 human traits

A scree plot suggests that the first four components of the PCA are most likely to have meaningful explanatory power for variance in LEGIT scores. Together, these four account for 48% of the variability in responses, with each accounting for a similar proportion (11–14%). The decision to limit this analysis to the first four components was based on joint evaluation of eigenvalue size, parallel analysis, optimal coordinates, and visual inspection (see Supplementary Fig. 1 for scree plot and additional justification for this decision), as well as the readily interpretable composition of these first four components once orthogonally rotated to bring their differences into focus. (Although rotation of the axes means that the resulting components are technically no longer the principal components maximizing the variance of subject scores, varimax does facilitate the identification of trait clusters. For simplicity, we will continue to use the terminology of “principal component scores” and the like.) Varimax rotation produced four intuitively related groups of traits. Height, eye color, blood group and colorblindness clustered together into a group conceptually united as physical traits of the human body. Intelligence, personality, musical talent, violent behavior, and athleticism clustered together into a psychological attribute group. Diabetes, alcoholism, obesity, blood pressure, and heart disease formed their own distinct cluster, which we labeled as lifestyle attributes. Bipolar disorder, schizophrenia, depression and attention deficit/hyperactivity disorder (ADHD) formed a clear group of psychiatric traits, with which sexual orientation clustered unequivocally (Table 4).

Table 4 Varimax-rotated component matrix showing each of the 21 traits’ association with four extracted components

Two traits did not cluster interpretably into one of these four groups. Breast cancer does not have a single predominant association with one component (loading on psychiatric trait component: 0.32; loading on lifestyle trait component: 0.35). LEGIT scores for political beliefs are negatively associated with the physical traits component (−0.58), making this the only surveyed trait with a predominantly negative association. Regression scores for these four components were estimated and assigned to each participant for use as predictor variables. Political beliefs and breast cancer were omitted from this estimation due to the lack of objective interpretability of their associations. Removing these two traits from the PCA changed each trait’s association with the four components very little, and boosted total explained variance from 48 to 51% (see Supplementary Table 1 for revised rotated component matrix with political beliefs and breast cancer removed).

Lay estimates of genetic influence and social attitudes weakly predict beliefs about agency

Beliefs about free will and determinism were moderately interrelated. Participants surveyed on the free will and determinism scales (Sample 1, N = 301) generally had higher mean scores on the two measures of free will than on any of the measures of determinism or essentialism, indicating a greater tendency to endorse items that reflected the belief in agency and efficacy (Table 5). The two free will scales (FWI: FW, FAD+: FW) were significantly correlated, and the subscale measures of determinism were generally positively related to one another and negatively related to measures of free will. The largest correlations among determinism scales were observed between BGD (genetic determinism) and FAD+: biological determinism, between BGD and FAD+: scientific determinism, and between FAD+: fatalistic determinism and FWI: determinism. All significant associations between measures of free will and determinism were negative and weak.

Table 5 Descriptive statistics and matrix of correlations among each surveyed measure of free will and determinism (N = 301)

Free will and determinism scales related modestly but significantly to distinct clusters of lay estimates of genetic influence on traits (LEGIT) scores (Table 6). Both free will scales (FAD+ and FWI) were positively associated with high LEGIT scores exclusively on the physical trait factor, and both primary determinism scales were negatively related to this factor. The psychological trait factor was positively and significantly associated with the determinism subscales of the FAD+ and the FWI and with the BGD, but not with either measure of free will. All significant relationships with trait clusters were weak to moderate, the strongest being the BGD with the psychological trait factor. Neither the psychiatric nor the lifestyle LEGIT factor related significantly to any subscale measure of free will or determinism.

Table 6 Correlations and p-values of select measures of free will and determinism with the four components of genetic influence estimates

Social attitudes were surveyed in the domains of political orientation, authoritarianism, egalitarianism, and religiosity. Political orientation was assessed on a 1–5 scale in a single item across both samples (N = 1041). Overall, participants were more likely to identify as liberal than conservative (M = 3.34, SD = 1.14). Scores on authoritarianism and egalitarianism correlated with one another at r = –.64 (p < .01), with political orientation (authoritarianism r = –.65 [more conservative], egalitarianism r = .64 [more liberal], both p < .01), and with religiousness (authoritarianism r = .51, egalitarianism r = –.21, both p < .01).

Table 7 provides an overview of the association of each measure of social attitude with measures of free will, determinism, and the four LEGIT factors. Significant relationships are all generally small. Though modest, the association between measures of free will and both religiousness and authoritarianism is consistent and positive, with higher authoritarianism scores and higher religiousness scores predicting stronger beliefs in free will. Belief in genetic and social determinism (BGD and BSD) correlated with authoritarianism in opposite directions, with higher authoritarianism predicting higher BGD scores. The relationships of political ideology and egalitarianism with measures of free will and determinism were generally weaker than the comparable correlations for authoritarianism (Table 7).

Table 7 Correlations and p-values of political ideology, authoritarianism, and egalitarianism with free will and determinism subscales and components of lay estimates of genetic influence on traits

Factor scores for both the psychiatric and psychological trait clusters correlated moderately with political ideology, with liberals more likely to endorse high genetic contributions to psychiatric traits and conservatives more likely to endorse the same for psychological traits. Religiousness had a unique negative association with the psychiatric factor. Though modest in size, the pattern that emerges is consistent: Conservative and authoritarian (and to a lesser extent religious) attitudes tend to be more strongly associated with stronger beliefs in free will, higher estimates of the genetic influence on psychological traits, and lower estimates of the genetic influence on psychiatric traits. Liberal and egalitarian attitudes are typically associated with the opposite. Supplementary Fig. 2 provides a visual representation of the comparably strong political influence on the psychiatric and psychological clusters by comparing standardized factor scores.

Political orientation was also found to have a unique influence on one LEGIT trait in particular: sexual orientation, with liberals estimating significantly greater genetic influence. The relationships between each trait and political orientation, authoritarianism, and egalitarianism are explored further in the supplementary material (Supplementary Table 2), as well as in an independent-samples t-test between all conservatives and liberals (Supplementary Table 3). This latter comparison produced the largest significant difference in mean estimates of sexual orientation (t(772) = − 8.57) of all surveyed traits, supporting the finding that conservative and liberal estimates of the genetic influence on this trait are greatly divergent.

Though the varied and sometimes nonintuitive relationships among these complex constructs may appear to further complicate the question of free will and determinism, certain patterns do emerge when multiple regression is employed. With predictors including all demographics, scores of authoritarianism, egalitarianism, religiosity, and scores on the four trait factor groups, free will (both scales) is significantly (p < .01) predicted only by years of education (β = − 0.23), authoritarianism (β = 0.55), and the physical LEGIT component (β = 0.51), with an overall (adjusted) model fit of R² = .23. For deterministic beliefs (both main scales), the strongest significant predictors were the physical trait component (β = − 0.35) and the psychological trait component (β = 0.26), with weaker influences of age (β = − 0.24) and the lifestyle trait component (β = 0.19), and with an overall (adjusted) model fit of R² = .18. Religiosity and egalitarianism were nonsignificant for both outcomes, as were the psychiatric trait component and all other demographics listed in Table 1.

How accurate are lay estimates of genetic influence on traits?

The field of behavior genetics has generated empirical heritability estimates for most of the traits surveyed, enabling a novel exploration of accuracy in lay estimates of genetic influence on these traits. This allows us to investigate two important questions. First, what is the correspondence between lay estimates and published heritability estimates for these traits, most of which have been studied directly in large twin samples and meta-analyses? Second, what variables are significantly associated with individual differences in accuracy of these assessments?

In Table 8, the mean estimate of genetic influence for each trait in the combined sample (N = 1041) is displayed alongside the estimate of heritability for the comparable trait in published behavior genetics literature. LEGIT scores for each participant on each surveyed trait were transformed to the same scale as the published estimates, where 0 for heritability variance is equivalent to “only environmental factors” (a 1 on the survey) and 1 is equivalent to “only genetic factors” (a 5 on the survey). “Genetic and environmental factors contribute roughly equally” (a 3 on the survey) is taken to be a functional equivalent to stating that 50% of the variance in a trait is due to genetic factors, and this is converted to an estimate of 0.50 on a “heritability” scale. Most published estimates are taken from the 2015 meta-analysis by Polderman et al., which documents the results of 50 years of twin studies on over 17,000 separate traits. The name of the comparable trait used in the published literature is displayed alongside the name used in the LEGIT table of traits from the current study. In several cases including political beliefs, violent behavior and obesity, the meta-analytic heritability estimate represents a broader category of related traits for which a heritability estimate was provided in the meta-analysis. This estimate is considered superior to individual estimates for a specific trait, which might suffer from smaller independent samples. For example, since few twin studies on political orientation have been conducted, the category was chosen that included this trait in the meta-analysis (“societal attitudes”). A broad category also allows for a looser interpretation of the definition of the trait by lay people. Since many people may not have a sense of the psychometric definition of intelligence, for example, the category “higher-level cognitive functions” may align more closely with a lay understanding. 
One trait did not have a recent heritability estimate available (colorblindness) and was omitted from accuracy analyses.
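The rescaling described above can be sketched in a few lines. The text fixes only three anchor points (1 maps to 0, 3 to 0.50, and 5 to 1); the sketch below assumes the mapping is linear in between, and the function name `likert_to_heritability_scale` is hypothetical rather than taken from the study's analysis code.

```python
def likert_to_heritability_scale(rating: float) -> float:
    """Map a 1-5 survey rating onto the 0-1 "heritability" scale.

    Assumption: the conversion is linear, which reproduces the three
    anchors given in the text (1 -> 0, 3 -> 0.50, 5 -> 1).
    """
    if not 1 <= rating <= 5:
        raise ValueError("rating must lie between 1 and 5")
    return (rating - 1) / 4


# The three anchor points stated in the text.
for rating in (1, 3, 5):
    print(rating, likert_to_heritability_scale(rating))
```

Under this assumption the intermediate ratings of 2 and 4 would fall at 0.25 and 0.75, respectively.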

Table 8 Mean values for LEGIT (lay estimate of genetic influence on traits) for each of the 21 traits surveyed in a sample of 1041 participants and their associated published heritability estimates

The correlation between lay estimates and published estimates is 0.77 (Fig. 2; see Footnote 1), making it among the largest of all associations found in the dataset. Lay estimates of some traits aligned much more closely with published estimates than others. Although lay estimates of genetic influence on physical traits (eye color, height, and blood group) tended to be highly accurate, estimates of behavioral traits (including musical talent, alcoholism and personality) also were among those most closely aligned with their published counterparts. Sexual orientation represents the largest discrepancy between lay and published estimates, with most participants overestimating the genetic contribution with respect to the published literature. Removing this trait from the correlation boosts it to 0.84. Other notable discrepancies include breast cancer, which was generally overestimated, and obesity, which was underestimated.
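To illustrate why removing a single discrepant trait can raise the correlation, the sketch below computes a Pearson correlation across trait-level means with and without one outlying trait. All numbers are invented toy values for illustration, not data from Table 8, and the helper `pearson_r` is a hypothetical name.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Toy trait-level means on the 0-1 scale (hypothetical, NOT study data).
# The last trait is deliberately overestimated by the lay raters.
lay_means = [0.90, 0.80, 0.60, 0.50, 0.70]
published = [0.95, 0.80, 0.55, 0.45, 0.30]

r_all = pearson_r(lay_means, published)
r_without = pearson_r(lay_means[:-1], published[:-1])
print(f"all traits: r = {r_all:.2f}; outlier removed: r = {r_without:.2f}")
```

With only about 20 traits contributing to the correlation, one trait far from the regression line has considerable leverage, which is consistent with the change from 0.77 to 0.84 reported above.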

Fig. 2 Scatterplot and associated regression line of mean lay estimates of genetic influence for 20 human traits (excepting colorblindness; converted from 1 to 5 Likert scale to 0–1 variance scale) in Mechanical Turk sample of 1041 participants, along with published estimates of heritability for these traits from meta-analyses and large-scale twin studies. The correspondence between lay estimates and published estimates is r = .77 (r² = .59). Points are color-coded to their group membership according to the results of a four-factor solution of all responses

Why are some people more accurate than others?

Individual differences in accuracy were readily apparent. Once converted to the 0–1 “heritability” metric, lay estimates of genetic influence can be used as an individual-level measure of accuracy by subtracting each participant’s score on each trait from that trait’s published estimate. The absolute value of this difference score can therefore be used to represent the distance of each participant’s estimate from the empirical estimate. Averaging this distance across all 20 traits (omitting colorblindness) can then function as a single variable to roughly capture the accuracy of each participant’s view of the genetic influence on human traits across the surveyed domains. The mean, standard deviation and range of the difference scores for each trait are shown in Table 9, and the mean absolute difference score of all surveyed traits together is 0.18 (SD = 0.05). (For a full comparison of difference scores with their directionality preserved, thus indicating over- and underestimation of each trait’s genetic influence, see Supplementary Fig. 1.)
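The accuracy index described in this paragraph amounts to a mean absolute difference per participant. A minimal sketch follows, assuming each participant's ratings have already been converted to the 0–1 scale; the trait labels, the numeric values, and the function name `accuracy_index` are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean
from typing import Mapping


def accuracy_index(lay: Mapping[str, float], published: Mapping[str, float]) -> float:
    """Mean absolute distance between one participant's 0-1 estimates
    and the published estimates, over traits present in both mappings.

    Lower values indicate estimates closer to the published figures.
    """
    shared = [trait for trait in lay if trait in published]
    if not shared:
        raise ValueError("no traits in common")
    return mean(abs(lay[trait] - published[trait]) for trait in shared)


# Hypothetical values for illustration only (not taken from Table 8).
published_estimates = {"trait_a": 0.80, "trait_b": 0.50, "trait_c": 0.30}
participant = {"trait_a": 0.75, "trait_b": 0.25, "trait_c": 0.50}

index = accuracy_index(participant, published_estimates)
print(f"accuracy index: {index:.3f}")
```

Because the index discards the sign of each difference, it cannot distinguish overestimation from underestimation; the signed differences would be needed for that comparison.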

Table 9 Range, mean, and standard deviations for absolute difference scores on genetic influence for each trait (N = 1041) and correlations of both essentialism scales with these difference scores (N = 301)

These mean indices of accuracy across participants have significant associations with several measured variables. While the accuracy index does not correlate significantly with any FAD+ or FWI subscale of free will (FWI: p = .71; FAD+: p = .77) or determinism (FWI: p = .54; FAD+: p = .98), it does correlate modestly with both essentialism scales: genetic (BGD) and social (BSD) determinism at r = –.12 (p = .03) and r = –.13 (p = .02), respectively, indicating that stronger genetic and social deterministic beliefs are associated with less distance from the mean empirical heritability for these traits. Table 9 shows the association of BGD and BSD with each accuracy index score across traits as well as with the overall mean accuracy score. While the significant association of mean accuracy with BSD is due to an overall pattern of weakly negative correlations, the association with BGD is due to fewer, stronger ones, especially for obesity (r = –.19), intelligence (r = –.17), personality (r = –.21) and violent behavior (r = –.28), all p < .01 (Table 9). In a regression model predicting overall accuracy, BGD and BSD together explain only 3% of the variance (p = .01).

Though overall mean accuracy scores do not correlate significantly with political orientation (p = .19), accuracy scores on four individual traits do. Schizophrenia (r = –.10, p < .01), diabetes (r = –.08, p < .01), and alcoholism (r = –.10, p < .01) all have weak negative associations with political orientation, meaning that conservatives tend to be slightly more inaccurate than liberals on these particular traits. Accuracy on the fourth trait, sexual orientation, is negatively associated with a more liberal orientation, and the association of its distance score is considerably larger at r = .22 (p < .01). The size of this accuracy difference is large enough to change the overall correlation between published and lay estimates of genetic influence from r = .77 in the full sample to r = .81 for all conservatives and r = .73 for all liberals, meaning that the lay estimates of conservatives align more closely with published estimates than those of liberals when sexual orientation is included in the table of traits (Supplementary Fig. 2). The difference in accuracy scores between “very liberal” and “very conservative” individuals, as well as how those of moderates compare, can be seen for every trait in a series of bar graphs comprising Supplementary Fig. 3. For sexual orientation, all political groups overestimated the known proportion of genetic influence on this trait, though “very liberal” individuals did so the most and “very conservative” individuals the least.

In addition to scores on the BGD and BSD scales of belief in determinism, several surveyed demographic variables significantly predict higher accuracy in overall estimates of genetic influence. These include genetic literacy scores (r = –.17, p < .01), years of education (r = –.12, p < .01), number of children (r = –.13, p < .01) and age (r = –.11, p < .01). The negative direction of these correlations indicates that higher values on all four of these qualities predict greater accuracy in estimates (less distance from published estimate). Gender is also associated with accuracy: women are significantly more accurate in their predictions of genetic influence on traits than are men, with a mean distance from the published estimates of 0.17 for women (SD = 0.05) and 0.19 (SD = 0.05) for men (t(1036) = − 4.8, p < .01). While this mean difference is small, women are more accurate on 17 out of the 20 traits, significantly (p < .01) for five traits and marginally for another five (p < .05). Men are not significantly or marginally more accurate on any surveyed traits. The largest significant (p < .01) differences in accuracy between men and women are for diabetes (t = − 3.4; see Footnote 2), obesity (t = − 4.6; see footnote 4), athleticism (t = − 4.5), ADHD (t = − 2.6), and musical talent (t = − 3.5). A multiple regression model that includes age, gender, education and number of children as predictors helps clarify which of these remain meaningful when controlling for others. (Genetic literacy scores were not included because they were not assessed in the second sample.) This model renders the effect of age nonsignificant (p = .17), but retains marginal to significant effects of years of education (p = .04), number of children (p = .02), and especially gender (p < .01). Gender, education and number of children jointly explain 5% of the variance in accuracy (p < .01). Do mothers of multiple children have the greatest accuracy overall? 
A two-way factorial ANOVA was conducted to compare the main effects of gender and number of children (none, one, and two or more) on mean LEGIT accuracy, and any possible interaction. The results support a significant effect of gender on accuracy, favoring women (F(1, 732) = 9.56, p < .01), a marginal effect of number of children (F(2, 732) = 4.18, p = .02), and no evidence for an interaction between them (p = .47) (Fig. 3). Together, the results of multiple regression and ANOVA across categories support the finding that educated mothers of multiple children are significantly more accurate than others in predicting the genetic influence on a number of human traits.

Fig. 3 Comparison of absolute mean difference scores in accuracy of lay estimates of genetic influence across surveyed human traits for men and women with no children, one child, or two or more children. These estimates represent the magnitude of distance between each participant’s estimate and the published estimate on a 0 (only environmental factors) to 1 (only genetic factors) scale. Lower mean distance (y-axis) therefore represents more accurate mean estimates across all traits. Error bars represent ± standard error of the mean, and sample size for each group is displayed above each bar

Discussion

The aim of the present study was to add a novel assessment of beliefs about trait heritability to the ongoing debate about free will and determinism from a uniquely behavioral genetic perspective. In doing so, we have collated a corpus of beliefs and assumptions of individual Americans about genetic and environmental contributions to a variety of human traits, and have discovered that beliefs about genetic influence on these traits tend to cluster into four factors comprising physical, psychological, psychiatric, and lifestyle-oriented traits. That these categories are significantly associated with measures of free will and determinism, as well as with outcomes such as political ideology, is informative as to how the general public forms opinions about empirically established genetic relationships. While this correlational study does not permit causal inference, our findings are consistent with a complex and multifaceted origin and development of these beliefs. This provides a strong foundation for further probing these questions in a forthcoming study, using a sample of adopted siblings and their parents, with the ultimate goal of understanding the roles of parental influence, home environment, and genetic makeup in the formation and development of beliefs about free will and the genetic and environmental contributions to human behavior. This novel, genetically-sensitive angle will represent the first empirical investigation into the heritability of beliefs about heritability.

The current investigation found that multiple surveyed measures of free will and determinism are significantly related to the magnitude of scores comprising the physical trait factor of lay estimates of genetic influence. Both free will subscales (FAD+ and FWI) correlated positively with physical trait factor scores (r = .32 and r = .25, respectively; both p < .01), while the determinism subscales tended to correlate in the negative direction with roughly equal magnitude (Table 6). The only other significant association between the FAD+ and FWI scales with the trait factors is that of the determinism subscales (FAD+: SD, FWI: DET) with the psychological trait factor, at r = .22 and r = .17, respectively (p < .01). The key pattern to emerge is that belief in the existence of free will is related significantly to beliefs about high genetic contributions to purely physical traits, such as eye color and height, relative to other categories of traits, but belief in a deterministic universe is associated with stronger beliefs about the genetic contribution to psychological traits, such as intelligence and personality. A significant correlation between the FAD+ determinism scale and beliefs about the heritability of behavior might be expected given that some of the FAD+ items reflect belief in the biological determinism of behavior (e.g., “Parents’ character will determine the character of their children”). The FWI determinism scale, however, deals only with a philosophical conception of determinism (e.g., “Given the way things were at the Big Bang, there is only one way for everything to happen in the universe after that”). It seems that individuals who endorse the existence of a deterministic universe have a tendency to view genetic contributions to psychological traits as a way to account for behavior consistent with this worldview, although admittedly the correlations here are modest in magnitude.

The weak negative correlations between endorsement of free will and endorsement of determinism (r = –.14 to –.18 across all pairs of scales) offer support for the conclusion of Chan et al. (2016) that there is little empirical basis for the existence of a “compatibilist” or “incompatibilist” orientation of individuals towards questions of free will. Keller’s genetic and social determinism scales are particularly interesting because, among the measures of agency, they alone relate to the accuracy of lay estimates of genetic influence on traits. It is unsurprising that the Belief in Genetic Determinism scale (BGD) is a strong predictor of high ratings of the genetic contribution to psychological traits (r = .54, p < .01), largely because many of the items in the scale are directly assessing attitudes about the heritability of psychological traits; e.g., “Intelligence is a trait that is strongly determined by genetic predispositions” and “Many talents that individuals possess can be attributed to genetic causes” (Keller 2005). It is more surprising that the BGD is one of only two free will/determinism-related measures that also predict accuracy (the other being the BSD), with marginal mean associations with accuracy and the strongest individual predictions on accuracy for violent behavior, personality, obesity and intelligence (Table 9). Our data suggest that individuals who score low on the BGD scale, rather than holding a more enlightened view about the role of genetics in behavior, tend to discount or be unaware of empirical evidence on the heritability of behavioral traits. While Keller characterizes high scores on genetic determinism as being associated with “prejudice and in-group bias”, it is worth noting that no measure of free will or determinism predicted accuracy as well as the BGD, which was only weakly associated (positively) with authoritarianism and (negatively) with egalitarianism in our sample. 
More generally, belief in the existence of free will and an endorsement of higher or lower genetic or environmental contributions to particular traits in different categories show no consistent relationship (and often none at all). Therefore, no necessary connection appears to obtain between these factors in the minds of our participants, which lends further support to the conclusion that the distinction between compatibilism and incompatibilism does not capture actual patterns of variation in doxastic commitment in the broader population.

The association of political ideology with the magnitude of heritability estimates is also deserving of attention. That political ideology is associated with beliefs about and trust in scientific topics has been documented for climate change (McCright and Dunlap 2011), evolution (Nadelson and Hardy 2015), and vaccinations, nuclear power, and genetically modified organisms (Hamilton 2015). While the present study is likely the first to document the relationship between political orientation and opinions on the heritability of specific human traits, many of the significant outcomes seem to be in line with the results of related research. Conservatives are significantly more likely to believe that correlates of success and achievement, like intelligence and musical ability, have a greater genetic component, while liberals are more likely to think the same for traits that have been historically stigmatized, like psychiatric disorders and sexual orientation. Similar findings were documented by Suhay and Jayaratne, who suggested that these differences in attribution may result from the tendency of political ideologues to “endorse genetic explanations where their policy positions are bolstered by ‘naturalizing’ human differences” (Suhay and Jayaratne 2013). The current study replicates their finding that conservatives tend to endorse genetic explanations as causes of socioeconomic differences (intelligence, violent behavior, etc.) and liberals tend to endorse the same for sexual orientation. These patterns are consistent with research suggesting that moral judgments are a hallmark of the political split in the United States, which characterizes liberals, for example, as more inclined to endorse moral foundations built around care and compassion (Graham et al. 2009; Hirsh et al. 2010). 
This tendency may in part explain why left-leaning participants are more likely to endorse a genetic explanation for psychiatric disorders, which may inspire compassion and an understanding of immutability.

Despite these ideological differences in intuitions of genetic influence, there is no association between overall mean accuracy (distance from published estimates) and political ideology, as the collection of biases that forms at each end of the political spectrum seems to balance out on the whole. Conservatives, moderates and liberals together produce a correlation between intuited and published heritability estimates that is among the strongest of any relationship found in the dataset (r = .77), indicating that even in the absence of genetic knowledge, and even if social attitudes bias individual assumptions, people’s observations and intuitions about the genetic contributions to human traits are relatively informed. This finding dovetails with work led by Celeste Condit indicating that laypeople’s ideas about heritability and genetic determinism are more nuanced than often assumed by scientists (Condit et al. 2006; Condit 2011).

This study is not without its limitations. Most of the reported significant correlations among measures of agency, social attitudes, and demographic variables are in the small to moderate range, and the large number of analyses makes it almost certain that some of these will prove to be spurious if replications are attempted. The non-representative nature of the Mechanical Turk sample likely limits the applicability of some of these modest correlations, particularly given that the sample is non-representative on some of the variables associated with outcomes (both samples are significantly more educated, more liberal, and less religious than the general population). Though the overall relationship of published estimates of heritability and lay estimates of genetic influence is robust, many of these published estimates will likely change as more traits become studied in larger samples, particularly for those whose estimates are taken from single studies rather than meta-analyses (e.g., sexual orientation). Individual estimates of “accuracy” are also tenuous because difference scores tend to inflate noise, and these are no exception. Even taken together, the few significant predictors of accuracy that were identified account for no more than 5% of the variance in accuracy.

Nevertheless, it is perhaps remarkable that accuracy was significantly predictable at all. That motherhood and education are the strongest demographic predictors of accuracy in estimates of genetic influence is consistent with the interpretation that people may develop their attitudes about nature and nurture with input from everyday observations and experience, rather than primarily from biases about social and political issues. This is also consistent with an interpretation of the finding of Crosswaite and Asbury (2018) that teachers of older children are more accurate in their beliefs about the genetics of educationally relevant behaviors: Belief in the malleability of younger children is common, but as they age this belief may be partially dispelled as the canalization of personality becomes more evident. Mothers may similarly develop more accurate perceptions as their children’s personalities emerge. While it is always possible that women with more accurate intuitions about the bases for individual differences are more likely to want children, parents, after all, have the ability to observe firsthand the results of an empirical experiment on the heritability of human traits in their own home. They can see that their children resemble them along multiple dimensions; furthermore, a parent of multiple children is able to see how the shared environment does not necessarily make them alike. Mothers may be uniquely observant of their children’s abilities, needs and attributes. Although it is clear that social and political biases are associated with the magnitude of these estimates, the best predictors of accuracy in the current sample are education and the experience of parenthood—an encouraging prospect for the public understanding of findings from behavior genetics.