Introduction

In modern, low fertility societies, variation in completed fertility is associated with genotypic differences between individuals (Byars et al. 2010; Harden 2014; Kirk et al. 2001; Mills and Tropf 2016; Milot et al. 2011; Pettay et al. 2005; Rodgers et al. 2000, 2001, 2003; Tropf et al. 2016; Zietsch et al. 2014). What links variation in completed fertility, a phenotype under immense social and evolutionary pressure, with standing genetic variation? One explanation is that genetically influenced characteristics influence fertility to different extents or in different ways across context, reducing the winnowing effect of selection or activating novel genetic influences. For example, when individuals have greater freedom to pursue fertility patterns unrestricted by social or economic constraints, a greater proportion of variation in fertility is associated with genetic influences (Bras et al. 2013; Briley et al. 2015; Kohler et al. 1999; Tropf et al. 2015a, b). These results (i.e., greater heritability in socially tolerant or economically prosperous environments) indicate that features of the socio-cultural context interact with the expression of genetic influences on fertility. Udry (1996) predicted this effect for low fertility societies. He argued that as social norms and control over fertility weakened (e.g., DeLamater 1981; Lesthaeghe 2010; Lesthaeghe and van de Kaa 1986; van de Kaa 1987), genetically influenced individual differences would become increasingly linked with the phenotypic expression of fertility. Yet, several potential behavioral mechanisms eliciting the heritability of fertility are found in the literature. Here, we empirically contrast explanations centering on family formation timing, educational attainment, and psychological traits within two large, genetically informative samples of adults. Following previous work (e.g., Rodgers et al. 
2001), we anticipated that fertility precursors, such as fertility timing (i.e., age at first marriage and first birth), would be able to account for the majority of genetic influences on completed fertility, and that genetically influenced psychological characteristics could account for variation in completed fertility, potentially indirectly through influences on fertility timing.

Explanations for Genetic Influences on Completed Fertility

Generally, genetic influences on completed fertility may be observed if other genetically influenced phenotypes have an effect on levels of fertility. For example, preferences for family size are partially genetically influenced, and these genetic influences are shared with levels of completed fertility (Miller et al. 2010). In this example, individuals with genetic predispositions to desire large family sizes tend to have larger families, resulting in genotypic variation becoming linked to variation in completed fertility. Individuals also differ in respect to the timing of their first birth and their first marriage. These phenotypes are partially genetically influenced, and delayed family formation timing is associated with lower completed fertility (Kohler et al. 2002; Rodgers et al. 2007; Trumbetta et al. 2007). Similarly, individual differences in the pursuit of educational attainment, rather than family formation, are genetically influenced (Heath et al. 1985; Rietveld et al. 2013) and associated with lower completed fertility (Kohler and Rodgers 2003; Nisén et al. 2013). Psychological characteristics, such as personality (Eaves et al. 1990; Gurven et al. 2014; Jokela 2012; Skirbekk and Blekesaune 2014) and cognitive ability (Hopcroft 2006; Udry 1978; Van Court and Bean 1985; von Stumm et al. 2011), have also been linked to completed fertility. Variation in these phenotypes is substantially influenced by genotypic differences (Bouchard and McGue 2003).

To complicate matters, these potential explanatory phenotypes are all intercorrelated. Educational success is strongly predicted by cognitive ability (Deary et al. 2007) and less strongly but substantially by personality (Poropat 2009). Cognitive ability and personality dimensions are correlated and developmentally intertwined (Cattell 1987; Goff and Ackerman 1992). Personality is predictive of fertility intentions (Avison and Furnham 2015; Hutteman et al. 2013), marriage timing (Jokela et al. 2011), and childbearing timing (Jokela et al. 2010). Delayed fertility timing is also predicted by cognitive ability (Neiss et al. 2002) and educational attainment (Mills et al. 2011; Rindfuss et al. 1996). Moreover, many of these associations have been shown to be due to common genetic influences, further obscuring the precise mechanism linking genetic variation to completed fertility (Hagenaars et al. 2016; Harris et al. 2015; Krapohl et al. 2014; Neiss et al. 2002; Nisén et al. 2013; Rietveld et al. 2014; Wainwright et al. 2008).

Considering Lifespan Development

An alternative interpretation of the previous literature is that fertility behaviors influence family formation, educational, or psychological development. For example, Jokela et al. (2009) followed participants over 9 years and found that the experience of having a child was associated with personality change for the dimension of emotionality. A statistically significant effect was not found for either sociability or activity, two other personality dimensions measured in the study. In contrast, the authors report selection effects (i.e., of personality predicting fertility change) consistently and with larger effect sizes. Further, the longitudinal effects of fertility on personality change do not replicate consistently (e.g., van Scheppingen et al. 2016), yet longitudinal studies consistently find that early personality predicts later fertility (e.g., Hutteman et al. 2013; Jokela and Keltikangas-Järvinen 2009; Jokela et al. 2010). Additionally, basic features of the lifespan (e.g., age at first birth occurs before completed fertility) help to delineate the direction of effects. Cognitive ability, personality, and educational attainment undergo the most dramatic developmental change during the first quarter of the lifespan, largely before fertility behavior is typically observed (Barro and Lee 2013; Briley and Tucker-Drob 2014; Roberts et al. 2006; Tucker-Drob 2009; Tucker-Drob and Briley 2014). Of course, these empirical regularities do not rule out the possibility for an unexpected pregnancy to hinder educational ambitions, for example. Although we acknowledge that reverse causality is possible and difficult to distinguish with cross-sectional data, the weight of evidence supports treating family, educational, and psychological development as at least partially explanatory phenotypes for completed fertility, consistent with the large body of behavior genetic work on fertility behaviors.

Goals of the Present Study

All of the discussed genetically influenced phenotypes may offer potential explanations for the heritability of completed fertility. As strict social norms for fertility have slowly loosened to permit a variety of pathways to family formation, individuals are allowed greater freedom to pursue family sizes in line with their genetically influenced preferences, goals, and values. Unfortunately, previous studies have primarily focused on single phenotypes and not taken a multivariate approach. This limitation hinders the ability to determine whether these associations are unique or shared with other factors, a requirement for properly identifying the mechanisms of genetic influences on fertility. Furthermore, most previous genetically informative studies have focused on a single sociocultural context, typically northern Europe. It is unclear how generalizable previous findings are to regions with different political, economic, and racial/ethnic composition. The current study addresses these issues by simultaneously testing many competing accounts of the heritability of completed fertility in U.S. and U.K. samples.

Method

Participants

Fertility data from the United States were drawn from the Midlife Development in the United States Study (MIDUS), a two-wave nationally representative study of adulthood (Ryff et al. 2006). This sample (n = 7108) includes a twin subsample of monozygotic pairs (n = 354) and dizygotic pairs (n = 579). The sample reflects the diversity of the U.S. population. At the initial wave (1994/1995), participants ranged in age from 25 to 74 years old (M = 46.38 years, SD = 13.00), and the second wave took place approximately 10 years later. As described below, we made use of both waves of data to obtain complete fertility histories even for the youngest participants. To account for mean-level differences in fertility practices across birth cohort and sex, we control for age and sex in all analyses (described more fully below). For the relatively stable demographic and psychological characteristics, we used only the initial measurement wave to limit the potential effect of attrition or age-related change (e.g., Lucas and Donnellan 2011). In the full sample, a similar number of males (n = 3395) and females (n = 3632) participated. The racial composition of the sample was predominantly White (n = 5600), but participants identified as Black (n = 321), Native American (n = 37), Asian or Pacific Islander (n = 57), some other race (n = 119), and multiracial (n = 42).

Fertility data from the United Kingdom were drawn from the TwinsUK registry (Moayyeri et al. 2013). In contrast to the MIDUS data, the TwinsUK data came exclusively from White female twins, who voluntarily participated in the study. Therefore, this sample is not considered nationally representative. The data used for the current project included 744 monozygotic pairs and 940 dizygotic pairs. Participants ranged in age from 32 to 82 years old (M = 58.03 years, SD = 9.89). Phenotypic data were collected as part of on-going primary data collection for TwinsUK which began in 1992 and from behavioral questionnaires administered in 1999, 2000, and 2005.

Measures

We drew 10 variables from both datasets. A measure of completed fertility was the primary phenotype. To explain variance in completed fertility, we used measures of age at first birth, age at first marriage, educational attainment, extraversion, agreeableness, conscientiousness, neuroticism, openness to experience, and cognitive ability.

Completed Fertility

A measure of completed fertility was constructed based on the participants’ total number of biological children. In MIDUS, participants reported this information at both waves of assessment, and both sources of information were incorporated to create a single variable. The reported number of biological children may be censored by the timing of the survey for younger participants (i.e., those 34–50 years old). In the United States in 2010, over 85 % of period fertility resulted from individuals less than 34 years old, the youngest age observed in MIDUS (Human Fertility Database 2013). Additionally, 99 % of period fertility in the United States resulted from individuals under 41 years of age, and over 85 % of the MIDUS sample was over 41 years old. For the vast majority of the sample, completed fertility is known, but additional fertility may occur for a small fraction. This is an important, but minor, limitation. We explicitly test for bias introduced by censoring by omitting any censored observations. The average participant had 2.09 children (SD = 1.60).

We constructed a similar measure in TwinsUK based on responses from a number of survey waves. Information about fertility was asked in a variety of ways over several iterations of the survey materials. For example, participants were asked the year their children were born or the total number of children they had. We assigned each participant the highest completed fertility reported at the latest age. Similar to MIDUS, the TwinsUK dataset does not suffer from serious censoring. In the United Kingdom in 2010, 70 % of period fertility occurred to individuals less than 32 years old, the youngest age observed in TwinsUK (Human Fertility Database 2013). Additionally, 99 % of period fertility in the United Kingdom occurred to individuals under 41 years of age, and 95 % of the TwinsUK sample was over 41 years old. Censoring is likely to be an even smaller issue for the TwinsUK data, and again we explicitly test whether our results hold when censored observations are excluded. The average participant had 2.05 children (SD = 1.19).

Age at First Birth

A measure of age at first birth was constructed based on the participant’s age at the time their eldest child was born. In MIDUS, participants reported this information at both waves, and this information was integrated. For childless participants, their current age at the time of the survey was entered as their age at first birth, which is common practice for these right censored cases. Childless participants over 50 years of age are unlikely to have children for biological reasons. Following the precedent of previous studies (e.g., Kohler et al. 1999), age at first birth was entered as 50 years of age for childless participants over 50 in order to reduce outliers. The average participant had their first child at 28.83 years of age (SD = 9.35).

In TwinsUK, we constructed a similar variable based on responses regarding the date of birth of the participant’s eldest child, except we used an upper limit of 45 years rather than 50 years. We chose the lower limit because the TwinsUK dataset was composed entirely of females, whereas MIDUS includes male participants. Female fertility tends to decline with age at a faster rate than male fertility (Utting and Bewley 2011). In fact, we observed first births in the full sample of MIDUS between ages 46–49 (n = 147). The average TwinsUK participant had their first child at 29.22 years of age (SD = 8.30).
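The coding rule for age at first birth can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analyses: the function name and arguments are hypothetical, and the cap is applied only to childless participants, as the text describes.

```python
from typing import Optional


def code_age_at_first_birth(age_at_eldest_birth: Optional[float],
                            age_at_survey: float,
                            cap: float = 50.0) -> float:
    """Return the observed age at first birth, or a right-censored stand-in.

    Parents keep their observed value. Childless participants are assigned
    their age at the survey, capped at `cap` (50 in MIDUS, 45 in TwinsUK).
    """
    if age_at_eldest_birth is not None:
        return age_at_eldest_birth
    return min(age_at_survey, cap)


# Hypothetical participants
parent = code_age_at_first_birth(27.0, age_at_survey=60.0)
young_childless = code_age_at_first_birth(None, age_at_survey=38.0)
older_childless_midus = code_age_at_first_birth(None, age_at_survey=63.0)
older_childless_twinsuk = code_age_at_first_birth(None, age_at_survey=63.0, cap=45.0)
```

Passing the cap as a parameter keeps the MIDUS and TwinsUK rules in one function, so the two datasets differ only in the value supplied.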

Age at First Marriage

A measure of age at first marriage was constructed based on information reported at both waves of assessment in MIDUS. This variable was constructed similarly to age at first birth in that unmarried individuals were assigned their current age capped at 50 years of age. Although there is not the same sort of biological limit on age at first marriage as there is for age at first birth, the same coding was applied to maximize comparability. Further, no participants reported a first marriage after age 50. The average participant was first married at 25.87 years of age (SD = 8.15). In TwinsUK, data on marriage timing was unavailable for 621 twin pairs (37 % of the sample), leaving a total of 584 monozygotic pairs and 682 dizygotic pairs. For this subset of participants, the average age at first marriage was 23.55 years (SD = 4.96).

Educational Attainment

Participants in MIDUS reported their educational attainment at the first assessment wave. Substantial variability was observed for educational attainment. Participants obtained some grade school (n = 38), eighth grade/junior high school (n = 127), some high school (n = 516), General Educational Development (i.e., high school equivalent; n = 109), high school degree (n = 1951), 1–2 years of college (n = 1302), 3 or more years of college (n = 333), 2 year degree (n = 538), bachelor’s degree (n = 1240), some graduate school (n = 197), master’s degree (n = 487), or a professional degree (n = 257). In TwinsUK, participants obtained different levels of qualification: no or other (n = 438), clerical (n = 253), O-level 1–4 (n = 273), low vocational (n = 128), O-level 5 + (n = 290), middle vocational (n = 139), A-level (n = 127), higher vocational (n = 567), or university (n = 418).

Big Five Personality Traits

At the first MIDUS assessment wave, participants indicated the accuracy of several self-descriptive adjectives on a 4-point Likert scale ranging from not at all (1) to a lot (4). Adjectives were selected to index extraversion (“outgoing, friendly, lively, active, talkative”), agreeableness (“helpful, warm, caring, softhearted, sympathetic”), conscientiousness (“organized, responsible, hardworking, careless”), neuroticism (“moody, worrying, nervous, calm”), and openness to experience (“creative, imaginative, intelligent, curious, broad-minded, sophisticated, adventurous”). The mean response was taken, reverse coding where necessary. Internal consistency was good for extraversion (α = .78), agreeableness (α = .80), neuroticism (α = .74), and openness to experience (α = .77), but was substantially lower for conscientiousness (α = .58). The average response for extraversion, agreeableness, conscientiousness, neuroticism, and openness to experience was 3.20 (SD = 0.56), 3.49 (SD = 0.49), 3.42 (SD = 0.44), 2.24 (SD = 0.66), and 3.02 (SD = 0.53), respectively.
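The scale scoring described above (mean of the adjective ratings, with reverse coding where necessary) can be sketched as below. Which adjectives are reverse-keyed is an assumption made for illustration ("careless" for conscientiousness and "calm" for neuroticism), and the function name and responses are hypothetical.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 4  # not at all (1) to a lot (4)


def score_trait(responses: dict, reverse_keyed: frozenset = frozenset()) -> float:
    """Mean item response for one trait, flipping reverse-keyed adjectives."""
    scored = [
        (SCALE_MIN + SCALE_MAX - value) if adjective in reverse_keyed else value
        for adjective, value in responses.items()
    ]
    return mean(scored)


# Hypothetical participant; "careless" is assumed to be reverse-keyed.
conscientiousness = score_trait(
    {"organized": 4, "responsible": 3, "hardworking": 4, "careless": 1},
    reverse_keyed=frozenset({"careless"}),
)
```

On a 1 to 4 scale, reverse coding maps a response x to 5 − x, so a rating of 1 on "careless" contributes 4 to the conscientiousness mean.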

Participants in TwinsUK responded to the Ten Item Personality Inventory using a 7-point Likert scale ranging from strongly disagree (1) to strongly agree (7; Gosling et al. 2003). This very brief measure uses two items to measure each of the Big Five traits. The validity of this inventory is well-established with substantial convergent validity with longer measures (mean r = .77). Dependability coefficients (i.e., test–retest stability) are similarly high with meta-analytic estimates between .66 and .81 (Gnambs 2014). The average response for extraversion, agreeableness, conscientiousness, neuroticism, and openness to experience was 3.45 (SD = 1.58), 4.61 (SD = 1.08), 4.91 (SD = 1.09), 2.30 (SD = 1.41), and 3.80 (SD = 1.28), respectively.

Cognitive Ability

At the second assessment wave of MIDUS, participants completed the Brief Test of Adult Cognition by Telephone, an instrument designed to assess cognitive ability (Tun and Lachman 2006). This variable was only assessed at the second measurement wave (n = 3973, 56 % of original sample). In the twin subsample, there were 165 complete monozygotic pairs and 241 complete dizygotic pairs, as well as 123 incomplete monozygotic and 227 incomplete dizygotic pairs. Incomplete pairs were retained for analysis as they inform phenotypic associations. A composite was taken based on z-scores of tests of immediate word list recall, delayed word list recall, digits backwards, category fluency, number series, and backward counting. By creating a composite, this variable assesses general cognitive ability. A subset of TwinsUK participants completed measures of verbal ability, pattern recognition, and spatial working memory which were used to create a composite. These variables were only available for 409 individuals, limiting the utility of analyses using this variable. We present cognitive results from TwinsUK as tentative replications of the MIDUS results.
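A composite of z-scores can be formed as in the sketch below. It assumes complete data on every test and equal weighting, neither of which the text specifies; the function name, test labels, and scores are hypothetical.

```python
from statistics import mean, stdev


def zscore_composite(scores_by_test: dict) -> list:
    """Standardize each test across participants, then average within person.

    `scores_by_test` maps a test name to a list of scores, one per
    participant, in the same participant order for every test.
    """
    z_by_test = []
    for scores in scores_by_test.values():
        m, s = mean(scores), stdev(scores)
        z_by_test.append([(x - m) / s for x in scores])
    # zip(*...) regroups the standardized scores by participant.
    return [mean(person) for person in zip(*z_by_test)]


# Three hypothetical participants on two of the six tests
composite = zscore_composite({
    "word_list_recall": [10, 12, 14],
    "digits_backwards": [3, 5, 7],
})
```

Standardizing before averaging prevents tests with larger raw-score ranges from dominating the composite.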

Analytic Approach

Quantitative genetic methodology makes use of correlations between family members with known genetic similarity. In the classical twin design (Neale and Cardon 1992), reared together monozygotic twin pairs are compared to reared together dizygotic twin pairs to estimate different variance components. Additive genetic effects (A) index variation in a phenotype that is associated with genotypic sequence variation between individuals. Shared environmental effects (C) index variation associated with between-family differences (i.e., effects that make siblings living in the same home similar). Nonshared environmental effects (E) index variation associated with within-family differences (i.e., effects that make siblings living in the same home different, plus measurement error or other forms of measurement uncertainty, such as unreliable recall of dates).

In the classical twin design, the variance decomposition is accomplished by comparing the similarity of monozygotic twins, who share identical genetic material, with dizygotic twins, who share on average 50 % of segregating genetic material. If monozygotic twins are more phenotypically similar than dizygotic twins, this implies genetic influences on the phenotype. To the extent that twins are more similar to one another than implied by genetic influences, this is attributable to shared environmental influences. To the extent that monozygotic twins are dissimilar, this is attributable to the nonshared environment. In multivariate extensions of the classical twin design, cross-twin cross-phenotype correlations are the primary statistic of interest. If one twin’s score on a phenotype is a better predictor of the other twin’s score on a separate phenotype for monozygotic twins compared to dizygotic twins, this implies genetic influences on the covariation of the two phenotypes.
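The logic of comparing monozygotic and dizygotic similarity can be illustrated with Falconer's formulas, which convert the two twin correlations into rough estimates of A, C, and E. This is only a back-of-the-envelope sketch: the analyses reported here rely on maximum-likelihood structural models, and the correlations used below are hypothetical.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Approximate ACE proportions from twin correlations.

    A = 2 * (rMZ - rDZ): MZ pairs share twice the additive genetic
        material of DZ pairs, so doubling the gap recovers A.
    C = 2 * rDZ - rMZ:   similarity beyond what genetic effects imply.
    E = 1 - rMZ:         whatever makes MZ twins differ.
    """
    return {
        "A": 2.0 * (r_mz - r_dz),
        "C": 2.0 * r_dz - r_mz,
        "E": 1.0 - r_mz,
    }


# Hypothetical correlations: A is about .30, C about .10, E about .60
estimates = falconer_ace(r_mz=0.40, r_dz=0.25)
```

When the monozygotic correlation is more than double the dizygotic correlation, the formula for C returns a negative value, which is one informal signal that an AE or dominance model may be more appropriate than an ACE model.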

Figure 1 displays the primary analytic approach for the current study. A Cholesky model (Loehlin 1996; Neale and Cardon 1992) was used to partition the variance in completed fertility into genetic and environmental variance that is shared with the predictor variables and unique residual variance. In this context, the cross-pathways are the primary parameters of interest. If the a21 parameter is significant, this indicates that genetic influences on the predictor are shared with some of the genetic influences on completed fertility. If the c21 parameter is significant, it indicates that there are between-family influences on completed fertility that are shared with the predictor variable (e.g., childhood socioeconomic status, religious upbringing, race/ethnicity). If the e21 parameter is significant, it indicates that there are within-family influences that are common to the predictor and completed fertility. Put differently, this parameter indicates whether the sibling that is higher (or lower) on the predictor is also higher (or lower) on completed fertility, after taking genetic and shared environmental confounds into account (D’Onofrio et al. 2013). Parameters labeled with a subscript of 11 indicate genetic and environmental influences on the predictor variable. Parameters labeled with a subscript of 22 indicate residual genetic and environmental influences on completed fertility after taking into account genetic and environmental influences shared with the predictor.
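The covariance structure implied by a bivariate Cholesky model can be sketched numerically. The path values below are hypothetical and the function name is ours; the sketch shows only how the 11, 21, and 22 paths combine into expected within-person and cross-twin covariances, not how the parameters were estimated.

```python
import numpy as np


def implied_twin_covariances(a, c, e):
    """Expected covariances for (predictor, fertility) under a Cholesky ACE model.

    Each argument is a 2 x 2 lower-triangular path matrix:
    [[p11, 0], [p21, p22]].
    """
    a, c, e = (np.tril(np.asarray(m, dtype=float)) for m in (a, c, e))
    A, C, E = a @ a.T, c @ c.T, e @ e.T
    within_person = A + C + E   # phenotypic covariance matrix
    cross_twin_mz = A + C       # MZ twins share all of A and all of C
    cross_twin_dz = 0.5 * A + C  # DZ twins share half of A on average
    return within_person, cross_twin_mz, cross_twin_dz


# Hypothetical path estimates (rows: predictor, completed fertility)
a = [[0.6, 0.0], [-0.3, 0.2]]
c = [[0.3, 0.0], [0.1, 0.2]]
e = [[0.7, 0.0], [-0.2, 0.8]]
within, cross_mz, cross_dz = implied_twin_covariances(a, c, e)

shared_genetic = a[1][0] ** 2    # a21 squared: genetic variance shared with predictor
residual_genetic = a[1][1] ** 2  # a22 squared: genetic variance unique to fertility
```

In this parameterization the total genetic variance in completed fertility is the sum of the squared a21 and a22 paths, which is why a non-zero a21 together with an a22 near zero would indicate that the predictor accounts for the genetic variance in fertility.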

Fig. 1 Example Cholesky model. Parameters with subscript “11” represent variance in the predictor. Parameters with subscript “21” represent variance in completed fertility shared with the predictor variable. Parameters with subscript “22” represent unique residual variance in completed fertility. Parameters are reported for genetic effects (a), shared environmental effects (c), and nonshared environmental effects (e). Latent variables represent genetic effects on the predictor (Ap) and fertility (Af), shared environmental effects on the predictor (Cp) and fertility (Cf), and nonshared environmental effects on the predictor (Ep) and fertility (Ef). Only one member of a twin pair is represented.

It may be the case that multiple predictor phenotypes share variance with fertility. These phenotypes may share unique genetic variance in fertility or overlapping genetic variance with respect to the other phenotypes. To test for this possibility, the bivariate Cholesky model can be extended to include multiple phenotypes. In this context, interpretation of cross-paths is similar to multiple regression analysis in the sense that covariation in phenotypes is controlled. The ordering of the phenotypes is important for interpretation of the results. Phenotypes entered earlier (i.e., toward the left hand side) into the model account for variance in later variables, which can lead to faulty conclusions. For example, extraversion and agreeableness may both explain 5 % of the variance in completed fertility through shared variance, but because extraversion is entered into the model first, the effect will appear as if it is due solely to extraversion. We take two approaches to minimize this sort of error. First, we entered the phenotypes into the model based on the logical time course of phenotype development, with age at first birth entering before completed fertility. Second, we fit re-organized models to determine whether results are sensitive to phenotype ordering.
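The sensitivity to ordering can be demonstrated with a small numerical sketch. It applies a Cholesky decomposition to a hypothetical phenotypic correlation matrix for two predictors and an outcome; the same logic applies within each genetic or environmental component. All values and the function name are illustrative assumptions.

```python
import numpy as np


def variance_shares(corr, order):
    """Share of the last variable's variance attributed to each Cholesky factor.

    `order` lists variable indices in the order they enter the model;
    the outcome must come last.
    """
    idx = list(order)
    reordered = np.asarray(corr, dtype=float)[np.ix_(idx, idx)]
    loadings = np.linalg.cholesky(reordered)  # lower-triangular factor
    return loadings[-1] ** 2  # squared loadings of the outcome row


# Hypothetical correlations: extraversion (0), agreeableness (1), fertility (2)
corr = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.3],
    [0.3, 0.3, 1.0],
]
shares_extraversion_first = variance_shares(corr, order=(0, 1, 2))
shares_agreeableness_first = variance_shares(corr, order=(1, 0, 2))
```

Each predictor on its own correlates .30 with the outcome and so accounts for 9 % of its variance, yet whichever predictor enters first is credited with the full 9 % and the second with only 3 %. Swapping the order swaps the attribution without changing the total explained.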

The analytic plan flowed through four primary steps. First, we evaluated univariate variance decompositions. Second, we used the genetic and environmental effects on each demographic and psychological variable to partition variance in completed fertility in a full multivariate model. Third, we fit a reduced model using all demographic and psychological phenotypes that accounted for significant portions of variance in completed fertility to provide a more parsimonious model. Finally, we estimated the robustness of our results to possible censoring by excluding participants who had not fully completed their childbearing years. To ensure that the results were not influenced by cohort trends in fertility or sex differences, all analyses were conducted with phenotypes residualized for sex, age, age², and a sex × age interaction, as is standard in quantitative genetic analyses (McGue and Bouchard 1984). In TwinsUK, the participants were all female, meaning it was not necessary to residualize for sex effects. All models were fit using full-information maximum-likelihood estimation with Mplus statistical software (Muthén and Muthén 1998–2010).
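The residualization step can be sketched with ordinary least squares. This is an illustrative assumption about the implementation, not the authors' code: the function name is hypothetical, sex is assumed to be coded 0/1, and age is mean-centered purely for numerical stability (with an intercept in the model, centering leaves the residuals unchanged).

```python
import numpy as np


def residualize(phenotype, age, sex):
    """Residuals of a phenotype after regressing out sex, age, age^2, sex x age."""
    phenotype = np.asarray(phenotype, dtype=float)
    age = np.asarray(age, dtype=float)
    sex = np.asarray(sex, dtype=float)
    age_c = age - age.mean()
    design = np.column_stack([
        np.ones_like(age_c),  # intercept
        sex,
        age_c,
        age_c ** 2,
        sex * age_c,
    ])
    coef, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    return phenotype - design @ coef


# Simulated data for illustration only
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(25, 74, n)
sex = rng.integers(0, 2, n)
fertility = 2.0 + 0.01 * age - 0.5 * sex + rng.normal(size=n)
residuals = residualize(fertility, age, sex)
```

For an all-female sample such as TwinsUK, the sex and sex × age columns would be dropped, leaving only the age and age² terms.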

Results

Univariate Behavior Genetic Decomposition

Table 1 presents twin correlations and proportions of variance attributable to genetic, shared environmental, and nonshared environmental effects. Across all six fertility phenotypes, approximately 30 % of the variation was attributable to genetic effects, 7 % was attributable to shared environmental effects, and the remaining 63 % was attributable to nonshared environmental effects. In both datasets, variation in educational attainment was attributable to genetic effects (~39 %), shared environmental effects (~30 %), and nonshared environmental effects (~31 %). Similarly, variation in cognitive ability was attributable to genetic effects (~42 %), shared environmental effects (~9 %), and nonshared environmental effects (~49 %). Personality phenotypes routinely displayed monozygotic correlations more than double dizygotic correlations, implying an absence of shared environmental effects and possible dominant genetic effects. Consistent with a wide body of behavior genetic literature on personality (e.g., Vukasović and Bratko 2015), we focus on AE models (see Footnote 1). Genetic effects accounted for approximately 36 % of variation in personality with the remaining 64 % accounted for by nonshared environmental effects. Despite the differences in specific measures and sociocultural context, the results were similar across datasets, as were the magnitudes of the demographic effect sizes.

Table 1 Twin correlations, univariate behavior genetic decomposition, and demographic effect sizes

Multivariate Genetic and Environmental Associations

We primarily focus on multivariate models of the genetic and environmental associations among the study phenotypes as such models have improved power over bivariate models. We fit a multivariate extension of Fig. 1 in which completed fertility was the final variable entered into the model. We entered personality phenotypes into the model first, followed by cognitive ability and educational attainment, and then age at first marriage and age at first birth.

Table 2 presents results from our multivariate model applied to both MIDUS and TwinsUK data, broken down by genetic, shared environmental, and nonshared environmental components of the model. The on-diagonal elements indicate the (residual) variance in the phenotype accounted for by genetic or environmental effects. These pathways only represent total variance for extraversion as it is the first phenotype entered into the model. For all subsequent variables, the on-diagonal parameter represents residual variance that is not accounted for by genetic or environmental effects of preceding phenotypes. Results for MIDUS are presented below the diagonal, and the results from TwinsUK are presented above the diagonal. To orient the reader to the table, the genetic association between agreeableness and completed fertility is .13 in MIDUS (reading down the agreeableness column), and the same effect is .20 in TwinsUK (reading across the agreeableness row).

Table 2 Results of multivariate Cholesky model

Several results are worth noting. In both MIDUS and TwinsUK, the model indicates that there are no remaining genetic or shared environmental influences on completed fertility after taking the other phenotypes into account. In MIDUS, the majority of the genetic effect is due to age at first marriage, with other significant associations with agreeableness and conscientiousness. Genetic effects on personality, cognitive ability, and educational attainment accounted for 5 % of the variance in completed fertility independently, 8 % of the variance in completed fertility via pathways through fertility timing (i.e., indirect effects), and 17 % of the variance was independently accounted for by the fertility timing phenotypes, leaving no residual genetic variance in completed fertility. In TwinsUK, the majority of the genetic effects on completed fertility were associated with age at first birth, with additional significant associations with agreeableness and conscientiousness. Genetic effects on personality, cognitive ability, and educational attainment accounted for 11 % of the variance in completed fertility independently, 10 % of the variance via fertility timing, and 17 % of the variance was independently accounted for by fertility timing, leaving no residual genetic variance in completed fertility. In both datasets, early fertility timing, high agreeableness, and low conscientiousness were associated with greater completed fertility through genetic pathways.

Shared environmental associations were less consistent across datasets. Higher levels of educational attainment were associated with delayed age at first marriage and birth. No individual association with completed fertility was statistically significant, but jointly the shared environmental influences on the preceding phenotypes were able to fully account for the shared environmental effects on completed fertility.

Turning to the nonshared environment, approximately 41 % of the variance in completed fertility was due to unique nonshared environmental effects across both datasets. A within-family association with completed fertility was found for cognitive ability, age at first marriage, and age at first birth in MIDUS. This result indicates that the (identical) twin that happens to have higher levels of cognitive ability tends to have higher completed fertility and an earlier age at first marriage and birth. The association could be due to some omitted variable (e.g., a childhood experience that affects both phenotypes independently, such that experimentally manipulating ability would not affect fertility), a longer causal chain (e.g., cognitive ability being rewarded in the job market, leading to greater mate value), a relatively proximal pathway (e.g., cognitive ability affecting fertility preferences), or reverse causation (e.g., fertility causally affecting cognitive development). Because the association occurs via a nonshared environmental pathway, the association is not due to genetic or between-family confounding. These effects did not replicate in TwinsUK. However, this failure to replicate is likely due to the very limited data availability of the cognitive phenotype. Across both MIDUS and TwinsUK, we found moderate nonshared environmental associations between each of the fertility phenotypes, such that earlier age at first marriage was associated with earlier age at first birth and higher completed fertility, and earlier age at first birth was associated with higher completed fertility. Although perhaps not surprising, these results indicate that the time course for fertility behavior remains fairly structured, even in low fertility societies where delayed childbearing may have become uncoupled with completed fertility due to family planning. 
In MIDUS, nonshared environmental effects on personality, cognitive ability, and educational attainment accounted for 2 % of the variance in completed fertility and 1 % via fertility timing, and fertility timing independently accounted for 20 %. The corresponding percentages in TwinsUK were 1, 0, and 14 %.

Reduced Structural Model

The above analysis provides a comprehensive, but overly complicated, account of variance among the study phenotypes. We were interested in whether phenotypes with significant associations with completed fertility were sufficient to account for genetic variance in completed fertility, or whether the remaining non-significant effects were necessary. For MIDUS, we reduced the full model (as reported in Table 2) to include agreeableness, conscientiousness, cognitive ability, age at first marriage, age at first birth, and completed fertility. We included similar variables in TwinsUK, except for cognitive ability. Because cognitive ability was poorly represented in TwinsUK and to maximize consistency across results, we replaced cognitive ability with educational attainment in the model. These results are presented in Fig. 2 with non-significant effects omitted to reduce clutter, but these pathways were estimated in the model. All associations that were statistically significant in the previous model remained significant in the reduced model. The most important result to note is that in both MIDUS and TwinsUK the reduced model still fully accounted for genetic variance in completed fertility. In both datasets, the point estimate was zero. This result implies that the reduced set of variables is sufficient to fully account for genetic influences on completed fertility in the current samples.
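The statement that the point estimate for residual genetic variance was zero can be illustrated with a small sketch. In a Cholesky decomposition, the genetic variance of the last phenotype entered equals the sum of the squared paths in the last row of the lower-triangular path matrix, and the final element of that row is the residual path unique to that phenotype. The path values and the function name below are hypothetical and chosen only for illustration; they are not the estimates from the fitted models.

```python
def variance_shares(row):
    """Split the genetic variance of the last phenotype in a Cholesky
    decomposition into the share carried by each genetic factor.

    `row` is the last row of the lower-triangular path matrix: one path per
    genetic factor, in the order the phenotypes were entered. The final
    element is the residual path unique to the last phenotype.
    """
    squared = [p ** 2 for p in row]
    total = sum(squared)
    if total == 0:
        return [0.0 for _ in row]
    return [s / total for s in squared]


# Hypothetical standardized genetic paths to completed fertility from
# factors for two personality traits, one ability measure, age at first
# marriage, and age at first birth, followed by the residual path
# (illustrative values only).
paths_to_cf = [0.10, 0.15, 0.05, 0.40, 0.30, 0.0]

shares = variance_shares(paths_to_cf)
residual_share = shares[-1]         # share unique to completed fertility
explained_share = sum(shares[:-1])  # share carried by upstream factors
```

When the residual path is zero, the residual share is zero and the upstream factors carry all of the genetic variance, which is the sense in which a reduced set of phenotypes can fully account for genetic influences on the final phenotype.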

Fig. 2

Reduced multivariate Cholesky model. Only statistically significant pathways represented to reduce clutter, but all parameters were estimated in the model. Note that point estimates for genetic influences on completed fertility were estimated at zero. Standardized path coefficients reported with standard errors in parentheses. a Results for MIDUS. b Results for TwinsUK. Agree. agreeableness, Consc. conscientiousness, Edu. attain. educational attainment, CF completed fertility, AFB age at first birth, AFM age at first marriage

Sensitivity Analysis for Phenotype Ordering

Because the order in which phenotypes are entered into the model can alter results, we estimated the reduced model for all permutations of phenotype ordering for the psychological, cognitive, and educational phenotypes. We did not re-order the fertility phenotypes because of the logical time ordering of the phenotypes and the relative lack of previous empirical examples of time-ordered effects from fertility to psychological development compared to the reverse pathway. We return to the limitation of cross-sectional data and reverse causation below.
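The set of orderings examined in this kind of sensitivity analysis can be sketched as follows. This is a minimal illustration, assuming the MIDUS reduced model with three upstream phenotypes and the fertility phenotypes held in a fixed position at the end; the variable names and the function are hypothetical, and the model-fitting step itself is not shown.

```python
from itertools import permutations


def candidate_orderings(upstream, fertility):
    """Return every ordering that permutes the upstream phenotypes while
    keeping the fertility phenotypes in their fixed, time-ordered position
    at the end of the model."""
    return [list(p) + list(fertility) for p in permutations(upstream)]


# Hypothetical variable names for the reduced MIDUS model.
upstream = ["agreeableness", "conscientiousness", "cognitive_ability"]
fertility = ["age_first_marriage", "age_first_birth", "completed_fertility"]

orderings = candidate_orderings(upstream, fertility)

# Each ordering would then be passed to the model-fitting routine (not shown).
for ordering in orderings:
    print(" -> ".join(ordering))
```

With three upstream phenotypes there are 3! = 6 orderings to estimate, so the check stays cheap in a reduced model.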

Generally, all associations with completed fertility remained statistically significant no matter the ordering of the phenotypes. Somewhat surprisingly, the genetic association between conscientiousness and completed fertility was not robust when conscientiousness was placed earlier in the model. This occurred in TwinsUK when conscientiousness was entered anywhere other than after agreeableness, with p values between .08 and .09. In MIDUS, this occurred once when conscientiousness was entered after cognitive ability but before agreeableness (p = .08). The effect size was relatively unaffected. This result may indicate that it is important to disentangle variance in agreeableness and conscientiousness, two personality dimensions that tend to be more highly correlated than other dimensions, when investigating fertility. Alternatively, this discrepancy may result from model imprecision, such that with larger sample sizes this distinction may be less important.

Robustness Test for Age Censoring

Although nearly all participants in both studies had completed their primary fertility years, a non-trivial number had not. As such, our analyses may be biased by not fully tracking the childbearing behavior of some participants. To explicitly test for this effect, we re-ran the analyses reported in Table 2 with all observations with ages less than 45 years omitted, a common cutoff for the end of the reproductive span. This analysis did not alter any of our substantive findings concerning associations as interpreted by p values. In MIDUS, the average absolute parameter bias (i.e., difference in parameter estimate from the full dataset compared to the age-restricted dataset) was only .03. In TwinsUK, the corresponding statistic was .07. Bias tended to be more severe for shared environmental effects, potentially due to these associations being somewhat imprecisely estimated. Similarly, relatively large estimates of bias were found for associations with cognitive ability in TwinsUK, again most likely due to these parameters being imprecisely estimated due to data availability. Together, these checks indicate that our results are robust to effects of censoring, and the potential bias that is introduced is fairly small in magnitude.
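The average absolute parameter bias described here can be sketched as the mean absolute difference between matched parameter estimates from the full and age-restricted samples. The parameter labels and values below are hypothetical and serve only to show the computation; they are not the estimates from Table 2.

```python
def mean_absolute_bias(full, restricted):
    """Average absolute difference between matched parameter estimates
    from the full sample and the age-restricted sample."""
    if full.keys() != restricted.keys():
        raise ValueError("both models must report the same parameters")
    diffs = [abs(full[name] - restricted[name]) for name in full]
    return sum(diffs) / len(diffs)


# Hypothetical standardized path estimates (illustrative values only).
full_sample = {"a_afm_cf": -0.40, "a_afb_cf": -0.30, "e_afm_afb": 0.25}
age_45_plus = {"a_afm_cf": -0.44, "a_afb_cf": -0.28, "e_afm_afb": 0.22}

bias = mean_absolute_bias(full_sample, age_45_plus)
```

Because the statistic averages absolute differences, positive and negative shifts do not cancel, so a small value indicates that individual estimates moved little when censored observations were dropped.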

Discussion

Individual differences in the level and timing of fertility are associated with genotypic variation between individuals. Although evolutionary selection pressures should act to limit additive genetic variation in fertility relevant phenotypes, modern reproductive behavior in low fertility societies is subject to substantial sociocultural influences that may interact with genetic predispositions. For example, some individuals may readily accept changing social norms and values for family formation (e.g., Lesthaeghe 2010), whereas others may respond more slowly. As social control over fertility practices diminishes in such sociocultural environments, individuals will be able to express their genetically influenced preferences, desires, goals, or other psychological phenotypes that potentially influence the level or timing of fertility to a greater extent (Udry 1996). For example, the heritability of fertility increases during periods of social change (e.g., the second demographic transition; Briley et al. 2015; Kohler et al. 1999; Tropf et al. 2015a, b), and during such changes, those with the most social capital display the largest increases in heritability (Bras et al. 2013). Due to such dynamic interaction between genetic predispositions and the rapidly changing sociocultural context of fertility, it is possible that levels of fertility will remain linked to genotypic differences between individuals.

Genetic influences on fertility are shared with genetic influences on other demographic and psychological characteristics. We found that genetic influences on completed fertility are strongly associated with genetic influences on age at first birth and age at first marriage. This implies that fertility timing is an individual difference marker for understanding genetic effects on fertility levels. Psychological phenotypes, such as agreeableness, conscientiousness, and cognitive ability, shared some genetic variation with fertility timing and completed fertility in our analyses. Across both datasets, little if any residual genetic variance on either age at first birth or completed fertility was found after partitioning shared variance with other phenotypes, including age at first marriage. Yet, a substantial amount of genetic variance in age at first marriage was associated with genetic effects not shared with the demographic or psychological phenotypes. This implies that there are other important phenotypes (or endophenotypes) that may be associated with fertility timing that we did not investigate.

In the current study, each of the fertility phenotypes was primarily associated with nonshared environmental variation, which includes unique life experiences, idiosyncratic or time-limited effects, and measurement error, which in the case of fertility phenotypes may include uncertain paternity or unreliable recall of dates. This means that efforts to understand the correlates of fertility timing will need to identify the systematic unique environmental effects that influence fertility trajectories. As an example from our analyses, those that marry earlier tend to have larger families, due to common genetic and nonshared environmental effects on both phenotypes. The nonshared environmental link represents a within-family association between early marriage timing and larger family size. Our model implies that part of the genetic association may occur via psychological phenotypes (e.g., personality or ability), but potential nonshared environmental pathways were not well-documented in this study. Beyond the strong nonshared environmental links among the fertility phenotypes, only cognitive ability was significantly associated with fertility phenotypes through a nonshared environmental pathway. The residual unique environmental effects may occur earlier in development, such as early dating relationships.

A number of explanations are present in the literature to explain the heritability of fertility. The current results support some explanations more than others. Fertility timing, both in terms of age at first birth and marriage, can largely explain genetic influences on the level of fertility in the current samples. This makes intuitive sense as these variables are very proximate to fertility, and fertility timing is likely influenced by similar motivational attributes as the level of fertility. In our model, agreeableness, conscientiousness, and cognitive ability emerged as the primary psychological phenotypes associated with fertility and fertility timing. Generally, these associations were modest in magnitude. Educational attainment was not strongly associated with completed fertility or age at first marriage in either dataset, and education-age at first birth associations were primarily shared environmental. Overall, these results imply that genetic influences on fertility may emerge through several psychological and demographic pathways that are complementary rather than competitive. In fact, our multivariate model identified several non-overlapping associations, indicating that it is necessary to consider multiple phenotypes and pathways. Of course, future work will be necessary to identify whether the identified associations are time-ordered or attributable to some omitted variable that could provide a more mechanistic account of fertility differentials.

The current study has several strengths and limitations. Two large, genetically informative, adult samples with in-depth psychological assessments from different sociocultural contexts were used to explore the genetic and environmental influences on the level and timing of fertility. Many of the primary effects replicated across datasets and with similar effect size estimates, adding further to the body of replicable results found in behavior genetics (Plomin et al. 2016). For example, we found similar levels of moderate heritability for the fertility phenotypes, strong genetic links between completed fertility and fertility timing, and modest links between fertility and agreeableness across both samples. Further, effects of similar magnitude across similar phenotypes have been reported in other large-scale twin studies (e.g., Miller et al. 2010; Rodgers et al. 2001). However, there are important limitations to consider. Some of the youngest members of the sample may not have fully completed their fertility at the time of the survey. Given the age of the youngest participants and their proportion of the total sample, this is likely a minor concern, and we empirically demonstrated that our results are not sensitive to the inclusion of censored observations. Yet, non-traditional practices, such as cohabitation, might play a role in fertility for these younger participants which we did not assess. Additionally, the assumptions of the twin model, such as the lack of assortative mating, the use of an additive model, and the equal environments assumption, are other potential concerns when estimating quantitative genetic models. A wealth of evidence supports the validity of these assumptions (e.g., Conley et al. 2013), and recent molecular genetic work using measured genetic information in unrelated individuals has found similar estimates of heritability for fertility phenotypes (Tropf et al. 2015a, b, 2016). 
Our models were limited by assuming a purely additive model for personality phenotypes, as the twin correlations implied that there may be dominant genetic effects. We fit models that could estimate these dominant genetic effects, but this approach substantially inflated standard errors for both the additive and the dominant genetic pathways. Future work on personality-fertility associations with larger sample sizes may be able to disentangle these pathways more accurately.

To ensure that the current results were not driven by gender differences or cohort trends, age and gender were controlled, as is common in quantitative genetic analyses (McGue and Bouchard 1984). However, the genetic and environmental associations likely differ across birth cohort or gender (e.g., Kohler et al. 1999). Although the current samples are large compared to many twin studies, they are not sufficiently powered to detect the effects reported here when the sample is broken down by gender or specific birth cohorts (particularly because TwinsUK includes only female participants). It may be the case that many of the effects are strengthening over time (Jokela 2012; Skirbekk and Blekesaune 2014), and analyses of recently born individuals, who only experience loosely structured fertility norms, would show stronger associations with psychological phenotypes. Alternatively, other effects, such as the strong link between genetic influences on age at first birth and marriage, may be diminishing in magnitude as the institution of marriage is increasingly decoupled from fertility (e.g., Smock and Greenland 2010). The substantial age heterogeneity of both samples may also obscure effect sizes because of potential differential cohort and period effects on fertility. Future studies in narrow age cohorts would help clarify the magnitude of this limitation. Further, sex-limitation models may aid in explaining the persistence of genetic influences on fertility phenotypes by demonstrating antagonistic pleiotropy (Neale and Cardon 1992). In light of these limitations and differences across samples, it is noteworthy that results were similar across datasets indicating that the inclusion of males, with the potential for uncertain paternity, does not substantially alter the reporting of fertility, at least for purposes of the current study.

Interpretation of the presented models assumed that demographic and psychological variables took chronological precedence over fertility variables. However, bidirectional effects between fertility and psychological development have been documented (Jokela et al. 2009; Kohler et al. 2005; Nelson et al. 2014), although these effects tend to be less replicable (van Scheppingen et al. 2016) and smaller in magnitude compared to the reverse pathway (Hutteman et al. 2013; Jokela and Keltikangas-Järvinen 2009; Jokela et al. 2010). Yet, the genetic and environmental cross-paths may be reasonably interpreted as genetic influences on fertility that have an effect on the demographic and psychological phenotypes. For educational attainment, personality, and cognitive ability, the development of these phenotypes is largely established before individuals enter the major childbearing years (Barro and Lee 2013; Briley and Tucker-Drob 2014; Roberts et al. 2006; Tucker-Drob 2009; Tucker-Drob and Briley 2014). This renders the current interpretation the most plausible.

A further interpretational challenge with observational data relates to ruling out omitted variables that may spuriously induce associations. It may be the case that some other omitted phenotype or environmental factor not measured in the current study may be the true source of the association. However, large-scale international studies find no relation between personality development and fertility practices (Bleidorn et al. 2013). If omitted variables were prevalent and leading to spurious associations, then this effect should manifest in such cross-cultural studies. Similarly, the genetic associations between personality and fertility may be due to pleiotropic genetic effects on personality and fertility that act independently, such that experimentally manipulating personality would not alter fertility (e.g., agreeableness ← genes → fertility). On the other hand, such pleiotropic genetic effects may emerge through causal chains where genetic influences on personality later influence fertility through a variety of mechanisms (e.g., mate value, career opportunities, or more generally genes → agreeableness → fertility). In the second case, experimentally manipulating personality would be expected to alter fertility. Additional large-scale, genetically-informed longitudinal research would be required to parse apart these alternative explanations.

In conclusion, the current project demonstrates the importance of integrating genetically informative research into socio-demographic frameworks. In two large, genetically informative samples of adults, variation in completed fertility was linked with genotypic variation across individuals. This effect was largely explained by genetic influences on fertility timing. The timing of first birth and marriage represents an early indicator of an individual’s ultimate fertility trajectory. Genetically influenced psychological phenotypes, such as personality and cognitive ability, are associated with some portion of the genetic influences on fertility, but much unexplained variance remains concerning the nonshared environmental effects (e.g., unique life experiences) that influence the level and timing of fertility.