1 Introduction

The potential health benefits of schooling have recently received increasing attention in the economics literature. This partly reflects a generally growing interest in the effect of schooling on nonmarket outcomes, such as fertility choices, criminal activity, charitable giving, trust, and voting behavior (see Lochner 2011, for a recent overview). If schooling has such nonmarket returns, estimates that only focus on the wage returns to schooling may seriously underestimate both the private and social returns to schooling.

The attention given to the topic also reflects an interest in possible ways to improve population health and to reduce socioeconomic inequalities in health. Schooling is strongly associated with a range of different health measures, and the relationship has been observed in a large number of countries and time periods, even after accounting for factors such as income, wealth, and health knowledge. If this relationship reflects a causal effect of schooling on health, increased expenditures on education may be a cost-effective way to improve population health, compared to other means, such as increased health care expenditures. Whether the association reflects a causal effect remains much disputed, however, and more evidence is needed before credible policy advice can be given.

In this paper, I estimate the effect of schooling on health using a twin design. By relating within-twin-pair differences in schooling to within-twin-pair differences in health and health behaviors, I am able to difference out the influence of genetic traits and family endowments that may otherwise bias the schooling coefficient. In the previous economics literature, the twin design has mainly been used to estimate the effect of low birth weight on adult education and income, the intergenerational transmission of education, and the wage returns to schooling (e.g. Ashenfelter and Krueger 1994; Bonjour et al. 2003; Black et al. 2007; Royer 2009; Holmlund et al. 2011; Pronzato 2012).

My paper is directly related to the different explanations for the positive association between schooling and health that have been proposed in the economics literature. Arguing in favor of a causal effect, Grossman (1972) proposed that educated people are more efficient in producing their own health, so that educated people are able to squeeze out a greater health output from a given health input. Schooling may also increase the allocative efficiency in health production, in which case educated people are able to pick a better mix of inputs (Rosenzweig and Schultz 1982; Kenkel 1991). Alternatively, schooling and health may be related through factors such as family background and genetic traits. Fuchs (1982) proposed time preferences as one important factor, where less future-oriented people will invest less in both education and health, since the benefits of the investments are of a long-run character. Since factors such as time preferences and genetic traits are often unobserved, this creates a standard omitted variable problem. Finally, there may be reverse causality so that early life health affects educational attainment. Some recent evidence based on samples of twins suggests, for instance, that low birth weight, being a marker of early life health, has a negative effect on schooling (see, e.g., Behrman and Rosenzweig 2004; Black et al. 2007).

The latter two hypotheses would suggest that schooling is endogenous in a regression of health on schooling. To deal with this, a number of recent studies have relied on various “natural experiments” to estimate the causal effect of schooling on health (see, e.g., Currie and Moretti 2003; Lleras-Muney 2005; de Walque 2007; Powdthavee 2010; Jürges et al. 2011).Footnote 1 The majority of these studies use educational reforms in order to identify the effect.

These studies rely on natural experiments that affect people whose return to schooling is likely to be different from the average returns in the population. Changes in mandatory schooling laws, for instance, were typically intended to increase the schooling of those at the lower end of the education distribution, while having little or no effect on those planning to proceed to further studies anyway. The resulting estimates therefore reflect local average treatment effects (LATE).

A twin design offers an alternative research design, where identification relies on differences in schooling within identical twin pairs. This usually also means that a twin design identifies the effect of education on health across the entire education distribution, whereas most instrumental variable (IV) studies provide estimates for those at the lower end of the education distribution. Twin-based estimates may therefore come closer to estimating an average treatment effect (ATE) than those based on reforms. The resulting estimates could thus be expected to differ, and a twin-difference approach should constitute a useful complement to the literature.

To date, however, very few studies have exploited the twin design in order to estimate the health returns to schooling. One exception was Lundborg (2008), on which this paper is partly based, who conducted a detailed investigation of the relation between education and health using a twin design applied to US data. Fujiwara and Kawachi (2009), also using US twin data, focused on years of schooling and found no evidence of a causal effect of schooling on a range of health outcomes and health behaviors. A working paper by Behrman et al. (2006) found no causal effect of education on health, using data on Chinese twins. Behrman et al. (2011) examined the effect of schooling on hospitalizations and mortality among Danish twins. Their findings suggest that the negative association between education, on the one hand, and hospitalizations and mortality, on the other hand, disappears when exploiting within-twin-pair variation in schooling and health outcomes. A similar finding was reported by Amin et al. (2010), using data on UK twins and focusing on various health behaviors and health outcomes. Webbink et al. (2010) examined the effect of schooling on overweight using a twin design and found an effect only for men.

This paper contributes to the small literature on the health returns to schooling that uses a twin design. For this purpose, I exploit unique and detailed data on monozygotic twins from the Midlife in the United States (MIDUS) survey. These data have not been used, to the best of my knowledge, in any previous economic studies and contain rich information on health outcomes and education. Moreover, the data allow me to address some common criticism of the twin design. A common concern has been that differences in education within twin pairs are not exogenously given, i.e., that while twin differencing will remove the influence of unobserved factors common to a twin pair, there may still remain within-twin-pair differences in unobserved factors that affect schooling (Bound and Solon 1999).Footnote 2

In order to address such criticism, I contribute by exploiting the unusually rich MIDUS data and examine to what extent my results are robust to controlling for differences in early life factors, such as parental treatment, within twin pairs. Moreover, differences in time preferences between twins may explain why even identical twins end up with different schooling levels, in line with the Fuchs (1982) hypothesis. I will therefore consider this possibility by exploiting detailed questions about attitudes towards the future. In addition, I will address the reverse causality argument, i.e., that early life health affects educational attainment, by accounting for early life health differences within twin pairs.

Finally, whereas most previous twin studies focus on years of schooling, I allow for a more flexible functional form regarding the effect of schooling on health. I show that this has important implications for the results.

In contrast to most previous twin-based studies on the relationship between education and health, I find some evidence of a causal effect of schooling on health. Relative to high school dropouts, people with greater schooling are significantly healthier, as measured through self-reported health and chronic conditions. Moreover, some of these twin-based point estimates are even greater in magnitude compared to the corresponding ordinary least squares (OLS) estimates. Beyond completing high school, however, additional schooling does not generate any additional health gains. For physical activity, my twin-based estimates are again greater in magnitude, whereas for smoking and overweight, the twin-based estimates are substantially smaller in magnitude and are insignificant. My results are reasonably robust to accounting for differences in early life factors, such as parental treatment and early life health. Moreover, accounting for differences in attitudes towards the future does not alter my results.

Why, then, are some of my twin-based estimates larger in magnitude than the corresponding OLS estimates? I argue that one reason might be that the twin-based estimates are only identified for those pairs where a difference in schooling is present and that these twin pairs may face larger returns to schooling than the average person in the population. I show some evidence consistent with this idea where twin pairs showing a difference in schooling are more likely to come from low-educated backgrounds. In addition, I show that there is a tendency for those twins to have larger returns to schooling, which could thus explain the larger twin estimates.

The paper proceeds as follows: next, I discuss the empirical model. After that, I discuss the data used in the analyses and compare it to data from the Current Population Survey (CPS) in order to assess its generalizability. I then report the results, where the results from the pooled twin sample are contrasted with those obtained when applying a twin-difference strategy. Finally, the results are discussed and some conclusions are drawn.

2 Empirical strategy

In order to see how a twin-difference strategy may help to estimate the causal effect of schooling on health, let \( H_{1j} \) and \( H_{2j} \) denote the health of the first and second twin in the jth twin pair:

$$ H_{1j}=S_{1j}\beta +\mu _{j}+X_{1j}^{\prime }\gamma +\varepsilon _{1j}, $$
(1)
$$ H_{2j}=S_{2j}\beta +\mu _{j}+X_{2j}^{\prime }\gamma +\varepsilon _{2j}, $$
(2)

where \( S_{ij} \) denotes the schooling of twin i (i = 1, 2) in pair j, \( X_{ij}^{\prime } \) denotes a vector of other observable factors that may vary within a twin pair, \( \mu _{j} \) denotes unobserved genetic traits and family endowments at the family level, and \( \varepsilon _{ij} \) is an unobserved random component. In this context, omitted variable bias may arise since \( \mu _{j} \) possibly affects both health and schooling. The \( X_{ij}^{\prime } \) vector may include factors such as parental treatment, past health, time preferences, and birth weight, to the extent that such factors are observable.

Next, I take the difference between Eqs. 1 and 2, giving

$$ \Delta H_{j}=\Delta S_{j}\beta _{WTP}+\Delta X_{j}^{\prime }\gamma _{WTP}+\Delta \varepsilon _{j}, $$
(3)

where \( \beta _{WTP} \) is the within-twin-pair estimate of the association between schooling and health. In this specification, all factors that are common to both twins in a given twin pair will be differenced out. Since identical twins share common genes, the influence of genes will vanish, as will the influence of common family background. This means that an OLS estimate of Eq. 3 will no longer be biased due to unobserved twin-pair specific variables. Any unobservables that remain in the error term after differencing may still, however, bias the results if they are related to both schooling and health.
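The differencing argument can be illustrated with a small simulation. The following is a minimal sketch, not the estimation code used in this paper: it generates hypothetical twin pairs in which a shared endowment raises both schooling and health, and compares a pooled OLS slope with the within-twin-pair slope from Eq. 3. The function names, the true effect of 0.5, and all variances are illustrative assumptions, and covariates are omitted for brevity.

```python
import random


def slope(x, y, intercept=True):
    """Least-squares slope of y on x, with or without an intercept."""
    n = len(x)
    mx = sum(x) / n if intercept else 0.0
    my = sum(y) / n if intercept else 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_twins(n_pairs=20000, beta=0.5, seed=0):
    """Return (pooled OLS slope, within-twin-pair slope) on simulated data."""
    rng = random.Random(seed)
    s_all, h_all, d_s, d_h = [], [], [], []
    for _ in range(n_pairs):
        mu = rng.gauss(0, 1)  # endowment shared by both twins (hypothetical scale)
        s1 = 12 + 2 * mu + rng.gauss(0, 1)     # schooling depends on mu
        s2 = 12 + 2 * mu + rng.gauss(0, 1)
        h1 = beta * s1 + mu + rng.gauss(0, 1)  # health depends on schooling and mu
        h2 = beta * s2 + mu + rng.gauss(0, 1)
        s_all += [s1, s2]
        h_all += [h1, h2]
        d_s.append(s1 - s2)  # mu cancels in the within-pair differences
        d_h.append(h1 - h2)
    pooled = slope(s_all, h_all)
    within = slope(d_s, d_h, intercept=False)  # Eq. 3 has no constant term
    return pooled, within


if __name__ == "__main__":
    pooled, within = simulate_twins()
    print(f"pooled OLS: {pooled:.3f}, within-pair: {within:.3f}")
```

Under these assumed parameters, the pooled slope is biased upward by the shared endowment, while the within-pair slope should lie close to the true value of 0.5. If an unshared factor affected both schooling and health, the within-pair slope would be biased as well.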

The crucial assumption made in the literature is, thus, that schooling differences within twin pairs are exogenous, conditional on the fixed effect and the included covariates. A justified question is then what causes such schooling differences between otherwise very similar individuals. One example of exogenously given differences would be if the twins in a given twin pair happened to end up with teachers of different quality.Footnote 3 This may then cause one of them to obtain more schooling, and it is not obvious that the quality of the teacher would influence health other than through obtained schooling. Another interpretation is given by Ashenfelter and Rouse (1998), who argue that random deviations from the optimum schooling level, stemming from optimization errors, may cause differences among otherwise similar individuals. Differences in education could also result from early differences in interests and activities that, in turn, may result from coincidences and more or less random events. While there is, of course, no way of proving that differences in schooling within twin pairs are exogenous, one could shed some light on the issue if data on early life differences between twins are available. Since the MIDUS contains rich information on such early-life factors, I will address these issues in the sensitivity analyses.

In all my regressions, I control for self-assessed health at age 16. This will then, to some extent, deal with the issue of reverse causality, running from early life health to education. In addition, it provides a control for a characteristic that may vary within twin pairs and that may predict schooling differences. I will not, however, control for marital status, occupation, or current income in the regressions since some of the effect of education on health may run exactly through these channels.

I will also provide a set of OLS regressions in order to obtain some “baseline” results. Besides self-assessed health at age 16 and schooling, these regressions will, in addition, control for age, gender, and race. Note that in the twin-fixed effect regressions, these variables are not included since they do not vary within twin pairs. In the regressions on smoking, I will also include a measure of smoking at age 16. This will account for any reverse causality running from teenage smoking to later schooling attainment. Since this variable can vary also within twin pairs, it is included in both the OLS and fixed effects regressions.

It is well known that the importance of measurement errors in years of schooling is exaggerated by differencing and even more so when differencing between identical twins (Griliches 1979). This will cause twin FE estimates to be downward-biased, under the assumption of classical measurement errors. Since my measure of years of schooling is imputed from categories, there is likely to be measurement error in the variable. I will consider this possibility in a section on robustness checks towards the end of the paper.
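The attenuation argument can be made explicit with a short derivation. This is a sketch under assumptions that are not spelled out in the text: observed schooling is \( \tilde{S}_{ij}=S_{ij}+v_{ij} \), where the error \( v_{ij} \) has variance \( \sigma _{v}^{2} \) and is uncorrelated with true schooling, with \( \varepsilon _{ij} \), and across twins; \( \sigma _{\tilde{S}}^{2} \) is the variance of observed schooling; \( \rho _{\tilde{S}} \) is the within-pair correlation of observed schooling; and covariates and other sources of bias are ignored. Since \( \mathrm{Var}(\Delta \tilde{S}_{j})=2\sigma _{\tilde{S}}^{2}(1-\rho _{\tilde{S}}) \) and \( \mathrm{Var}(\Delta v_{j})=2\sigma _{v}^{2} \), it follows that

$$ \operatorname{plim}\hat{\beta }_{OLS}=\beta \left( 1-\frac{\sigma _{v}^{2}}{\sigma _{\tilde{S}}^{2}}\right) ,\qquad \operatorname{plim}\hat{\beta }_{WTP}=\beta \left( 1-\frac{\sigma _{v}^{2}}{\sigma _{\tilde{S}}^{2}(1-\rho _{\tilde{S}})}\right) . $$

As a purely hypothetical illustration, with \( \sigma _{v}^{2}/\sigma _{\tilde{S}}^{2}=0.1 \) and \( \rho _{\tilde{S}}=0.75 \), the cross-sectional estimate would retain 90 % of the true effect, whereas the within-twin-pair estimate would retain only 60 %.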

3 Data

My estimates are based on data from the first wave of the MIDUS survey. The first wave collected data in 1995 on a total of 7,108 individuals. To be eligible for the survey, participants had to be noninstitutionalized, English-speaking, living in the USA, and aged 25–74. The response rate for the telephone interviews in the first wave of MIDUS was 70 %. Among these, 86.3 % also completed a self-administered questionnaire, giving an overall response rate of 60.8 %.

Out of the 7,108 individuals interviewed, 1,914 were twins participating in the MIDUS twin screening project. In the project, a representative national sample of approximately 50,000 households was screened in order to identify families with twins. It should be noted that MIDUS was the first national sample of twins that was ascertained randomly via telephone. Using nationally representative data is an improvement compared to prior economic studies using twin data, such as those of Ashenfelter and Krueger (1994) and Ashenfelter and Rouse (1998), which used highly selective data collected during the Twinsburg twins festival.

By using information collected as part of the initial twin screening questionnaire, twin pairs were diagnosed as identical or fraternal twins. Based on their answers to the questions, the twins were assigned points, which were subsequently totaled. “High” scores indicated identical twin pairs and “low” scores indicated fraternal twin pairs. In a small number of cases, the pair’s score fell in the middle of the range, and no diagnosis was given (Brim et al. 2003). This method of diagnosing twin zygosity has proven reliable and has been shown to be over 90 % accurate (e.g., Nichols and Bilbro 1966).

In the twin sample, 32 twins (16 twin pairs) were dropped due to uncertainty regarding zygosity. Of the remaining twins, 734, or 37 %, were identical twins, who were then selected for the analysis. I dropped three twins who had not yet finished their education. In addition, I dropped 19 twins whose id numbers were lacking and 18 twins whose information on the co-twin was lacking. This resulted in a final sample size of 694 identical twins (individuals).
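The sample restrictions above can be tallied as a simple check. The sketch below only reproduces the counts reported in the text; the variable names are illustrative.

```python
# Counts as reported in the text; the labels are illustrative.
identical_twins = 734
dropped = {
    "education not yet finished": 3,
    "missing id number": 19,
    "missing co-twin information": 18,
}

# Subtract all exclusions from the initial number of identical twins.
final_sample = identical_twins - sum(dropped.values())
print(final_sample)  # 694
```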

3.1 Explanatory variables

Educational attainment was measured in 12 categories in MIDUS, ranging from no school/some grade school to PhD. For my analyses, I categorized this variable into four categories: a college degree; college but less than a BA degree; a high school diploma; and less than a high school diploma.Footnote 4

While years of schooling has been the standard measure in many prior studies on the wage returns to schooling, the educational degree may be as relevant, or even more relevant. In de Walque (2007), for instance, there is a sharp increase in the effect of number of years of schooling on smoking once reaching college. Similar evidence for nonlinear effects has been obtained in the literature on the wage returns to education (Hungerford and Solon 1987; Belman and Heywood 1991; Isacsson 2004). Based on such findings, some economists argue that credentials matter more than years of schooling (for a discussion on this, see Card 1999).

In line with previous studies using a twin-difference design, the twins in MIDUS often end up with similar educational attainment. Using the educational categories described above, 67 % of the identical twins in MIDUS report the same level of education.Footnote 5 For imputed years of schooling, 42 % are assigned the same number of years. In the regressions using educational categories, I use the category less than a high school diploma as the omitted reference category. This category, in principle, indicates being a high school dropout.

3.2 Health outcomes

As measures of health, I use self-assessed health and the number of chronic conditions. The former was assessed through the following question: “Using a scale from 0 to 10 where 0 means ‘the worst possible health’ and 10 means ‘the best possible health,’ how would you rate your health these days?” Self-assessed health has been found to be a strong predictor of subsequent mortality (see for instance Idler and Benyamini 1997). There are often some concerns expressed, however, about the interpretation of questions about self-reported health. Older individuals often report self-reported health similar to that of younger persons, despite having “objectively” worse health (Groot 2000). Note, however, that this is not an issue in the twin design since two twins are of the same age. The second health measure is the self-reported number of chronic conditions.Footnote 6

In addition to these health measures, I will also make use of measures of lifestyle. These are smoking, body mass index, and physical exercise. The measure of physical exercise indicates the number of occasions during the past month that the individual engaged in vigorous physical activity. In the MIDUS questionnaire, “vigorous” is exemplified by running or lifting heavy objects.

For both the health and the lifestyle measures, there is substantial variation within twin pairs. In 75 % of the pairs, the twins reported different levels of self-reported health. For chronic conditions, the corresponding figure was 71 %. The fraction of twin pairs reporting a difference in smoking, exercise, and BMI was 73, 16, and 99 %, respectively.

3.3 Representativeness of the sample

The external validity of twin-based estimates is sometimes questioned since twins may differ in various ways from the general population. It should be noted that this need not threaten the external validity of my estimates if the association between schooling and health is the same for twins and singletons. I will consider this in Section 4. In this section, I present evidence from Brim et al. (2003), where they consider to what extent the sample of identical twins in MIDUS resembles the MIDUS sample of singletons and the US population in general. For the latter purpose, Brim et al. (2003) use data from the CPS of 1995. In Table 1, descriptive statistics for the three samples are shown.

Table 1 Descriptive statistics

A comparison with the CPS data reveals that both the twin sample and the MIDUS main sample are better educated than the US population in general.Footnote 7 Similar patterns were found in several previous twin studies, possibly reflecting a selection of better educated twins into the surveys or that twins, for some reason, obtain more schooling than singletons (Ashenfelter and Krueger 1994; Ashenfelter and Rouse 1998; Bonjour et al. 2003). While similar in terms of gender distribution, the twin sample also contains more whites and has a slightly more compressed age distribution than the CPS sample. Regarding marital status, the CPS from 1995 does not contain a straightforward estimate of the number of cohabitating or married couples. Considering marriage alone, however, the fraction of married in CPS in 1995 was 67.5 %, compared to 71.6 % in the twin sample and 62.6 % in the MIDUS main sample.

4 Results

4.1 Self-reported health

The first three columns of Table 2 show the OLS and twin-differences results for self-reported health. In order to assess the external validity of the estimates based on the twin sample, I will start by comparing the association between schooling and health in the nontwin sample and the twin sample. Starting with the MIDUS nontwin sample, the results in the first column show a strong and positive association between schooling and self-reported health. Next, I run the same analysis on the twin sample but without including twin-fixed effects. I refer to this sample as the “pooled” twin sample, and the results are shown in the second column in Table 2.Footnote 8 The results are largely similar, although the magnitude of the associations is now somewhat increased. This suggests that the differences between the MIDUS main sample and the MIDUS twin sample in the distribution of characteristics do not lead to any radical differences in the estimated health returns to schooling.

Table 2 Regressions on self-reported health and the number of chronic conditions

In the third column, the results from the twin-differences approach are then shown. Relative to high school dropouts, people with greater schooling report significantly better health. The effect is to increase self-reported health by about one unit, measured on the 0–10 scale. Interestingly, the magnitude of the associations between the educational categories and self-reported health is about double the magnitude of the associations in the nontwin sample. These results are surprising, since one would expect a weaker relationship, once the influence of genes and family background common to the twins is controlled for. Also remember that these regressions control for self-reported health at age 16 in order to reduce the risk of the results reflecting reverse causality from health to education.Footnote 9 , Footnote 10 It should also be noted that there are no significant differences between the point estimates of the three dummy variables indicating different educational degrees. Thus, the results suggest that the main effect of education on health comes from completing high school, whereas additional schooling does not lead to any additional health gains.

For years of schooling, the OLS estimate based on the nontwin sample suggests a significant and positive association between schooling and self-reported health that is rather similar to the estimated association in the pooled twin sample. The point estimate based on twin differences is also similar, 0.067, but not significant. Remember though that this estimate is most likely downward-biased since my measure of years of schooling is imputed and thus likely contains some measurement error.

The only previous twin studies that estimated the effect of schooling on self-reported health are Fujiwara and Kawachi (2009) and Behrman et al. (2006). In the former case, the authors only used specifications including years of schooling, and the results are similar to my results. There is an interesting difference between my results and the results obtained by Behrman et al. (2006), however. They find a significant and positive effect of years of schooling on self-reported health but an insignificant effect when measuring schooling through categories. It is not clear what this difference represents, but note that the study by Behrman et al. (2006) takes place in a developing country context and that the measure of self-reported health differs.Footnote 11 Also, degrees may have a different meaning in China compared to the USA, and they possibly matter less for health in China than in the USA.

In the analysis of self-reported health, I treated the 0–10 scale of self-reported health as a cardinal scale. One may object that the scale is rather an ordinal scale, and treating each additional step on the scale to be of equal importance is restrictive. For this reason, I also created a set of binary variables, indicating whether the individual’s response to the scale crossed various threshold values. I then ran separate regressions, where a particular regression would, for instance, analyze the change in probability of stating a score of 4 or more on the 0–10 scale that is associated with an increase in an educational category or an increase in years of schooling. Since there were no twin pairs where one twin scored below 2 on the scale and the other above 2, fixed effects regressions for these threshold values could not be performed. The results using other threshold values are shown in Table 8 in the Appendix and essentially support the results obtained above. The fixed effects results again show that, relative to high school dropouts, respondents with greater schooling are significantly more likely to score above most of the various threshold values analyzed. Moreover, the fixed effects estimates are again substantially larger than the corresponding estimates based on the pooled sample, which are shown in the lower panel in Table 8.
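The threshold construction described above, and the reason why some thresholds cannot be used in the fixed effects models, can be sketched as follows. This is an illustrative example only: the function names and the three example pairs are hypothetical and do not come from the MIDUS data. A threshold contributes to a fixed effects regression only if at least one pair is discordant, i.e., exactly one twin scores at or above it.

```python
def at_or_above(score, threshold):
    """Binary indicator: 1 if the 0-10 health score reaches the threshold."""
    return int(score >= threshold)


def discordant_pairs(pairs, threshold):
    """Count pairs in which exactly one twin is at or above the threshold."""
    return sum(
        at_or_above(a, threshold) != at_or_above(b, threshold) for a, b in pairs
    )


def estimable_thresholds(pairs, thresholds=range(1, 11)):
    """Thresholds with within-pair variation in the binary outcome."""
    return [t for t in thresholds if discordant_pairs(pairs, t) > 0]


# Hypothetical (twin 1, twin 2) health scores, for illustration only.
example_pairs = [(8, 5), (9, 9), (7, 6)]
print(estimable_thresholds(example_pairs))  # [6, 7, 8]
```

In this toy example, no pair straddles the thresholds 1 to 5, 9, or 10, so the differenced binary outcome is zero for every pair at those thresholds and a within-pair regression has nothing to identify.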

What could then explain the result that the twin-fixed effects estimates for self-reported health are substantially greater than the corresponding pooled OLS results when using educational categories? If well-educated people have favorable unobserved characteristics, it would seem likely that the same unobserved characteristics would be positively related to health. In such a case, we would expect the fixed effects estimates to come out as smaller since unobserved characteristics shared by twins are differenced out. In order to shed some light on possible reasons for the greater fixed effects estimates, I therefore, instead, turn to the possibility that my twin-based estimates have a LATE “flavor”. This would be the case, for instance, if differences in education within twin pairs, which I rely on to identify the effect of education, are more common among certain types of twin pairs. If the returns to education are also greater for these types of twin pairs, this could also explain why the fixed effects results come out as greater. A similar argument is often made in the IV literature, where the instrument for schooling is often mainly affecting people at the lower end of the education distribution.

Based on the arguments in the IV literature, I therefore check if differences in schooling within twin pairs are more common among twins coming from low-educated families. To do so, I analyzed if the probability of observing a difference in educational categories within a twin pair is systematically related to parental schooling (the results are shown in Table 9 in the Appendix). Measuring parental schooling by the same educational categories as for the main respondent and having no high school as the omitted reference category, the association between the mother’s schooling and the probability of observing a difference in educational categories within a twin pair was negative and significant for the highest educational category. The point estimate suggests that the probability of observing a difference in education within a twin pair was 15.6 percentage points lower when the mother had a university degree. For the father’s schooling, the results were similar, with twin pairs having a father with a college degree having a 13.4 percentage points lower likelihood of being different in terms of schooling.Footnote 12 This provides some evidence that differences in schooling are less common for twins coming from more highly educated families. One explanation for this may be that credit constraints are less binding in high-educated families, meaning that the family can afford to send both twins to higher education. If the returns to schooling are greater for those coming from less educated backgrounds, which is usually argued in the IV literature, these results provide an explanation for the greater estimates for self-reported health obtained in my fixed effects models. Basically, the estimated effect in these models is to a larger extent identified on those twin pairs coming from families with a lower educational background.Footnote 13
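The mechanism can be illustrated with a stylized simulation. The sketch below is not based on the MIDUS data and does not reproduce any estimate in this paper: it assumes two hypothetical types of twin pairs with made-up returns to schooling (1.0 and 0.2) and made-up probabilities of a within-pair schooling difference (0.5 and 0.1), and no confounding. Because only pairs with a schooling difference contribute to the within-pair estimate, the estimate is pulled towards the return of the type in which such differences are more common.

```python
import random


def simulate_heterogeneous_returns(n_pairs=20000, seed=1):
    """Return (within-pair estimate, average return across all pairs)."""
    rng = random.Random(seed)
    num = 0.0  # running sum of delta_s * delta_h
    den = 0.0  # running sum of delta_s ** 2
    total_return = 0.0
    for _ in range(n_pairs):
        low_background = rng.random() < 0.5       # hypothetical 50/50 split
        beta = 1.0 if low_background else 0.2     # assumed pair-specific return
        p_diff = 0.5 if low_background else 0.1   # assumed chance of a difference
        delta_s = rng.choice([-1, 1]) if rng.random() < p_diff else 0
        delta_h = beta * delta_s + rng.gauss(0, 1)
        num += delta_s * delta_h
        den += delta_s ** 2
        total_return += beta
    return num / den, total_return / n_pairs


if __name__ == "__main__":
    within, average = simulate_heterogeneous_returns()
    print(f"within-pair estimate: {within:.3f}, average return: {average:.3f}")
```

Under these assumed values, the within-pair estimate should lie well above the average return of roughly 0.6, since five out of six pairs with a schooling difference are of the high-return type.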

Another reason why the fixed effect estimates exceed the corresponding cross-sectional OLS estimates could be that the influence of outliers becomes more pronounced in the fixed effects model. Such results were presented by Amin (2011), who showed that previous twin estimates in the literature are often extremely sensitive to just a few outliers. To check for this possibility, I plotted absolute differences in schooling between twins against absolute differences in self-reported health. The scatter plot is shown in Fig. 1, revealing one clear outlier, where there was a 10-year difference in schooling between two twins but no difference in self-reported health. I therefore reran the regression on self-reported health, this time, excluding this outlier. The results for both educational categories and years of schooling were virtually unchanged, however, both in terms of point estimates and significance (results available on request).
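The outlier check can be sketched in a few lines. This is an illustrative example with hypothetical data and function names, not the code behind Fig. 1: it computes the absolute within-pair differences that are plotted against each other and then drops pairs with a very large schooling gap before re-estimation. The cutoff of 10 years is an assumption chosen to mirror the single outlier described above.

```python
def abs_pair_differences(pairs):
    """Absolute within-pair differences in (schooling, health)."""
    return [(abs(s1 - s2), abs(h1 - h2)) for (s1, h1), (s2, h2) in pairs]


def drop_large_schooling_gaps(pairs, max_gap):
    """Keep only pairs whose schooling gap is strictly below max_gap years."""
    return [p for p in pairs if abs(p[0][0] - p[1][0]) < max_gap]


# Hypothetical ((schooling, health), (schooling, health)) pairs.
example_pairs = [((12, 7), (12, 8)), ((16, 9), (12, 7)), ((20, 8), (10, 8))]
print(abs_pair_differences(example_pairs))                 # [(0, 1), (4, 2), (10, 0)]
print(len(drop_large_schooling_gaps(example_pairs, 10)))   # 2
```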

Fig. 1 Check for outliers

4.2 Chronic conditions

Next, I consider the association between schooling and the number of chronic conditions. Columns 4–6 in Table 2 show the results for the three samples.

In the MIDUS main sample, schooling shows a strong and negative association with the number of chronic conditions, the association being strongest for the highest education category. In the latter case, having at least a college degree is associated with a decrease in the number of chronic conditions by 1.2 compared to being a high school dropout.

In the pooled twin sample, schooling again shows a significant and negative association with the number of chronic conditions. The magnitude of the associations is greater than the corresponding ones in the main sample, with the two highest education categories now being associated with a decrease in the number of conditions by 1.9 and 1.8, respectively.

The twin-differences estimates tell a similar story. The significant associations between schooling and the number of conditions remain for all education categories, except for the highest one, where the point estimate is still negative, however. It should be noted that there are no significant differences in the point estimates of the three dummy variables indicating schooling levels.

For years of schooling, there is a significant and negative correlation, −0.125, with the number of chronic conditions in the MIDUS main sample. In the pooled twin sample, the correlation is somewhat smaller in magnitude and significant at the 10 % level. In contrast, the twin FE estimate no longer suggests any significant relationship between schooling and the number of chronic conditions.Footnote 14

4.3 Smoking, physical activity, and overweight

In order to examine the potential mechanisms through which schooling affects health, I will next investigate the association between schooling and various lifestyle factors. I will focus on smoking and overweight since these are two of the main causes of preventable deaths in the USA. In addition, I will consider the association between schooling and physical activity. In order to preserve space, I will only compare, from now on, the results from the pooled twin sample with the twin FE estimates.

Smoking   Starting with the pooled twin sample, data in the first column in Table 3 show a strong association between schooling and smoking that increases with the level of schooling. The regressions on smoking control for smoking at age 16, in addition to self-reported health at age 16. In contrast to the OLS results, the twin FE estimates for smoking are 4–20 times smaller in magnitude and are insignificant in all cases. It seems unlikely that measurement errors alone would generate these nonsignificant findings. The results are consistent with the hypothesis that unobserved factors, such as genetic traits and family endowments, are driving the results for the pooled twin sample.

Table 3 Regressions on smoking, exercise, and BMI

It is not obvious why education would be expected to have any causal effect on smoking, given the widespread knowledge about smoking risks (e.g. Lundborg 2007). On the other hand, education may increase the wage rate, which in turn increases the opportunity cost associated with a shorter life. Moreover, it should be noted that both Grimard and Parent (2007) and de Walque (2007) found that education mattered for smoking initiation when using an IV strategy. As instruments, they used indicators of the risk of induction during the Vietnam War and exploited the fact that college attendance could serve as a draft avoidance strategy. The resulting IV estimates will thus be rather specific LATEs, reflecting the effect only on those who actually changed their college decisions in response to variations in the risk of induction during the Vietnam War. If my twin estimates come closer to an ATE for the twin population, this may partly explain the different results.

Using years of schooling instead, a significant and negative association between schooling and smoking is again obtained in the pooled twin sample. The twin FE point estimate is only one third in magnitude, however, and, again, not significant. In sum, I obtain no evidence for a causal effect of schooling on smoking.

The results for smoking are similar to those obtained in previous studies, suggesting that unobserved endowments are mainly responsible for the association between education and smoking. Behrman et al. (2006), for instance, investigate the effect of years of schooling in China on the binary smoking decision and find that the strong cross-sectional association disappears when imposing twin-fixed effects. Amin et al. (2010) exploit both years of schooling and educational categories but find no significant effects in their sample of UK twins.

Physical activity and overweight   Next, I investigate the association between schooling and physical activity and body mass index. Since MIDUS contains several measures of physical activity, I opt for the one that is most likely to reflect deliberate attempts to be physically active, i.e., vigorous physical activity during the winter.

In columns 3 and 4 in Table 3, I show the associations obtained for the pooled twin sample and the results from the twin FE estimation. In the pooled twin sample, having some college or having a college degree is associated with about two more occasions of physical activity per month compared to the reference category, whereas having graduated high school shows no significant effect. The results get even stronger when employing the twin FE estimator. Now, having some college or having a college degree are associated with an increase in the number of occasions of physical activity per month by more than three.Footnote 15 This suggests that there may be some unobserved endowment that is negatively related to schooling but positively related to physical exercise. This endowment is then taken out in the twin design, which will yield greater estimates. An example would be if there is a negative relation between the level of education and preferences for being active in sports or other physical exercise.
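The mechanism described here, where an endowment shared within a twin pair shifts the pooled estimate but drops out of the within-pair differences, can be illustrated with a small simulation. The sketch below uses simulated data and purely hypothetical parameter values, not MIDUS; the function name `simulate_twin_estimates` and all coefficients are assumptions made for illustration only.

```python
import numpy as np


def simulate_twin_estimates(endowment_to_schooling, n_pairs=20000,
                            true_effect=0.5, seed=0):
    """Return (pooled OLS slope, within-pair slope) on simulated twin data.

    All parameter values are hypothetical; nothing here is estimated from MIDUS.
    """
    rng = np.random.default_rng(seed)
    # Family endowment shared by both twins in a pair (unobserved by the analyst).
    endowment = rng.normal(size=(n_pairs, 1))
    # Schooling depends on the shared endowment plus twin-specific variation.
    schooling = endowment_to_schooling * endowment + rng.normal(size=(n_pairs, 2))
    # The outcome (e.g. exercise) rises with schooling and with the endowment.
    outcome = true_effect * schooling + endowment + rng.normal(size=(n_pairs, 2))

    # Pooled OLS slope: treat all twins as unrelated individuals.
    x, y = schooling.ravel(), outcome.ravel()
    pooled = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

    # Twin differences: the shared endowment cancels within each pair.
    dx = schooling[:, 0] - schooling[:, 1]
    dy = outcome[:, 0] - outcome[:, 1]
    within = (dx @ dy) / (dx @ dx)
    return pooled, within


# Endowment negatively related to schooling, positively related to the outcome.
pooled_neg, within_neg = simulate_twin_estimates(-0.5)
# Endowment positively related to both schooling and the outcome.
pooled_pos, within_pos = simulate_twin_estimates(0.5)
print(f"negative link: pooled {pooled_neg:.2f}, within-pair {within_neg:.2f}")
print(f"positive link: pooled {pooled_pos:.2f}, within-pair {within_pos:.2f}")
```

When the simulated endowment is negatively related to schooling, the pooled slope falls below the within-pair slope, which is the pattern discussed for physical activity. Reversing the sign gives the opposite ordering, where the pooled association overstates the within-pair estimate.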

The results for physical exercise are somewhat difficult to compare to those obtained in previous twin studies since measurement differs to a great extent. Amin et al. (2010), for instance, used a dummy variable indicating moderate or heavy exercise during leisure time and found small and insignificant effects of education. Behrman et al. (2006) used a measure indicating monthly physical exercise participation, finding no significant effect of education.

In order to investigate to what extent the higher physical activity of educated individuals also translates into lower body mass, I next examine the direct association between schooling and body mass index. Columns 5 and 6 in Table 3 show the results for the pooled twin sample and the results from the twin FE estimator. In the pooled twin sample, schooling shows a strong, significant, and negative association with BMI for all educational categories. Belonging to the highest educational category is associated with a three-point decrease in BMI compared to the omitted reference category. The significance of these associations is completely swept away in the twin FE estimates. The point estimates of schooling are now, in most cases, only a tiny fraction of those obtained from the pooled twin sample and are no longer significant.

Columns 3–6 in Table 3 also show the corresponding results for years of schooling. In the pooled twin sample, years of schooling show a significant and positive association with physical activity and a negative and significant association with body mass index. These associations are no longer significant when employing the twin FE estimator.

The results for BMI and overweight are consistent with the results in Behrman et al. (2006) and Amin et al. (2010). Webbink et al. (2010), however, find a significant and negative association between years of schooling and the probability of being overweight among men but not among women, when applying the twin design. To check if there are such gender-specific patterns in my data, I therefore reran the regressions on BMI and overweight by gender, both using years of schooling and educational categories. The estimates were small and insignificant in all cases, however (results available on request). It could also be noted that my results for body size are in line with the results from some recent IV studies, which find little or no evidence that education causally reduces overweight or body mass index (Arendt 2005; Kenkel 2006).

Summing up, the results in this section suggest that part of the effect of education on self-reported health may run through physical exercise. Moreover, in such a case, the beneficial effect must come through other mechanisms than through reduced BMI since I found no effect of education on BMI or overweight. One way to check this would be to add physical exercise to the equation for self-reported health and examine whether the effect of education is reduced substantially. If so, it may suggest that the effect of education on health runs through physical exercise. The results did not suggest so, however, as the coefficients of the education dummies were only marginally reduced.Footnote 16 Thus, the effect on exercise is not large enough to account for the effect of education on self-reported health.

5 Sensitivity analysis

Differences in parental treatment and circumstances within twin pairs   I will start my sensitivity analysis by considering how the results are affected by allowing twins that were treated very differently by their parents and faced very different circumstances to be included in my sample. A key assumption in the twin-based literature is that twins faced very similar conditions and that any differences in terms of schooling are exogenously determined. Results in Lundborg (2010), however, suggest that MIDUS twins who reported that they were treated very differently or faced very different environments also more often ended up with different levels of schooling. This concerned, for instance, whether the twins went to the same classroom, if they had the same friends, if parents emphasized the differences between them, if they were dressed similarly, etc.

One interpretation of such differences in treatment and circumstances is that they result from large differences in factors such as ability and health within twin pairs. Including such twin pairs in the analysis may therefore bias the results since the differences in schooling in such twin pairs may be endogenous. I therefore reran my regressions of the effect of schooling on self-reported health, chronic conditions, and exercise, excluding twin pairs that reported large differences along the lines discussed above. Thus, I removed twin pairs where at least one twin reported that they never went to the same classroom, never dressed the same, never had the same friends, and were always treated differently by their parents. Table 4 first shows descriptive statistics on these variables. As revealed by the table, most twins reported being rather similar across these dimensions, but this does not exclude the possibility that the previous results were driven mainly by the small fraction of twins who were rather dissimilar.

Table 4 Differences within twin pairs in early life

In Table 5, I show the results for self-reported health, when excluding dissimilar twins. The estimated coefficients of the education dummies are now similar in magnitude but are somewhat less precisely estimated. For chronic conditions, however, the schooling dummies are now much smaller in magnitude and are no longer significant, although still having negative signs. In the model of exercise behavior, the coefficients measuring completion of high school and having some college are very similar in magnitude but are much less precisely estimated due to the smaller sample size. In sum, it seems that at least for self-reported health and exercise behavior, the results are robust to only including twins that reported facing very similar conditions early in life along a number of dimensions.

Table 5 Sensitivity analysis

Attitudes towards the future   As discussed in the introduction, Fuchs (1982) postulated that education and health are related only through common time preferences. If this hypothesis were true, one would expect the inclusion of variables indicating time preferences in the regression to render the effect of education on health insignificant. Next, I therefore reran my regressions, this time including variables that proxy for unobserved time preferences. For this purpose, MIDUS contains three different measures of the individual’s attitude towards the future. The measures are based on responses regarding the degree to which the respondent agrees to three statements: (1) “I live life one day at a time and don’t really think about the future,” (2) “I like to make plans for the future,” and (3) “I find it helpful to set goals for the near future.”Footnote 17 Respondents who agree that they “make plans for the future” or “set goals for the near future” could be considered to be more future-oriented, whereas people who agree that they “live life one day at a time” could be considered to be more present-oriented. These measures were used as proxies for time preferences in Knowles and Postlewaite (2005), for instance, where they were found to predict savings behavior in the expected direction.

Table 6 shows the results for self-reported health, chronic conditions, and exercise behavior with and without control for attitudes towards the future. I focus on these three outcomes since they were found to be significantly associated with education, as shown in Tables 2 and 3. In the regressions, I use the answer to the question regarding to what extent the respondent agrees with the statement that he/she lives life one day at a time since the results using the other measures were similar.Footnote 18 It is also important to note that the sample sizes are reduced due to missing data in the attitude questions. For assessing the influence of my proxy for time preferences, the relevant comparison is thus the results with and without controls in these reduced samples.

Table 6 Sensitivity analysis

Starting with self-reported health, the effect of education on health is significant and similar in magnitude in both the specifications excluding and including my measure of time preferences. The point estimates also remain very stable when analyzing chronic conditions and exercise behavior and accounting for differences in time preferences. No support is thus found for the Fuchs hypothesis that unobserved time preferences explain the correlation between education and health. The possibility of course remains that my proxies for time preferences are poor, but in that respect, it should be noted that two out of the three measures of time preferences are significantly related to the health outcomes in the expected direction.Footnote 19

Parent–child interactions   Finally, I consider how differences in parental treatment within twin pairs affect my results. Lundborg (2010) showed that variables measuring time and attention given by the mother and the father were negatively and significantly related to schooling within twin pairs. Since it is difficult to believe that more parental time and attention will result in less schooling, one interpretation of these results is that parents try to compensate for differences in ability or other traits between the twins by giving the weaker twin more time and attention. If such compensating behavior is present, and unobserved, it would imply that twin-based estimates of the associations between education and health (and wages) may be downward-biased. It thus remains to settle how my estimates are affected by accounting for such differences in treatment. Table 7 shows the results when I rerun the regressions with and without controls for parental treatment.Footnote 20 It should also be noted that the sample sizes are again reduced due to missing observations in the parental treatment variables. I therefore again start with the results for these reduced samples, without controlling for parental treatment. As shown in Table 7, the results for self-reported health are still significant and positive and are rather similar in magnitude to the results shown in Table 2. The results then do not change to any important extent when accounting for parental treatment, as shown in columns 4–6. For chronic conditions, the results are insignificant in both specifications but the coefficients are more or less the same. Finally, for exercise behavior, the coefficients remain essentially constant when accounting for parental treatment. In sum, parental treatment in terms of time investments, which has been shown to affect schooling within twin pairs, does not seem to be an important confounder in assessing the relationship between schooling and health.
The fixed effects regressions on binary indicators of self-reported health and the regressions on the probability of observing a difference in education are presented in Tables 8 and 9, respectively.

Table 7 Sensitivity analysis

Robustness checks   As mentioned in the methods, the importance of measurement errors in years of schooling may be exaggerated by differencing and even more so when differencing between identical twins (Griliches 1979). The extent of such downward bias may be calculated, however, in the case where one has a measure of the reliability of self-reported schooling and a measure of the correlation in schooling within twin pairs. Previous research suggests that the reliability of self-reported schooling is about 90 %, a figure that has been remarkably stable across studies (Card 1999). Moreover, the correlation in schooling within identical twin pairs is commonly found to be about 0.75 (see, e.g., Ashenfelter and Rouse 1998). Taking these estimates together, an attenuation bias of about 30 % is typically obtained.

To obtain an estimate of the reliability ratio, previous studies have exploited data where several measures of the schooling of the respondent are given. Often, this has been a measure given by the co-twin (see, e.g., Ashenfelter and Rouse 1998). While I do not have access to such a measure, I do have a second measure of the respondent’s schooling at the follow-up survey in 2004. The correlation between these measures would suggest a reliability ratio of 0.90, being very much in line with previous estimates.Footnote 21, Footnote 22 The estimated correlation in years of schooling within twin pairs in MIDUS is 0.72, which is also rather similar to the figures obtained in previous twin studies, such as those of Ashenfelter and Rouse (1998) and Amin et al. (2010). Taken together with the estimated reliability ratio, this indicates that the twin FE estimator is biased downward by about 36 %. Assuming reliability ratios of 0.85 or 0.95 instead, the downward bias would be 53 and 18 %, respectively. This suggests that the reason why my estimates for years of schooling are not significant may have something to do with measurement error problems.
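These bias figures are consistent with the standard attenuation result for within-pair estimators under classical measurement error, in which the estimate is scaled by 1 − (1 − r)/(1 − ρ), where r is the reliability ratio and ρ is the within-pair correlation in reported schooling. The text does not spell out the formula, so treating it as the calculation used here is an assumption, as is the premise that measurement errors are uncorrelated between twins. The function name below is hypothetical.

```python
def twin_fe_attenuation(reliability, within_pair_corr):
    """Proportional downward bias of the twin-differences estimate.

    Assumes classical measurement error in schooling that is uncorrelated
    between the two twins, so that plim(b_FE) = b * (1 - bias) with
    bias = (1 - reliability) / (1 - within_pair_corr).
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    if not -1.0 < within_pair_corr < 1.0:
        raise ValueError("within_pair_corr must lie in (-1, 1)")
    return (1.0 - reliability) / (1.0 - within_pair_corr)


# Within-pair correlation in years of schooling reported for MIDUS twins.
rho = 0.72
for r in (0.85, 0.90, 0.95):
    bias = twin_fe_attenuation(r, rho)
    print(f"reliability {r:.2f}: downward bias of about {bias:.1%}")
```

With the rounded correlation of 0.72, the sketch gives a bias of roughly 36 % and 18 % for reliabilities of 0.90 and 0.95, and just under 54 % for 0.85, close to the 53 % reported; the small gap presumably reflects rounding of the inputs.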

Griliches (1979) showed that the downward bias in the estimated returns to education becomes more severe the greater the correlation in education is between the twins. Since this correlation is stronger for monozygotic (MZ) twins than for dizygotic (DZ) twins, it becomes problematic to compare the effects of education on health for these two groups, as done by Fujiwara and Kawachi (2009), for instance. The reason is that any difference could simply reflect the more severe downward bias in the results for the MZ twins compared to the results for the DZ twins. One therefore cannot interpret any smaller or nonexistent effects among the MZ twins in comparison to the DZ twins as evidence that genetic factors were responsible for the significant effect obtained among the DZ twins, as done by Fujiwara and Kawachi (2009). Therefore, I do not perform any analyses using the sample of DZ twins.

For the dummy variables indicating schooling categories, measurement errors are nonclassical. The reason is that individuals in the lowest category cannot underreport the education level, whereas individuals in the highest category cannot overreport (Aigner 1973). With nonclassical measurement error, one cannot generally sign the bias in the estimates. It should be noted that degrees are, in general, much more accurately reported than years of schooling though (Kane et al. 1999).

One source of measurement error will be respondents who are students at the time of the interview, as argued in Section 3. I therefore reran my analysis on self-reported health, this time excluding the seven twins who were part-time students at the moment of the interview. This hardly changed the point estimates at all, and they were still significant (results available on request).

6 Conclusion and discussion

In this paper, I show that relative to high school dropouts, people with greater schooling are significantly healthier, as measured through self-reported health and chronic conditions, and perform exercise more often. Beyond completing high school, however, additional schooling does not generate any additional health gains. These results were based on a twin-differences design, netting out the influence of genetics and family endowments. When measuring schooling through years of schooling, point estimates were similar in both the twin-differences model and the OLS model for self-reported health, but insignificant in the former case.

While I found schooling to be significantly associated with physical exercise, no corresponding effect was obtained for two of the most common causes of preventable deaths in the USA: smoking and body weight. This would suggest that the association between education and these latter two factors may arise mainly through the influence of unobserved family endowments. In addition, this would suggest that the effect of education on health does not mainly arise through body size and smoking. Instead, other factors, such as better access to the healthcare system, better access to health information, and better adherence to health treatments, may explain the positive effect of education. It should be noted, though, that there is rather mixed evidence to date regarding the causal effect of education on smoking and body size, and more research is clearly needed.

My twin-differences estimates would still be biased if there are twin-specific unobserved factors that relate to both schooling and health. I therefore exploited the rich and unique MIDUS data on the early life conditions of twins and showed that my results were relatively robust to accounting for some of the factors that have been shown in a previous study to predict schooling differences in twins, such as parental treatment. For chronic conditions, however, the results were weakened when restricting the sample to twins with very similar early life conditions. On the other hand, I also showed that the effect of schooling on self-reported health, chronic conditions, and exercise behaviors did not change much when I accounted for differences in attitudes towards the future within twin pairs. Thus, I found no support for the famous Fuchs (1982) hypothesis.

The finding that stands out most clearly in the paper is the effect of education on self-reported health. This result survived all sensitivity checks, and the fixed effects results even exceeded the cross-sectional OLS estimates. This raises the question of why the strongest result is found for self-reported health. First of all, note that self-reported health is the broadest outcome measure of the ones used in this paper. Yet, there is consensus that the measure is informative, in the sense that it is a strong predictor of later-life mortality and morbidity (Idler and Benyamini 1997). This suggests that the other health measures, such as chronic conditions and BMI, to a lesser extent, capture dimensions of health affected by education. Also, certain measures reflected health inputs rather than health outcomes, such as smoking. This suggests that smoking as a health input is not affected by education in a causal sense. On the other hand, the finding that education affected exercise behavior in a positive direction suggests that part of the effect of education runs through the choice of health inputs rather than only through more efficient use of given health inputs. Yet, the effect of education on self-reported health was only marginally affected when controlling for exercise behavior. There may of course exist other important health inputs that are not covered in the data available in MIDUS and that would have explained why education has such a strong effect on self-reported health.

The subjective nature of self-reported health may also play a role in explaining the strong effect of education on self-reported health. Note that among the various outcome measures used, self-reported health is the most subjective one. If twins tend to compare their self-reported health to each other and have similar views of what self-reported health means, this will make the subjective nature of self-reported health less problematic compared to the case when self-reported health is compared between unrelated individuals.Footnote 23 This will also serve, paradoxically, to reduce the influence of measurement errors involved in the measure of self-reported health. It seems plausible that this is also one of the explanations why the twin-fixed effects estimates for self-reported health come out as greater in magnitude than the corresponding cross-sectional OLS estimates.

To conclude, my findings provide some evidence consistent with the idea that schooling has a causal effect on health. This is in line with the results from a number of recent studies, using alternative research designs, such as IV, and examining various health outcomes, such as mortality, child health, and self-assessed health (Currie and Moretti 2003; Lleras-Muney 2005; Oreopoulos 2006). However, my findings at the same time differ from some of the results obtained in recent twin studies. In Fujiwara and Kawachi (2009), for instance, no significant effect of education on self-reported health was obtained. They only considered years of schooling, however, whereas my results suggest that there may be important nonlinearities in the relation between education and health. My results also differ from those of Behrman et al. (2011), for instance, where no significant effect of schooling on mortality or hospitalizations was obtained in a Danish context. The difference in results may, however, reflect differences in the institutional context between Denmark and the USA and in the outcome variables studied. For instance, in Denmark, there is universal health insurance coverage, which is not the case in the USA.Footnote 24 Also, both mortality and morbidity constitute hard end points and may not capture the same elements of health as those captured in other health measures, such as self-reported health. Finally, I found no significant effects of education on BMI or overweight, whereas Webbink et al. (2010) found a significant and negative effect among males.

The recent findings of a possible causal effect of education on health are very relevant for the current policy debate about the future of the health care systems. If schooling has a causal effect on health, policies that strengthen the incentives to obtain a higher education may have beneficial effects for both the productivity of nations and for population health. My results suggest that policies that encourage high school completion may have particularly beneficial effects. Such policies may include targeted interventions towards students lagging behind and a more generous student loan program in order to lessen the financial burden on lower- and middle-income families.