Introduction

Marriage rates have been declining in most developed countries (UN Demographic Yearbook 2006) and have hit record lows in the US and the UK while cohabitation is on the rise (Copen et al. 2013; Rogers 2011). This has been troubling in light of the extensive literature that has suggested that marriage confers benefits to men and women in the form of mental and physical health and longevity (Wilson and Oswald 2005; Wood et al. 2007 provided reviews of the “marriage health premium” literature).Footnote 1 In fact, some empirical evidence has suggested that the effect of marriage on longevity is greater than the effect of income, and for men being married even offsets the negative consequences of smoking (Gardner and Oswald 2004). Therefore, the causal effect of relationship status on health is an increasingly relevant health policy question as changing social norms as well as social and economic instability delay or impair family formation (Ekert-Jaffe and Solaz 2001; Gutierrez-Domenech 2008). If marriage per se is the cause of better health, then these negative trends portend ripple effects for public health, health care spending and health-related labor market productivity.

We have extended research that has attempted to identify the causal link between relationships and health in several ways. First, we have defined the effect of relationship status more broadly to include cohabitation. Although cohabiting partners are co-resident like married couples, the extent to which one cohabiting partner is willing to invest in the health of the other may be weakened by the absence of a legal commitment. Second, we applied the Blundell and Bond (1998) GMM systems estimator (hereafter referred to as BB) for dynamic panel data models. The BB estimator allowed us to control for both the endogeneity of relationship status and the dynamics of health. Third, we generated a continuous health index that aggregated a wide array of health indicators, including self-reported health, reported health problems, disability status and mental health into one continuous variable.Footnote 2 This was particularly important because the BB estimator required a continuous dependent variable. Finally, we used 18 waves of the British Household Panel Survey (BHPS) from 1991 to 2009. Beyond this long time frame and ample data on health and relationships, the presence of the National Health Service in the UK avoided confounding inferences on the effect of relationships on health with access to health care through a spouse and/or employment, as has been the case with studies that have used US data.Footnote 3

The paper proceeds as follows. The next section reviews the theoretical background and empirical literature on the effect of relationships on health. Then we detail our empirical specification, describe the data, present our results, and conclude with implications for policy and future research.

Theoretical Background and Literature Review

Theoretically, there have been two competing, but not mutually exclusive, effects of relationship status on health that have been consistent with a positive correlation between marriage and health. The “marriage market” hypothesis has posited that healthy people are selected into marriage because they make better marriage partners. According to this argument, an observed correlation between marital status and health is not causal (i.e., not a function of marriage per se) but rather a function of the fact that marriage partners are chosen because they are healthier. In contrast, the “marriage protection” hypothesis has suggested that married people are healthier because they have a spouse who can monitor their health behaviors, care for them when they are ill, and discourage them from engaging in risky behaviors such as smoking and drinking (Ali and Ajilore 2011; Duncan et al. 2006; Lee et al. 2005; Thompson 1994; Waite 1995; Waite and Gallagher 2005). In addition, a spouse can also provide emotional support and act as a buffer during adverse life events such as job loss and illness (Averett et al. 2012; Wood et al. 2007). According to the marriage protection hypothesis, married people are healthier because they are married, and it is precisely this causal effect that we were interested in estimating.

However, the existence of the protective effect introduces the possibility of adverse selection: those in poor health have an incentive to marry. In other words, those who are most likely to benefit from marriage in terms of better health are most likely to marry and least likely to exit the marriage, i.e., they are most likely to “purchase” the marriage protective effect (Cheung 1998; Cheung and Sloggett 1998; Lillard and Panis 1996). Finally, there may be unobservable traits and behaviors that affect both health and relationship status. For example, those who are more patient may be more likely to get and stay married and more likely to start and stay on a diet, resulting in a spurious correlation between their health and marriage.

A growing empirical literature has documented a positive effect of marriage on health as noted in the introduction, although not all studies find such an effect. Much of the research in this area has been conducted using US data and has focused predominantly on marriage.Footnote 4 The US tax code and the current US health care system alter the incentives to marry (Alm and Whittington 1999) and the likelihood of obtaining health insurance (Meyer and Pavalko 1996; Murasko 2008; Zimmer 2007).Footnote 5 Thus, estimates of the effect of marriage on health using US data have been specific to this institutional setting. Because of the public provision of health care in the UK, marriage does not alter the likelihood of obtaining health insurance. As a result, in UK data there is no institutionally driven correlation between marriage and health occurring because of differential access to health care. Despite this, there has been surprisingly little work on the effect of marriage on health in the UK. Early work using UK data by Cheung and Sloggett (1998) and Cheung (1998) did not control for unobservables, and more recent work by Cheung (2000) and Wilson and Oswald (2005) focused on mortality rather than health.

Given that increasing numbers of men and women are cohabiting, it is important to understand its evolving effects on health. There has been discussion that in the US the boundaries between marriage and cohabitation have been blurring and that the experiences of marriage and cohabitation may be converging (Cherlin 2004). In the UK cohabitation has typically been short lived, and cohabiters have been less likely to have children compared to married couples (Ermisch and Francesconi 2000; Haskey 2001; Seltzer 2004; Smallwood and Wilson 2007).

Only a handful of authors have empirically investigated the effect of cohabitation on health. Using a cross section of Canadian data, Wu et al. (2003) found that cohabitation may be as beneficial for health as marriage; however with only 1 year of data they could not control for health dynamics or unobservable factors associated with health and relationship status. Also using Canadian data, but with longitudinal data and an individual fixed effects estimator to net out the selection effect, Averett et al. (2012) found that cohabitation is generally better for health than being never married, but it is usually not as beneficial as marriage. Musick and Bumpass (2011), using US data, found that cohabitation is not always distinguishable from marriage in terms of its benefits on wellbeing. Finally, in a reversal of the question but using the same British data we used, Pevalin and Ermisch (2004) examined the effect of mental health on cohabitation and found that poor mental health increased the probability of exiting a cohabiting relationship.

Much of the literature that has estimated the causal effect of relationship status on health has relied on individual fixed-effects (FE) to net out time-invariant unobservable heterogeneity which may affect both health and relationship selection and thereby bias the results (e.g., Averett et al. 2008, 2012). Two notable exceptions have been Ali and Ajilore (2011) and Lillard and Panis (1996). Ali and Ajilore used propensity score matching to account for the selection into marriage on observable, but not unobservable, characteristics. They found that marriage reduced risky behaviors, specifically drinking and drug use for African Americans. Lillard and Panis estimated a system of simultaneous equations involving mortality, health, marriage formation, and marriage dissolution. This method, while appealing because they controlled for selection on observable and unobservable characteristics, hinged on finding instruments that determined marital status but which were unrelated to health. They found that married persons lived longer and that there was adverse selection based on health for men (sicker men remarried more quickly) but positive selection into marriage based on unobservables.

Both the FE models and the method used by Lillard and Panis (1996) required transitions into and out of relationships to identify the effects of relationship status on health. Problematically, in many data sets there is little variation in relationship status over time, and as a result parameter estimates of interest have been identified off of those relatively few who changed relationship status. The health experience of these relationship changers may not be generalizable to the majority of observations who stay in the same relationship. In our data that spans 18 years we observed changes in relationship status in only about 5 % of the sample person–years, and ~75 % of the individuals in the sample never changed their relationship status.

In addition, there has been theoretical justification and empirical support confirming that health is a dynamic process. Yet to our knowledge, the extant literature on relationships and health has failed to consider this. Health dynamics can arise from various theoretical sources including partial adjustment to health demand (Wagstaff 1993), state dependence in the health production function (Kohn and Patrick 2010) or from generally distributed lag effects associated with health shocks. Contoyannis et al. (2004) empirically showed the strong persistence of health in the BHPS. Thus, failing to include lagged health in a model of health as a function of relationship status risks omitted variable bias if, as hypothesized, lagged health is correlated with relationship choice.

Another complication in estimating the effect of relationship status on health has been the measure of health itself. There has been no consensus in the literature on how to measure health. Many studies have used self-assessed health (SAH) on a 5 point scale from excellent to poor. While SAH has been shown to correlate with the probability of death, mortality is just one dimension of health, and not necessarily the most relevant one when considering individual well-being, productivity or the demand for medical care (Contoyannis et al. 2004; Kohn 2012). Moreover, SAH is an ordered discrete variable that raises additional econometric difficulties associated with dynamic non-linear estimation in the presence of unobservable heterogeneity, particularly the incidental parameters problem (see Lancaster 2000 for a review, and Greene 2004 for recent empirical developments). Wood et al. (2007) in their review of the effects of marriage on physical health noted that changes in self-rated health may be a poor proxy for changes in physical health. This could happen if, for example, people tend to rate their health in the same category from year to year, even as their health declines with age as is often the case. The existence of this pattern has suggested that trends in self-rated health may not reflect more subtle changes in respondents’ underlying physical health status, and that a more refined health indicator might be necessary to understand the effects of relationships on physical health. Other studies have used a variety of health outcomes making it difficult to compare the results across studies (see Wilson and Oswald 2005; or Wood et al. 2007 for examples of physical health measures used in these studies).

Model Specification and Identification Assumptions

To examine the effect of relationship status on health, we estimated the following model:

$$ H_{it} = \beta_{H} H_{it-1} + \boldsymbol{\beta}_{R}^{'} \mathbf{R}_{it} + \boldsymbol{\beta}_{X}^{'} \mathbf{X}_{it} + \boldsymbol{\beta}_{W}^{'} \mathbf{W}_{t} + \eta_{it} $$
(1)

where i indexed individuals and t indexed time. H was a continuous measure of health discussed further in the next section. R was a vector of dummies for the relationship states of cohabiting, divorced/separated, never married or widowed with married as the omitted category. Following the “marriage market hypothesis,” we assumed that the observed relationship was pre-determined in part by prior health. X was a vector of additional socio-demographic variables assumed to be either predetermined (income, education, and pregnancy) or exogenous (accidents). W represented a set of dummy variables for the different waves (year fixed effects). The error term \( \eta_{it} \) can be divided into two parts, \( v_{i} \), the time-invariant individual unobservable factors that may have been correlated with both health and relationship status, and \( e_{it} \), the observation-specific errors, such that:

$$ \eta_{it} = v_{i} + e_{it} $$
(2)

and

$$ E(v_{i} ) = E(e_{it} ) = E(v_{i} e_{it} ) = 0 $$
(3)

The presence of health dynamics, reverse causality from health to relationship status, unobservable heterogeneity and the lack of variation in relationship status led us to the BB estimator.Footnote 6 While the BB systems GMM estimator has been applied most often to macro-economic questions (see for example, Saci et al. 2009) it has also been applied to micro-data to investigate questions regarding personal finance (Fry et al. 2008), R&D (Kumazawa and Gomis-Porqueras 2012) and individual health (Picone et al. 2004). This estimator employs a system of equations as illustrated below:

$$ \begin{gathered} \Delta H_{it} = \beta_{H} \Delta \hat{H}_{it-1} + \boldsymbol{\beta}_{R}^{'} \Delta \mathbf{R}_{it} + \boldsymbol{\beta}_{X}^{'} \Delta \mathbf{X}_{it} + \boldsymbol{\beta}_{W}^{'} \Delta \mathbf{W}_{t} + \Delta e_{it} \hfill \\ H_{it} = \beta_{H} \hat{H}_{it-1} + \boldsymbol{\beta}_{R}^{'} \mathbf{R}_{it} + \boldsymbol{\beta}_{X}^{'} \mathbf{X}_{it} + \boldsymbol{\beta}_{W}^{'} \mathbf{W}_{t} + v_{i} + e_{it} \hfill \\ \end{gathered} $$
(4)

The top equation is in first-differences which eliminates both time invariant heterogeneity as well as all individuals who do not change relationship status. The bottom equation is in levels which includes information from all of the individuals. The lagged dependent variable is endogenous (denoted with a hat “^”) and instrumented with prior lags in the difference equation and prior differences in the levels equation. In addition, the BB estimator also allows modeling relationship status and other covariates such as income and education as pre-determined rather than exogenous as in standard FE estimators. We relaxed the strict exogeneity assumption and instead assumed that:

$$ E\left( {e_{it} \left| {H_{i}^{t - s} ,R_{i}^{t} ,v_{i} } \right.} \right) = 0$$
(5)

This assumption allowed for feedback from lagged health to current relationship status.Footnote 7 The superscript notation t–s for lagged health reflected our finding, consistent with other empirical work (Contoyannis et al. 2004), that the errors in a dynamic health equation retained residual autocorrelation of lag s. The practical implication of this assumption was that the one-period difference in relationship status remained endogenous with the lag of health in the model and thereby was ineligible as an instrument. We have presented a discussion of our findings of autocorrelation and the implications for our lagged instruments in the results section.

Thus, the BB estimator has allowed us to incorporate the two sources of potential endogeneity between health and relationship status: unobservable heterogeneity with the fixed effect and health-related selection into relationships with the lagged health variable. There were no plausible external instruments for relationship status, but an important advantage of the BB estimator is that it used internal instruments using lags of both health and relationship status. In the top equation it was important to instrument for health because by construction the lagged dependent variable is correlated with the differenced idiosyncratic error term. This equation in differences was instrumented with lags of levels, which were assumed to be uncorrelated with the contemporaneous shock \( E\left[ {H_{it - s} \Delta e_{it} } \right] = 0 \). Importantly, lagged values of pre-determined variables will be correlated with the differenced error and thereby not be valid instruments. Moreover, in the differenced equation, those individuals who did not change relationship status from one period to the next did not contribute to the identification of the relationship coefficients. Thus, to bring additional information to aid identification, the BB added an equation in levels. In this equation, it was even more important to instrument for health because the unobservable individual heterogeneity remained. The instruments were lags of differences which eliminated the unobservable heterogeneity and thereby were also assumed to be uncorrelated with the composite error: \( E\left[ {\Delta H_{it - s} (v_{i} + e_{it} )} \right] = 0;\;E\left[ {\Delta R_{it - s} (v_{i} + e_{it} )} \right] = 0 \).Footnote 8 Importantly, Blundell and Bond (1998) demonstrated both theoretically and empirically that these differences are informative instruments even if the dynamic process is highly persistent, which in our case is true for both health and relationship status. The intuition here is that for strongly persistent processes such as health, health shocks such as a loss of a limb (or conversely quitting smoking) are what primarily account for the levels of health in the future.
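The logic of instrumenting the differenced equation with lagged levels can be illustrated with a small simulation. The following Python sketch is not the estimation code behind our results: it drops the relationship and control variables, uses a single lagged level as the instrument rather than the full GMM instrument set, and assumes serially uncorrelated shocks, so that the level at t-2 is a valid instrument (our own specifications begin at lag 3 because of residual autocorrelation). The function names and parameter values are hypothetical.

```python
import numpy as np

def simulate_panel(n_individuals, n_periods, beta, burn_in=50, seed=0):
    """Simulate H_it = beta * H_it-1 + v_i + e_it with i.i.d. standard normal shocks."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=n_individuals)       # time-invariant heterogeneity v_i
    h = v / (1.0 - beta)                     # start each person at their long-run mean
    waves = []
    for t in range(burn_in + n_periods):
        h = beta * h + v + rng.normal(size=n_individuals)
        if t >= burn_in:                     # keep only the periods after the burn-in
            waves.append(h)
    return np.column_stack(waves)            # shape: (n_individuals, n_periods)

def differenced_estimates(H):
    """Return (IV, OLS) estimates of beta from the first-differenced equation."""
    dH = np.diff(H, axis=1)                  # dH[:, k] = H[:, k+1] - H[:, k]
    y = dH[:, 1:].ravel()                    # differenced health at t
    x = dH[:, :-1].ravel()                   # differenced health at t-1 (endogenous)
    z = H[:, :-2].ravel()                    # level of health at t-2 (instrument)
    beta_iv = (z @ y) / (z @ x)              # simple just-identified IV estimator
    beta_ols = (x @ y) / (x @ x)             # OLS on differences, biased by construction
    return beta_iv, beta_ols

true_beta = 0.5                              # hypothetical persistence parameter
H = simulate_panel(n_individuals=20000, n_periods=8, beta=true_beta)
beta_iv, beta_ols_diff = differenced_estimates(H)
print(f"IV on differences:  {beta_iv:.3f}")
print(f"OLS on differences: {beta_ols_diff:.3f}")
```

Under these assumptions the instrumented estimate should lie close to the true persistence parameter, whereas OLS on the differenced equation is pulled strongly downward because the lagged difference contains the shock that also enters the differenced error.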

However, shocks do not necessarily become incorporated into the levels immediately. Rather, dynamic models tend to exhibit autocorrelation. Returning to the loss-of-limb example, this health shock may make individuals more susceptible to related health changes (e.g., potential infection, reduced exercise, and emotional strain) for the next year or more. These subsequent but related health changes would induce a correlation in the errors. The degree of autocorrelation suggests the lag length for appropriate instruments (indicated by the “s” subscript in the notation above): with AR1 correlation, instruments should start with lag t-2; with AR2 correlation, they should start with lag t-3; and so on. In order to determine the appropriate lag to begin instrumenting, we used the Arellano-Bond test for autocorrelation and began at the first lag at which the test failed to reject the null of no correlation. All of our specifications failed to reject the null at AR3, indicating AR2 correlation. For this reason we began considering instruments at lag 3 and required at least four consecutive observations to obtain the difference between t-3 and t-4.
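The lag-selection rule described above can be written as a short helper. This is a sketch of the decision rule only; the p-values passed to it below are made up for illustration and are not our test results, and the function name and the 5 % threshold are assumptions.

```python
def first_instrument_lag(ar_pvalues, alpha=0.05):
    """Return the first lag to use for instruments.

    ar_pvalues[k - 1] is the p-value of the AR(k) test on the residuals.
    The rule: start at the first order k for which the null of no
    autocorrelation is NOT rejected at level alpha.
    """
    for order, p_value in enumerate(ar_pvalues, start=1):
        if p_value >= alpha:
            return order
    raise ValueError("autocorrelation was rejected at every tested order")

# Hypothetical p-values: AR(1) and AR(2) rejected, AR(3) not rejected.
lag_with_ar2 = first_instrument_lag([0.001, 0.010, 0.400])
# Hypothetical p-values: only AR(1) rejected.
lag_with_ar1 = first_instrument_lag([0.001, 0.300])
print(lag_with_ar2, lag_with_ar1)
```

With the first set of illustrative p-values the rule starts instrumenting at lag 3, which matches the case described in the text; with the second set it starts at lag 2.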

As with all instrumental variables estimators, the validity of the BB estimates depended on the exogeneity of the instrument set. We tested this using Hansen and difference-in-Hansen tests, which unlike Sargan tests were robust to heteroskedasticity and autocorrelation. We reported four different tests for the exogeneity of our instruments. First, we reported an overall Hansen test, which tests the null that all of the instruments for both the difference and levels equations are exogenous. Since our research focus was on the coefficients for relationship status rather than lagged health, our primary concern was with the exogeneity of instruments for the levels equation. Therefore, we reported three additional difference-in-Hansen tests for the exogeneity of all of the instruments for the levels equation as a group and then separately for the health and relationship instruments.

In a long panel such as ours, there were many lags available for instruments. A potentially important complication was the “too many instruments” problem (Roodman 2009a). While adding instruments would bring more information to bear on the estimates, more instruments would also risk over-fitting the endogenous variables and also weaken the Hansen tests. We addressed this issue by limiting the lags in our instrument set and in some specifications collapsing the instrument matrix [see Roodman (2009a) for an explanation of the collapsing strategy to reduce the instrument count]. We reported both the number of instruments and the ratio of instruments to individuals in the panel. Monte Carlo studies of this issue used N = 100 and a lower bound of instruments at 5 for 5 % of the panel (see Roodman 2009a and references therein). Both our panel size and instrument counts were considerably larger, but the ratio of instruments to individuals was much lower (0.85–3.17 %).
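To see why limiting lags and collapsing the instrument matrix reduce the instrument count, consider the following Python sketch. It uses a deliberately simplified counting rule that is our assumption for illustration: one instrumented variable, one equation, and one instrument per valid (period, lag) pair, or one per lag distance when collapsed. Counts reported by estimation software will differ because they also include the levels equation, the other instrumented variables and the exogenous regressors.

```python
def instrument_count(n_periods, min_lag, max_lag, collapse=False):
    """Count instruments generated by one variable under a simplified rule.

    Uncollapsed: one instrument for every (period, lag) pair whose lagged
    observation exists. Collapsed: one instrument per lag distance.
    """
    lags_by_period = [
        [lag for lag in range(min_lag, max_lag + 1) if t - lag >= 0]
        for t in range(n_periods)
    ]
    if collapse:
        return len({lag for lags in lags_by_period for lag in lags})
    return sum(len(lags) for lags in lags_by_period)

def instrument_ratio_percent(n_instruments, n_individuals):
    """Instruments as a percentage of the number of individuals in the panel."""
    return 100.0 * n_instruments / n_individuals

# Hypothetical 18-period panel with instruments starting at lag 3.
all_lags = instrument_count(18, min_lag=3, max_lag=17)
limited_lags = instrument_count(18, min_lag=3, max_lag=5)
collapsed = instrument_count(18, min_lag=3, max_lag=5, collapse=True)
print(all_lags, limited_lags, collapsed)
```

In this toy setting, restricting the lag range cuts the count by roughly two-thirds, and collapsing reduces it to the number of lag distances used.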

Our primary focus was on the \( \beta_{R} \) coefficients on the matrix of relationship dummy variables. Marriage was the omitted category; thus, inferences were made relative to the married state. Therefore, the hypotheses associated with our research question were:

$$ H_{0}: \beta_{R} = 0;\; r \in \left\{ c,d,n,w \right\} $$
(6)

If the coefficients on the dummy variables for cohabitation, divorced, never married and widowed were individually zero, then this would indicate that these relationship states had the same effect on health as being married. If these hypotheses were rejected, then the signs on the estimated coefficients would suggest whether the different relationship states had a more positive (negative) effect on health compared to marriage.

The \( \beta_{R} \) coefficients were identified as causal effects based on the following identification assumption:

$$ E\left[ H_{it} \left| H_{it-1}, v_{i}, \mathbf{R}_{it}, \mathbf{X}_{it}, \mathbf{W}_{t} \right. \right] = E\left[ H_{it} \left| H_{it-1}, v_{i}, \mathbf{X}_{it}, \mathbf{W}_{t} \right. \right] $$
(7)

The key assumption was that conditioning on lagged health as well as unobservable heterogeneity and the other parameters in the model made the conditional mean of health independent of the choice of relationship status, R. In other words, conditional on the covariates and fixed effects, the assignment of relationship status was rendered “random” as in a natural experiment or randomized controlled trial. This allowed us to make causal inferences off a linear model where the observed mean was the sum of the conditional mean plus the coefficient on the relationship:

$$ H_{it} = E\left[ H_{it} \left| H_{it-1}, v_{i}, \mathbf{X}_{it}, \mathbf{W}_{t} \right. \right] + \beta_{R};\; r \in \left\{ c,d,n,w \right\} $$
(8)

Data Description

We used 18 waves of the BHPS from 1991 to 2009. The BHPS is an annual survey of adult (16+) members of households nationally representative of the UK.Footnote 9 The BHPS began with roughly 5,000 households (over 9,000 full adult individual interviews) in 1991 and added several subsamples over the study period: the United Kingdom European Community Household Panel from 1997 to 2001 (waves G through K); the Scotland and Wales Extension from 1999 onward (waves I through R); and the Northern Ireland Household Panel Survey from 2001 onward (waves K through R). Of the full sample of 227,391 person–year responses from 31,329 individuals, we used 219,210 observations from 30,903 individuals with complete health variable information to compute our health index. Our estimation sample of 185,485 observations from 18,342 individuals had complete information for all covariates plus at least four consecutive years of observations. Recall that a minimum of four consecutive observations was necessary for the lagged health instruments. This excluded 25,519 observations from 13,740 individuals of whom more than half appeared in the data for only one wave.Footnote 10

As noted above, the BB estimator required a continuous dependent variable. The BHPS had a rich array of health indicators; but all, including the most commonly used SAH, were discrete categorical variables. An additional complication was that the way individuals answered the SAH question may have compounded the endogeneity between relationships and this measure of health. For example, those who are never married may “justify” their relationship status by reporting better health similar to the justification bias associated with SAH and labor force participation found by Bound (1991). Our use of multiple correspondence analysis (MCA) to combine SAH with other health indicators into a single continuous health index had the added benefit of purging the reference bias and adaptation that have been known to plague the SAH variable (Contoyannis et al. 2004 and references therein; Groot 2000).Footnote 11 See Kohn (2012) for a detailed explanation of this methodology as applied to a health index and Appendix 1 for the health indicators, summary statistics and weights included in our index.
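For readers unfamiliar with MCA, the following Python sketch shows how a set of categorical indicators can be collapsed into one continuous score using the first dimension of a textbook correspondence analysis of the indicator matrix. It is a generic illustration, not the procedure or the weights used for our index (see Kohn 2012 and Appendix 1 for those); the variable names and the toy data are hypothetical, and the sign of the resulting score is arbitrary.

```python
import numpy as np
import pandas as pd

def mca_first_dimension(categorical_df):
    """Return (row scores, category weights) on the first MCA dimension."""
    # Indicator matrix: one 0/1 column per category of each variable.
    Z = pd.get_dummies(categorical_df.astype(str)).to_numpy(dtype=float)
    P = Z / Z.sum()                                  # correspondence matrix
    r = P.sum(axis=1)                                # row masses
    c = P.sum(axis=0)                                # column masses
    # Standardized residuals from the independence model r * c'.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    row_scores = U[:, 0] * sing[0] / np.sqrt(r)      # principal row coordinates
    category_weights = Vt[0, :] / np.sqrt(c)         # standard column coordinates
    return row_scores, category_weights

# Toy data with hypothetical health indicators.
toy = pd.DataFrame({
    "sah": ["poor", "poor", "fair", "good", "good", "excellent"],
    "limits_activity": ["yes", "yes", "yes", "no", "no", "no"],
    "gp_visits": ["many", "many", "few", "few", "none", "none"],
})
scores, weights = mca_first_dimension(toy)
print(np.round(scores, 3))
```

In this formulation each person's score equals the average of the weights attached to the categories that person reports, which is what makes the index continuous even though every input is categorical.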

In Tables 1 and 2 we have presented the summary statistics for our analysis separately by women and men and by relationship status as is customary in this literature.

Table 1 Summary statistics for women
Table 2 Summary statistics for men

These unadjusted means were consistent with expectations. The majority of the sample was married. A greater proportion of men than women were married, cohabitating, or had never been married but fewer were divorced and very few were widowed.Footnote 12 Consistent with the literature on SAH, men reported better health than women despite higher rates of accidents (Case and Paxson 2005). The two measures of health, the health index and SAH, exhibited the same pattern across the different relationship categories. Across these unadjusted means we saw that those cohabiting and never married had the highest values of health followed by married, divorced and widowed; however these means were unadjusted for age, and those cohabiting and never married were younger. The health index had a Pearson’s correlation coefficient with SAH (used as a continuous variable) of 0.7323.Footnote 13 The rate of pregnancy was nearly double for women who were cohabiting than for those who were married, but again, these proportions were unadjusted for age. As expected, divorced and widowed women and men had lower household income than those in relationships, but also lower than those who were never married indicating that the decline in income was not merely due to having a single income. Notably, cohabiting women and men had higher rates of college and professional education while never married women and men had the highest rates of high school/vocational education.

Finally, we included the number of activities that individuals reported being active in as a proxy for social networks. It was important to include these controls because such participation may offer another source of social support, or so-called “social capital” that can substitute for the social support from relationships with respect to health (Averett et al. 2013; Couzin 2009; Wu and Hart 2002). If individuals substitute social participation for relationships, then omitting the effect of social participation on health could attribute a greater negative effect of being in a relationship other than marriage or cohabitation. Notably, both women and men reported the lowest rates of participating in activities when in cohabiting relationships.

Given the importance of variation in relationships over time to identify FE models, we have presented the transition matrices for women and men with at least four consecutive observations over 18 years in Tables 3 and 4.

Table 3 Transition matrix of relationship status for women
Table 4 Transition matrix of relationship status for men

These tables showed the strong persistence in relationships from one period to another. Marriage was the most persistent relationship with 97.52 % of women and 98.30 % of men remaining married from one period to the next. The next most persistent was never married with 91.39 and 93.14 % persistence for women and men respectively. These transitions underscored that there has been little variation in relationship status over time in our sample. As is well known, FE models are identified off of only those who change relationship status. Since this is a very small group, the parameter estimates from the FE models are not likely to be generalizable.
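A transition matrix of this kind can be computed from long-format panel data as sketched below in Python. The column names (`pid`, `wave`, `status`) and the toy records are hypothetical and do not correspond to BHPS variable names; the sketch counts only transitions between consecutive waves.

```python
import pandas as pd

def transition_matrix(panel, id_col="pid", time_col="wave", state_col="status"):
    """Row-percentage matrix of moves from the status at t-1 to the status at t."""
    ordered = panel.sort_values([id_col, time_col])
    by_person = ordered.groupby(id_col)
    previous_status = by_person[state_col].shift(1)
    previous_wave = by_person[time_col].shift(1)
    # Keep only pairs of observations from consecutive waves (no gaps).
    consecutive = (ordered[time_col] - previous_wave) == 1
    pairs = pd.DataFrame({
        "from": previous_status[consecutive],
        "to": ordered.loc[consecutive, state_col],
    })
    return pd.crosstab(pairs["from"], pairs["to"], normalize="index") * 100

# Toy panel: person 3 has a gap between waves, so contributes no transition.
toy_panel = pd.DataFrame({
    "pid":    [1, 1, 1, 2, 2, 3, 3],
    "wave":   [1, 2, 3, 1, 2, 1, 3],
    "status": ["married", "married", "divorced",
               "cohabiting", "married",
               "never married", "never married"],
})
tm = transition_matrix(toy_panel)
print(tm)
```

Each row of the result sums to 100, so the diagonal entries give the persistence rates of the kind discussed above.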

Results

We have presented our results in Tables 5 (women) and 6 (men).Footnote 14 In columns 1 and 2 we have reported OLS and FE estimates which have been the standard in this literature. The results for both women and men were consistent with the finding that marriage is good for health—the coefficients on the relationship status variables were all negative and statistically significant. In Appendix 2, we have shown that this same story holds when we used SAH as the dependent variable with the notable exception of divorce which was no longer statistically significant. This could have been due to the justification bias associated with SAH—perhaps divorced women and men were more likely to state they were healthy to justify their divorced status. In both models using SAH and our continuous health index, the sign and significance of the other covariates was consistent with the findings of other researchers and in line with our expectations. For example, those individuals who had higher incomes and more education were found to be healthier. These results from the OLS and FE models which were consistent with much extant literature confirmed that our findings were not sensitive to our use of a continuous health index or the UK setting of our data.

Table 5 Primary regression results: women
Table 6 Primary regression results: men

We have also reported FE models with lagged health, although we recognized the potential for dynamic panel data bias (Nickell 1981).Footnote 15 As we expected, lagged health was strongly significant, and the inclusion of lagged health in these models reduced the magnitude of the effect of relationship status on health. However, the pattern of statistical significance was largely unchanged. We found a similar pattern when we used SAH as the dependent variable (see Appendix 2).
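The dynamic panel bias noted above can be illustrated with a small simulation. The Python sketch below generates a pure autoregressive health process with individual effects and applies the within (FE) estimator to it; the parameter values are hypothetical, and the model omits relationship status and all other covariates.

```python
import numpy as np

rng = np.random.default_rng(1)
n_individuals, n_periods, true_beta, burn_in = 5000, 6, 0.8, 100

# Simulate H_it = beta * H_it-1 + v_i + e_it and keep the last n_periods waves.
v = rng.normal(size=n_individuals)              # individual effects v_i
h = v / (1.0 - true_beta)                       # start at each person's long-run mean
waves = []
for t in range(burn_in + n_periods):
    h = true_beta * h + v + rng.normal(size=n_individuals)
    if t >= burn_in:
        waves.append(h)
H = np.column_stack(waves)                      # shape: (n_individuals, n_periods)

def within_estimate(H):
    """FE (within) estimate of the coefficient on lagged health."""
    y = H[:, 1:]                                # health at t
    x = H[:, :-1]                               # health at t-1
    y_dm = y - y.mean(axis=1, keepdims=True)    # remove individual means
    x_dm = x - x.mean(axis=1, keepdims=True)
    return (x_dm * y_dm).sum() / (x_dm ** 2).sum()

beta_fe = within_estimate(H)
print(f"true beta = {true_beta}, FE estimate = {beta_fe:.3f}")
```

With only a handful of periods per person, the FE estimate of the persistence parameter falls well below its true value even in a large sample, which is why an FE model with lagged health is reported for comparison rather than as a preferred specification.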

In column 4 of Tables 5 and 6 we have presented the results from the BB model. A substantial challenge in implementing the BB estimator was choosing the appropriate number of lags for instruments. We followed Roodman’s (2009a) prescription for best practices to balance the efficiency from more instruments with the problems associated with using too many instruments, particularly biased estimates and inflated Hansen tests. First, we reported the number of instruments in each model as well as the ratio of instruments to individuals in each panel. While our large N and T resulted in an instrument count that appears large (45 and 148 for women and men respectively), the ratio of instruments to individuals was 0.45 and 1.78 %, which, as we noted earlier, was well below the 5 % minimum ratio of instruments to panel observations in extant Monte Carlo studies in this field, which have focused on problems associated with instrument counts that reach 10 % of panel observations (Roodman 2009a). Second, as noted, we reported four tests for the exogeneity of our instruments: an overall Hansen test for all the instruments in both the difference and levels equations, and three difference-in-Hansen tests, one for the levels equation only and separate tests for the subsets of instruments for health and relationship status. Critically, all of our specifications had Hansen p-values for the levels equations that exceeded conventional levels of significance yet did not exhibit p-value inflation relative to specifications that we ran with higher instrument counts.Footnote 16

The first important result was that lagged health was consistently strongly significant and of a high magnitude indicative of strong persistence in health for both women and men, consistent with the finding of Contoyannis et al. (2004). Furthermore, comparing the results on the relationship coefficients across the specifications (OLS, FE, FE with lag and BB) we found that including lagged health reduced the magnitude of the effect of relationship status on health for both men and women. This confirmed that lagged health should be included in any analysis of health, but that doing so required attention to the econometric challenges associated with the strong persistence in health and the reverse causality impact of health on other covariates of interest.

Turning to the priority estimates (BB) on the effect of relationship status on health, we found that most of the significant effects for relationship categories disappeared when we more effectively controlled for both unobservable heterogeneity and health-related selection into relationships. Answering the question we posed in our title, it appears that yes, we can just live together. The coefficients on the cohabitation variable were both small and not statistically significant for either women or men. This finding remained in the robustness check where we used SAH as the dependent variable. These results did not suggest that cohabitation had no impact on health, but rather that cohabitation had the same impact on health as being married, the omitted category. Accordingly, while relationships could be good for health, it was not some unique, presumably more lasting bond of marriage that provided the key mechanism. The good news is that it appears that even cohabiting couples engage in protective activities that benefit health.

Only divorce had a significantly negative impact on health for both women and men using the BB estimator.Footnote 17 The magnitude of the effect was larger for women, and it was nearly the same magnitude as the OLS estimates.Footnote 18 The negative impact of divorce for women was also apparent using SAH. However, for men, in models using SAH, divorce was not statistically significant in the BB or the FE estimates. The difference in the effect of divorce on health for men may once again have reflected confounding unobservable justification bias associated with how divorced men answer the SAH question. Our findings on the effect of divorce on health were in line with those of previous researchers who have generally found that divorce reduces SAH (Liu and Umberson 2008; Liu 2012).

Never having been married had a negative effect on health for women, but not for men. This negative effect was only in the models using the health index, not those using SAH, again potentially reflecting some justification bias in the SAH measure of health. The magnitude of the impact was less than half that of divorce. Some of the negative impact of never having been married could be associated with splitting up from a cohabiting relationship. The BHPS recorded these individuals as never married rather than divorced, though the impact of the break-up on health may have relevant similarities in light of our finding on divorce. According to the transition matrices reported in Tables 3 and 4, 509 women (5.62 %) but only 314 men (3.97 %) made the transition from cohabiting back to never married.

All of our results controlled for family variables of pregnancy (for women) and the number of children in the household. Consistent with expectations, pregnancy was negative and strongly significant across estimators using both the health index and SAH. It is also reasonable that the coefficients on pregnancy became larger in the models with lagged health since women needed to be in good enough health to become pregnant in the first place. Similarly, the number of children can positively affect both health and relationships. Our finding was consistent with other research that incorporates the presence of children into the marriage/health framework (Wilson and Oswald 2005).

Finally, our results suggested that being widowed did not have as negative an impact on health as one might have expected. For both women and men, the coefficients on widowed were much larger and strongly significant in the FE models, both with and without a lag. Recall that these coefficients were identified only from those individuals who changed status, in this case those who became widowed. In the year prior to becoming widowed, a partner was likely to have been a caregiver and hence under significant physical and emotional duress (Schulz and Sherwood 2008, and references therein). For women the coefficient on widowed was not significant in the BB estimates for either the health index or SAH, and it was also not significant in the OLS estimate using SAH. However, for men the results were similar to those for women using the health index, but opposite using SAH: positive and significant in OLS and BB but negative and not significant in the FE models with and without the lag of health. These results might be confounded with the justification and reference point biases associated with SAH. In addition, the estimates on widowed for men may have been affected by small samples. The number of widowed women in our sample was more than twice that of widowed men: 498 widowed men with 2,990 observations versus 1,445 widowed women with 10,745 observations.
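
The point that FE coefficients on widowed are identified only from individuals who change status can be seen in a minimal Python sketch of the within transformation. The status histories below are hypothetical and the function names are ours for illustration. After subtracting each individual's own mean, a status indicator that never changes is zero in every wave and so carries no information for the FE estimate.

```python
def within_demean(values):
    """Subtract an individual's own mean from each of their observations."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def contributes_to_fe(status_history):
    """Return True if the demeaned indicator has any variation left,
    that is, if this individual helps identify the FE coefficient."""
    return any(abs(d) > 1e-12 for d in within_demean(status_history))


if __name__ == "__main__":
    # Hypothetical widowed indicators over four waves (1 = widowed).
    histories = {
        "always widowed": [1, 1, 1, 1],
        "never widowed": [0, 0, 0, 0],
        "becomes widowed": [0, 0, 1, 1],
    }
    for label, history in histories.items():
        print(label, within_demean(history), contributes_to_fe(history))
```

Only the individual who becomes widowed during the panel retains variation after demeaning. This is why the FE estimates may reflect the period around the transition, including any caregiving strain in the preceding year, rather than the longer-run state of being widowed.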

Conclusion

The political, economic and social norms surrounding marriage have been in flux. Societies have been debating marriage equity for same-sex couples and whether to continue to favor marriage over cohabitation for tax and other legal purposes.Footnote 19 Women’s gains in education and income may have made it more difficult for them to find suitable marriage partners and may have strained traditional marriage roles (Bertrand et al. 2013). And, as noted in the introduction, marriage rates have been declining while cohabitation rates have been rising across OECD countries. We extended extant literature on the causal effect of relationships on health by incorporating cohabitation as a separate relationship category, using an estimation method that better controls for health dynamics and the endogeneity of relationship status from both reverse causality and unobservable heterogeneity, and using a continuous measure of health.

We found that cohabitation was just as good as marriage for health for both women and men when we controlled for health dynamics and endogeneity using the BB dynamic panel data estimator. This inference from the BB estimator was different from that from FE estimators, which suggested that marriage was better for health, consistent with much of the existing literature. Our results provided important evidence in the continuing debate over the effects of marriage on health and corresponding forecasts of policy effects and health expenditures. The good news is that the global trends towards cohabitation over marriage may not foreshadow additional harm to health. Moreover, policies that promote marriage should not assume additional health benefits over cohabitation or being never married; however, efforts to prevent divorce may have ripple effects on health.

Our primary finding that cohabitation was as good as marriage for health places renewed emphasis on discerning the mechanisms by which social relations can benefit health. What is it about living with another person that is good for health? Perhaps long-term commitment is still important, but cohabitation is no longer distinguishable from marriage because such commitment is declining in marriage, or increasing in cohabitation. Or, perhaps the mechanism can be found in day-to-day interactions that are similar in marriage and cohabitation rather than in some longer-term outlook. Future research with larger and longer datasets can explore whether the effects of relationship status on health change in meaningful ways at different stages over the lifecycle. Doing so may offer additional insights into the mechanisms by which relationships have a positive effect on health.

There are many other research questions on the effect of persistent variables on dynamic outcomes such as health, including health insurance status, education, employment, home ownership, geographic region, or children. These questions may similarly benefit from employing the BB estimator. Interesting future research remains to disentangle the relative contributions of health dynamics, reverse causality and fixed unobservable heterogeneity. Quantifying these different sources of endogeneity can offer additional policy-relevant insights on the dynamic interrelationships between health and relationship status and other socio-demographic factors over the lifecycle.