1 Introduction

Recent work on labor market choices has shown that same-sex couples often make different human capital decisions that reflect their lifestyle, gender composition, and procreative constraints.Footnote 1 This nascent economic work, however, seldom directly relates to family or marriage behavior.Footnote 2 Given the obvious difference in the ability to directly procreate between opposite-sex and same-sex couples, and given the complementarities between marriage and children, it seems intuitive that there should be differences in matching, reasons to marry, frequency of marriage, behavior not complementary with children, and the presence of children.

We model different procreation and childbearing costs across sexual orientations, predict differences in the aforementioned behaviors, and confirm the predictions with high quality data. While some examined differences in behavior are anecdotally well-known or have been found in other small sample work, we make three major contributions. First, we show that a wide range of differences in behavior can hinge on differences in the cost of procreation and child raising. Other theories based on thin markets, preferences, or stigma can each explain some, but not all, of the behaviors that we examine. Second, we document that matching behavior differs across sexual orientations. To our knowledge, no other model predicts the specific matching behavior that we identify.Footnote 3 Finally, we test our model with two large Canadian probability samples.

Because same-sex couples cannot procreate together, they must use some more expensive procedure to acquire children when they want them. The channels through which a same-sex couple can conceive, adopt, or otherwise acquire children are all considerably more costly than heterosexual sex. There is a large literature spread across several fields (law, tax, gender studies) supporting this cost difference. For example, cost differences arise over the need to find and contract with third parties, the expense of artificial reproductive technologies, discrimination over access to technologies, legal and social barriers to adoption, and differences in tax deductibility considerations for same-sex couples.Footnote 4 Once all of the legal and social hurdles have been met, the specific costs are still considerable. At the low end, artificial insemination for lesbians costs around $1000 per trial, with a 5–25 % success rate, and more if fertility drugs are used.Footnote 5 At the higher end, surrogacy for gay couples can cost in the tens of thousands of dollars.Footnote 6 Throughout the paper, we recognize and exploit the fact that these costs are greater for gays compared to lesbians.
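To put the low-end figures in perspective, the following minimal sketch converts the quoted cost per trial and success rates into an expected outlay per successful conception. It assumes independent trials with a constant success rate (a geometric distribution), which is our simplification and not a claim made in the cited sources; the function name is hypothetical.

```python
def expected_cost_per_success(cost_per_trial, success_rate):
    """Expected total outlay until the first success, assuming independent
    trials with a constant success rate (geometric distribution)."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must lie in (0, 1]")
    # The expected number of trials until the first success is 1 / success_rate.
    return cost_per_trial / success_rate


COST_PER_TRIAL = 1000  # dollars per trial, as quoted in the text

for rate in (0.05, 0.25):  # end points of the 5-25 % range quoted in the text
    cost = expected_cost_per_success(COST_PER_TRIAL, rate)
    print(f"success rate {rate:.0%}: expected cost about ${cost:,.0f}")
```

Under these assumptions the expected outlay ranges from roughly $4,000 (at a 25 % success rate) to roughly $20,000 (at a 5 % success rate), before any fertility drugs.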

There are also differences in the costs of raising children, due to some or all of the following: discrimination and stigma, bonding and affinity issues resulting from lack of biological connection, social disapproval, behavior problems arising over knowledge of having been donor inseminated, or the imperfect substitutability of “mothering” and “fathering.” This cost claim is strongly supported by the literature, and the asymmetric list of issues faced by lesbian and gay couples is long.Footnote 7

A higher cost of children means that: (1) fewer should be demanded by same-sex couples, even if they have the same desire for children as opposite-sex couples, and (2) these couples have less incentive to avoid behaviors, lifestyles, and social capital investments that are not complementary with children, and thus should engage in them more frequently. Moreover, due to differences in the ways in which children are acquired, all else equal, heterosexuals should place greater value on the quality of their partner’s inheritable traits than gays and lesbians, which should lead to (for heterosexuals relative to gays and lesbians): (3) a stronger positive relation between these traits and the probability of marriage, and (4) more pronounced assortative matching along these traits. In short, the marital behavior of same-sex couples is likely to be different from opposite-sex couples because the shadow prices they face are different, and these same differences should exist between gay and lesbian couples for the same reason.

Given that the “costs of children” are not directly observable in our data, one might object that our results are driven by differences in preferences for children, marriage, and other elements of relationships rather than by such costs. Our goal is to show that our model is consistent with a wide range of phenomena, even if one assumes that everyone has the same preferences.Footnote 8 Preferences may not be the same across orientations, but we simply argue that even if they were the same and even though both same-sex and opposite-sex couples are present in the marriage market, there should be differences in their family behaviors given the different costs of having and raising children.

We use two large data sets to analyze different types of households in terms of their potential marriage behavior. The first is the Canadian Community Health Survey (CCHS), which is a large, nationally representative, probability sample of Canadian households that self-identifies sexual orientation. These data have excellent measures of health, and allow for the direct identification of gay and lesbian individuals (single and married).Footnote 9 Despite its advantages, the CCHS is unable to test our matching hypothesis because the information it contains on the respondent’s spouse is too limited. Hence, we also use the 2006 Canada Census, which contains a large 20 % random sample of the population that self-identifies same-sex couples. The overall evidence of differences in behavior between gays, lesbians, and heterosexuals is strongly consistent with our model.Footnote 10

We present our model in Sect. 2, and discuss our data and empirical results in Sect. 3.

2 A model of marriage, children, and matching

We present a matching model where the key feature is a difference in the cost of children across three different sexual orientations: heterosexual, gay, and lesbian.Footnote 11 We assume that members of each group only match with members from the same group, all pregnancies are planned, and only couples have children. Individuals are initially randomly paired in a “date” and incur a search cost \(k\,{>}\,0\). Later they decide if they want to be a couple, and once a couple, they decide if they want to marry and/or to have children. Individuals can reject a date and go back to the dating pool, but once a person is coupled, they remain so. In addition, we assume that spouses have the same preferences over children and marriage and that there are no transfers, which allows for an abstraction of bargaining issues within a household and allows a focus on cost differences in conception and child rearing between the different couple types.

Every type of individual is described by two traits \((g_i,h_i)\) distributed according to a positive density on \([0,G]\,{\times }\,[0,H]\).Footnote 12 The trait \(g_i\) is a quality index related to biological reproduction: it accounts for expected longevity, health, fertility and other features that could be passed on to children, and also accounts for an individual’s reproductive fitness. The trait \(h_i\) is an index of characteristics such as education, talent, etc. that produce non-child household production. A component of every potential match payoff is \(h_{ij}\,{=}\,m(h_i,h_j)\), where m is increasing in \(h_i\) and \(h_j\), that measures the utility of household production independent of children or marriage. We do not assume that g is independent of h. Therefore, our model could easily accommodate characteristics that contribute to both traits: for example, intelligence may be passed on, and may also produce non-child related household production.

2.1 Stage play

There are three stages of play to the matching game.

  • Stage 1: Singles are randomly paired in a dating market at cost \(k\,{>}\,0\).

  • Stage 2: Each person i decides if he wants to break up after observing the other person’s \(g_j\) and their \(h_{ij}\). Coupling is mutual, so either can break the date and return to stage 1.

  • Stage 3: Each couple \(c_{ij}\) that remains together now observes \(\epsilon _c\,{\in }\,\mathfrak {R}\), their suitability for marriage, and decides whether to marry, and whether to have children.Footnote 13 We assume that \(\epsilon _c\) is independent of all other variables and distributed according to a continuous cumulative distribution function. We do not, however, constrain its sign: some couples may prefer the status and institutional protections of marriage, while others may prefer the flexibility of remaining unmarried.

Person i has the following separable utility function over children and household production when matched with person j:

$$\begin{aligned} U_{ij} &= v_{ij} + h_{ij} \\ &= \left\{ \begin{array}{ll} [\gamma (g_i,g_j) + M_c - s - f_c + \epsilon _c M_c] + h_{ij} & \quad \text {if children} \\ \epsilon _c M_c + h_{ij} & \quad \text {if no children} \end{array}\right. \end{aligned} \tag{1}$$

Note that \(v_{ij}\) captures the utility related to marriage and children, and \(h_{ij}\) captures all other household utility. The sub-utility function \(v_{ij}\) has a number of components. First, \(\gamma {(g_i,g_j)}\) is the expected utility of children, conditional on the biological attributes of the couple. We assume that \(\gamma (g_i,g_j) \,{=}\, \mathrm{max}\{a, b(g_i,g_j)\}\), where a is the expected utility from adopting, and b is the expected utility from having own biological children.Footnote 14 For same-sex couples, option b is unavailable—this is not a critical assumption.Footnote 15 We assume b is increasing in both arguments; that is, \(g_i\) and \(g_j\) both improve child quality.

Second, \(M_c\) is an indicator variable for being married. Marriage is understood to be an institution complementary to children, and we normalize the value of this increase to 1.Footnote 16 The variable s is the value of activities foregone due to the presence of children. This is the value of behaviors that are not complementary with children and which are sacrificed when children arrive. Finally, our critical variable is \(f_c\), the cost of having children. These costs are assumed to be zero for opposite-sex couples, for whom children are a by-product of sex. Importantly, these costs are positive for same-sex couples, and vary across gay and lesbian couples. Lesbian couples might have to engage in costly insemination procedures, but these are less expensive than surrogacy. Therefore, we assume that the costs of children are related to sexual orientation; that is, \(f_{{ Gay}}\,{>}\, f_{{ Lesbian}}\,{>}\,f_{ {Hetero}}=0\).Footnote 17

Table 1 shows the four possible utility outcomes once a pairing decides to be a couple (subscripts have been suppressed).

Table 1 Individual marriage and child payoffs

No outcome dominates the others, and which outcome is chosen depends on the couple’s specific values of the various utility components. The difference in utility between cohabitation with children (option (C)) and marriage with children (option (D)) is \(1+\epsilon _c\). The utility difference between cohabitation and marriage without children is just \(\epsilon _c\). These values may be greater or less than zero depending on the couple’s suitability for marriage. The difference in utility between married couples with children and married couples without children is \(\gamma +1 -s-f_c\), which can also be greater or less than zero. As a result, different couple combinations will choose different outcomes with respect to marriage and children.
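The four payoffs can be recovered by evaluating Eq. (1) at each combination of marriage and children. The sketch below does this for a single couple and picks the best outcome; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
def outcome_payoffs(gamma, s, f_c, eps_c, h_ij=0.0):
    """Evaluate Eq. (1) at each (married, children) combination."""
    payoffs = {}
    for married in (0, 1):
        for children in (False, True):
            # The bracketed child-related terms of Eq. (1) apply only with children.
            child_part = gamma + married - s - f_c if children else 0.0
            payoffs[(bool(married), children)] = child_part + eps_c * married + h_ij
    return payoffs


def best_outcome(gamma, s, f_c, eps_c, h_ij=0.0):
    """Return the (married, children) pair with the highest payoff."""
    payoffs = outcome_payoffs(gamma, s, f_c, eps_c, h_ij)
    return max(payoffs, key=payoffs.get)


# Hypothetical values: gamma = 1.0, s = 1.2, eps_c = -0.3.
print(best_outcome(gamma=1.0, s=1.2, f_c=0.0, eps_c=-0.3))  # zero cost of children
print(best_outcome(gamma=1.0, s=1.2, f_c=0.6, eps_c=-0.3))  # positive cost of children
```

With these illustrative numbers, the couple marries and has children when \(f_c=0\), but neither marries nor has children when \(f_c=0.6\): a mildly negative \(\epsilon _c\) is outweighed by the marriage-children complementarity only when children are chosen.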

In order to save space, below we present our propositions as intuitively as possible. The full model, formal results and proofs are available in an online appendix.Footnote 18

2.2 Incentive to marry and have children

The model generates one proposition and four corollaries with respect to marriage and children that are quite intuitive.

Proposition 1

Same-sex couples are no more likely to marry than heterosexual couples, and they are strictly less likely to do so than a heterosexual couple with biological traits \((g_i,g_j)\) when \(s>a-f_{{ Lesbian}}\) and \(s<\mathrm{max}\{a, b(g_i,g_j)\}+1\).Footnote 19

Corollary 1

Lesbian couples are at least as likely to marry as gay couples, and more so if \(a-f_{{ Gay}}<s<a+1-f_{{ Lesbian}}\).

Corollary 2

Heterosexual couples are at least as likely to have children as lesbian couples, which are in turn at least as likely to do so as gay couples. These relations are strict as long as \(a-f_{{ Lesbian}}<s < a+1 -f_{{ Lesbian}}\), so that some, but not all lesbian couples adopt.

Proposition 1 and the first two corollaries can be understood from Table 1. Consider an increase in \(f_c\) to \(f^\prime _c\), all else equal. Couples who would have chosen no children before the change do not change their behavior because \(f_c\) is not in their payoff function. Couples who would have chosen children with cohabitation reveal that \(\gamma -s-f_c \,{>}\,0\), and an increase in \(f_c\) means that those at the margin will now decide to have no children. Finally, couples who would have chosen marriage with children under the original cost will now continue with (D) or choose (A) or (B), depending on how close they are to indifference between having children or not.Footnote 20

In other words, an increase in the cost of children leads some couples to forego having children, and some of these couples will also forego marriage as a result. No couple changes its decision in the opposite direction. Since \(f_{{ Gay}}\,{>}\,f_{{ Lesbian}}\,{>}\,0\), all other attributes equal, gay couples should be the least likely to have children and marry because \(f_c\) is greatest for them, followed by lesbian couples, and finally heterosexual couples. Two other intuitive results follow from the model:

Corollary 3

Gay couples are at least as likely to engage in behaviors not complementary with children as lesbian couples, which in turn are at least as likely as heterosexual couples to engage in non-complementary behaviors.Footnote 21

Corollary 4

Suppose couples A and B have biological traits \((g_i,g_j)\) and \((g_{i}^{\prime },g_{j}^{\prime })\), respectively, with \(g_i>g_{i}^{\prime }\), \(g_j \ge g_{j}^{\prime }\), and \(b(g_i,g_j)>a\). Then if the couples are heterosexual, couple A is more likely to marry than couple B, while if they are same-sex, they have the same probability of marriage.

Corollary 3 follows from Corollary 2, while Corollary 4 follows because, if the couples are heterosexual, couple A has a higher \(\gamma\) than couple B, which is equivalent to a lower f.
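The ordering in Proposition 1 and Corollaries 1 and 2 can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: \(\epsilon _c\) is drawn from a standard normal distribution (the model only requires a continuous distribution), and the values of a, b, s, and \(f_c\) are hypothetical, chosen to satisfy the strictness conditions in the proposition and corollaries.

```python
import random


def choose(gamma, s, f_c, eps_c):
    """Return the (married, children) pair maximizing Eq. (1).
    h_ij is omitted because it is common to all four outcomes."""
    options = {
        (False, False): 0.0,
        (True, False): eps_c,
        (False, True): gamma - s - f_c,
        (True, True): gamma + 1 - s - f_c + eps_c,
    }
    return max(options, key=options.get)


def shares(gamma, s, f_c, draws):
    """Fractions of couples that marry and that have children."""
    choices = [choose(gamma, s, f_c, eps) for eps in draws]
    n = len(choices)
    married = sum(m for m, _ in choices) / n
    children = sum(c for _, c in choices) / n
    return married, children


rng = random.Random(0)
# Assumption: eps_c ~ N(0, 1); the same draws are reused for every group.
draws = [rng.gauss(0.0, 1.0) for _ in range(20000)]

A, B, S = 1.0, 1.5, 1.2  # hypothetical a, b(g_i, g_j), and s
COSTS = {"heterosexual": 0.0, "lesbian": 0.3, "gay": 0.6}  # hypothetical f_c

results = {}
for group, f_c in COSTS.items():
    # Biological children (option b) are available only to heterosexual couples.
    gamma = max(A, B) if group == "heterosexual" else A
    results[group] = shares(gamma, S, f_c, draws)
    married, children = results[group]
    print(f"{group:>12}: married {married:.2f}, children {children:.2f}")
```

Because the same draws of \(\epsilon _c\) are used for each group, any difference in the simulated shares comes only from \(\gamma\) and \(f_c\), which mirrors the model's premise of identical preferences across orientations.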

2.3 Matching behavior

To close the matching model, we assume that individuals leaving the dating pool are replaced by individuals with the same characteristics.Footnote 22

2.4 Same-sex matching

Given that same-sex couples cannot procreate together, their biological traits cannot be complementary, and are not considered in matching.

Proposition 2

In any equilibrium for same-sex couples, for any \(h\in [0,H]\), the set of partners that would accept type \((g,h)\) as a mate, as well as the set of partners that are acceptable to type \((g,h)\) as mates, are independent of g.

Therefore, conditional on \(h_{i}\), the expected g of individual i’s partner in a same-sex couple is independent of \(g_{i}\). In other words, the biological fitness of same-sex partners should be uncorrelated, conditional on h. This will not be true in general for heterosexual couples because their biological fitness is passed on to their own offspring.

2.5 Heterosexual matching

For heterosexual matching, we assume that b(., .) is super-modular; that is, if \(g_{i}\,{>}\,g_{i}^{\prime }\) and \(g_{j}>g_{j}^{\prime }\), then \(b(g_{i},g_{j})+b(g_{i}^{\prime },g_{j}^{\prime })\,{>}\,b(g_{i},g_{j}^{\prime })+b(g_{i}^{\prime },g_{j})\).Footnote 23 For example, individuals with high g may place greater value on their children having high g because they do not want their children to face difficulties that they did not face. Alternatively, if parents are risk-averse with respect to the quality of traits passed on to the child, then it is more important for an individual with high g to have a partner with high g (so that good traits are passed on for sure) than for an individual with a low g.
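The super-modularity condition can be checked numerically for a candidate function b. In the sketch below, the multiplicative and additive forms are hypothetical examples and are not functional forms used in the paper; the multiplicative form satisfies the strict inequality, while the additive form leaves the two sides equal.

```python
def supermodularity_gap(b, g_i, g_i_low, g_j, g_j_low):
    """Left side minus right side of the super-modularity inequality.
    A strictly positive gap means the inequality holds at these points."""
    return (b(g_i, g_j) + b(g_i_low, g_j_low)) - (b(g_i, g_j_low) + b(g_i_low, g_j))


def multiplicative(g_i, g_j):
    """Hypothetical b with complementary traits."""
    return g_i * g_j


def additive(g_i, g_j):
    """Hypothetical b with no complementarity between traits."""
    return g_i + g_j


# Hypothetical trait values with g_i > g_i' and g_j > g_j'.
print(supermodularity_gap(multiplicative, 0.75, 0.25, 0.75, 0.25))
print(supermodularity_gap(additive, 0.75, 0.25, 0.75, 0.25))
```

For the multiplicative form the gap equals \((g_i-g_i^{\prime })(g_j-g_j^{\prime })\), which is positive whenever both differences are positive.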

Under the above assumption, heterosexuals with higher g are more selective than individuals with lower g when considering partners with low g. This points to assorting along the g dimension for heterosexual matching, and leads to the result below. To avoid confusion between specific own type and a generic partner’s type, we denote the partner’s type as \((x,y)\), where x is the biological trait, and y is the household trait. Moreover, we assume that for every \(g_{i}\), \(v(g_{i},G)-v(g_{i},0)\,{>}\,0\). That is, every heterosexual cares about the biological trait of their partner at least to some extent.

Proposition 3

There is weakly positive assortative matching in g for heterosexual couples, in the sense that, in equilibrium, for any h and whenever \(g\,{>}\,g^{\prime }\), when considering partners with sufficiently low x, type \((g,h)\) is no less selective than (that is, requires y at least as high as) type \((g^{\prime },h)\) is, and is strictly more selective if the probability of having biological children is positive.

To understand the intuition for this result, consider an indifference curve for types \((g,h)\) and \((g^{\prime },h)\), where \(g\,{>}\,g^{\prime }\), plotted on a plane with the partner’s biological trait on the horizontal axis and the partner’s household trait on the vertical axis in Fig. 1.

Fig. 1 Example of matching boundary conditions

Consider the indifference curve for either “high” type \((g,h)\) or “low” type \((g^{\prime },h)\). For low values of x, the benefits from adopting children are greater than from procreation, and the indifference curves remain flat: the biological trait x has no value in this region. Once \(b\,{>}\,a\), the indifference curves start to fall because one type is willing to substitute x and y in a partner. This occurs sooner for high type \((g,h)\) than for low type \((g^{\prime },h)\) since \(g\,{>}\,g^{\prime }\). Moreover, whenever the marginal utility of x for high type \((g,h)\) is nonzero, it is greater than the marginal utility of x for low type \((g^{\prime },h)\). Hence, whenever it is not flat, high type \((g,h)\)’s indifference curve at a given \((x,y)\) is strictly steeper than low type \((g^{\prime },h)\)’s at the same \((x,y)\).Footnote 24

Consider a potential mate given by point 1 in the graph. High type \((g,h)\) would reject this person as a match because they fall below their indifference curve boundary. On the other hand, low type \((g',h)\) would accept this person as a mate. Now consider another potential mate given by point 2. This person has a higher x and lower y than the person at point 1. Now high type \((g,h)\) finds this person acceptable, while low type \((g',h)\) does not. The reason is the supermodularity of expected utility in the biological traits: high biological traits matter more to high type \((g,h)\) than to low type \((g',h)\). Figure 1 is one example of how the boundary conditions might look for types \((g,h)\) and \((g',h)\). There are actually three cases, and each is considered in the online appendix. All cases generate the result.

Our model thus predicts differences in behavior without positing any difference in preferences, marriage market conditions (thinness), costs of marriage, or type distributions across sexual orientations. Instead, they all occur due to simple variations across orientations in the availability of means of conception and/or the cost of having children. More complicated models are possible, but our goal is to examine if differences in the costs of children can explain a wide range of differences in behavior between couples of different orientations.Footnote 25

3 Empirical results

3.1 The data

Data for most of our tests come from years 2005, 2007–2010, 2013, and 2014 of the restricted CCHS master files, a probability sample survey with a cross-section design.Footnote 26 The target population of the CCHS is all Canadians aged 12 and over; the survey covers 98 % of the provincial populations, and data are collected through computer-assisted interviews. Given the cross-section structure, our data are suitable for finding the correlations predicted by our model, but not for establishing causal linkages.

The CCHS has extensive information on the respondent, but only limited information on other members of the household. What makes it unusual for a large probability sample is that it self-identifies sexual orientation (heterosexual, gay, lesbian, or bisexual) for all individuals, including singles. Some might critique direct self-reporting of sexual orientation on the grounds that some individuals are unwilling to reveal such sensitive information; however, self-reporting is better than the alternatives, and the CCHS has some additional advantages. First, it does not indirectly identify same-sex couples through responses to a series of questions. Such indirect methods fail to identify gays or lesbians who are single, fail to distinguish bisexual individuals, are subject to the same under-reporting problem, and have the added problem of capturing large numbers of heterosexual couples who record their sex incorrectly.Footnote 27 Second, the CCHS’s refined identification of bisexual individuals is helpful for reducing measurement error in identifying gays and lesbians.

Finally, the CCHS data are from Canada, where one could argue there has been little official discrimination against same-sex couples for some time: same-sex couples have had all taxation and government benefits since 1997, and same-sex marriage has been legal since 2001–2005.Footnote 28 Other social scientists have noted that legalization has reduced the stress and stigma of homosexuality in Canada, which makes it more likely that respondents would answer questions honestly.Footnote 29 All things considered, the CCHS is an excellent large, random-sample data set for studying non-heterosexuals.Footnote 30

3.2 Basic demographics

Table 2 shows some estimated population characteristics for the three household types in Canada.Footnote 31 The numbers are consistent with other findings based on same-sex couples from nonrandom samples. The facts are also consistent with our model predictions.

Table 2 Population estimates of household characteristics (weighted observations)

Table 2 shows that gays and lesbians make up a tiny fraction of the population, and, as in several other data sets, there are more gays than lesbians.Footnote 32 Overall, 43.2 % of lesbians are single, and only 14.5 % of them are married. Gay men are much more likely to be single (59.7 %), and less likely to be married (only 9.8 %); the opposite is true for heterosexuals (29.2 and 48.4 % respectively). The percentage of households with children under 18 is 10.3 % for lesbians, only 2.7 % for gays, and 24.7 % for heterosexuals.

In terms of income, the CCHS confirms findings from other studies: gay and lesbian households do not appear to suffer any household income penalty.Footnote 33 Heterosexuals, despite their average incomes, are more likely to own their home compared to all other groups, and heterosexuals are more likely to report no health problems. Lesbians are more likely to be white, and on average both gays and lesbians are considerably more educated than heterosexuals. Heterosexuals are less likely to be smokers, on average. However, perhaps the most striking difference is with respect to sexual behavior. In this regard, lesbians and heterosexuals appear quite similar on average: 85.29 and 83.22 % had only one sexual partner in the past twelve months, and around 3 % of them had more than four. In contrast, gays are much less likely to have one sexual partner in the past twelve months, and much more likely to have had more than four (20.6 %). All of these unconditional averages are qualitatively consistent with our model.

3.3 Children, behavior, and marriage

Our model makes a series of predictions regarding family behaviors for heterosexuals, gays, and lesbians. In Table 3, we present the results of five logit regressions, estimated together as seemingly unrelated regressions, to test these predictions.Footnote 34

Table 3 Children, non-complementary behaviors, & marriage SUR Logit Regressions

3.3.1 Presence of children

Our model predicts that children are least likely in gay households and most likely in heterosexual households. The summary statistics in Table 2 confirm this: children are rare among gay and lesbian households without controlling for household characteristics. Table 3, column (1) confirms the findings from Table 2. These logit regression results are based on the full sample, using full controls, robust standard errors, and regression weights, where the dependent variable is whether or not a child under 18 is present in the household.Footnote 35 Although both types of households are less likely to have such children present, there is a considerable difference between gay and lesbian households. Looking at the odds ratio, the coefficient for gays means that the odds of having children present in the home are almost 20 times smaller compared to heterosexual homes. On the other hand, the odds of lesbians having children are only 42 % as large as those for heterosexuals. This difference is consistent with our prediction that non-heterosexual households are less likely to include children, and that this effect is stronger among gay households compared to lesbian households.
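To give a sense of what these odds ratios imply, the sketch below applies them to a baseline probability. It is illustrative arithmetic only: the baseline of 24.7 % is the unconditional heterosexual share from Table 2, whereas the regression odds ratios are conditional on controls, and the value 1/20 stands in for "almost 20 times smaller". The function name is hypothetical.

```python
def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability implied by scaling the baseline odds by an odds ratio."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


BASELINE = 0.247  # share of heterosexual households with children (Table 2)

# Odds ratios relative to heterosexuals, approximated from the text.
ODDS_RATIOS = {"lesbian": 0.42, "gay": 1 / 20}

for group, ratio in ODDS_RATIOS.items():
    implied = apply_odds_ratio(BASELINE, ratio)
    print(f"{group}: implied probability of children about {implied:.1%}")
```

The implied probabilities will not match the raw shares in Table 2, since the regression holds household characteristics constant; the point is only to show how an odds ratio translates into a probability at a given baseline.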

3.3.2 Behaviors non-complementary with children

Table 3 investigates a series of behaviors that most would consider non-complementary with the presence of children: smoking, illegal drug use, and sexual activity with more than four partners in the past year. Columns (2)–(4) contain selected coefficients from three regressions, one for each dependent variable. Each column reports the regression results for the full sample when all controls, weights, and robust standard errors are used.

Our model predicts that the presence of children should reduce the frequency of these behaviors, and we find that the presence of children is associated with less smoking, less illegal drug use, and a lower likelihood of having many sex partners in the past year, for all three sexual orientations. Furthermore, if gays and lesbians without children are less likely than childless heterosexuals to expect having children in the future, then our model also predicts that: (1) on average, childless gays and lesbians should engage in these behaviors more often than childless heterosexuals; and (2) the presence of children should be associated with a larger reduction in the incidence of these behaviors for gays and lesbians than for heterosexuals, because the difference in expectation between gays and lesbians that have children and those that do not is larger than the difference for heterosexuals. Columns (2)–(4) show strong support for these predictions as well.

Taken together, the results from columns (2)–(4) show that childless gays and lesbians more frequently engage in the three examined types of “non-family-friendly” behavior, relative to childless heterosexuals.Footnote 36 However, the results show that while the presence of children is associated with a lower prevalence of these behaviors for all sexual orientations, this “child effect” is generally larger for gays and lesbians.

3.3.3 Probability of marriage given health

Our model predicts that heterosexuals with high g’s should be more likely to have children, and therefore, more likely to marry. Fortunately, the CCHS contains excellent information on an individual’s self-reported health status. It provides information on many health problems, but also calculates a health utility index based on vision, speech, hearing, dexterity, cognition, mobility, or emotional disorders.Footnote 37 We use this health index as a measure of biological fitness. Although the health index ranges from negative values to one, we create a health dummy variable that equals zero if the health index is less than one, and equals one otherwise.Footnote 38
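The construction of the health dummy can be sketched as follows. The function name and the sample index values are hypothetical; only the threshold rule comes from the text.

```python
def health_dummy(health_index):
    """Return 1 if the health utility index is at its maximum of one, else 0."""
    return 0 if health_index < 1 else 1


# Hypothetical index values, including a negative one, since the index
# ranges from negative values to one.
for index in (-0.2, 0.97, 1.0):
    print(index, health_dummy(index))
```

Under this rule, any reported limitation that lowers the index below one classifies the respondent as not fully healthy.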

Column (5) reports a logit regression on a couple’s choice to marry or cohabit, based on the full sample, robust standard errors, weighted observations, and controls. The reported coefficients are the sexual orientation variables, these variables interacted with the health index fixed effect, and these variables interacted with the child fixed effect. The variables of interest for matching are the interactive terms of sexual orientation and the health index. The sorting hypothesis predicts that the interactive terms should only matter for heterosexuals.Footnote 39

Column (5) confirms the summary statistic findings of the first four tables: lesbians and gay men are significantly less likely to be married rather than cohabiting. In terms of marriage and biological fitness, the health status of gays appears unrelated to marriage, and for lesbians the estimated relation is negative, although both estimates are insignificant. By contrast, health status matters for heterosexuals. Healthy heterosexuals have odds of marrying that are 47 % higher than the odds for unhealthy heterosexuals, a statistically significant effect that is much larger than the point estimates for gays and lesbians.Footnote 40

3.3.4 Assortative matching

Our model predicts that heterosexual couples should match along biological and reproductive fitness lines (g), holding constant household traits (h). At the same time, same-sex couples should not match along g because these couples cannot procreate. The one weakness of the CCHS data set is that it does not contain health information on the respondent’s spouse. To resolve this we turn to another data set: the 2006 Canada census.

The 2006 Canada census identifies same-sex individuals only when they are in couples (either married or cohabitating), but it does contain the same information on each spouse. This information includes a crude measure of health status. In particular, the census asks if the individual has home, leisure, education, work, or other activities limited due to poor health. The question goes on to define poor health as a condition resulting from injury, illness, mental illness, and hereditary diseases. Respondents only have the three options of “never,” “sometimes,” or “often.” As such, the census health measure is a noisy measure of our g parameter. Moreover, spousal health also contributes to h, as it generates well-being independently from reproduction. Therefore, the model does not rule out sorting along the health dimension for same-sex couples. Rather, the prediction is that such sorting should be more pronounced for heterosexual couples due to the importance of biology for the g parameter.

We use the 20 % restricted census master file.Footnote 41 From this file, all couples, either married or cohabitating, were selected. Statistics Canada does not allow the sample sizes to be released; however, the weighted estimates of the population based on this sample are: 19,575 lesbian couples; 23,125 gay couples; 1,296,250 cohabitating heterosexual couples; and 5,920,270 married heterosexual couples.Footnote 42

Table 4 OLS regressions for assortative matching Dependent Variable: Spouse 1 Health

Table 4 shows our simple test of assortative matching. Columns (1)–(4) run our regression selecting samples based on sexual orientation. Column (5) pools the data. The dependent variable is the ordinal health status of the spouse identified as person 1 in the census. This is regressed on the health status of this person’s spouse and a host of controls. In the case of the pooled regression, the health status of person 2 is interacted with sexual orientation. We report the coefficient on the health status of the spouse and one of the controls: the presence of children. The results confirm our hypothesis. Gay and lesbian couples sort less along the health dimension than cohabitating heterosexuals. Married heterosexual couples sort the most strongly along the health dimension, more so than same-sex couples: the differences between the coefficients on the spouse’s health are statistically significant.Footnote 43 Pooling produces identical results. In addition, when children are present in the marriage, the health status of each spouse is higher for heterosexual couples, but not for same-sex couples, again a finding consistent with our model.Footnote 44
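The logic of the sorting test can be sketched with a bivariate least-squares slope of one spouse's health on the other's. This is a stripped-down illustration, not the specification in Table 4: it omits all controls and weights, the 0/1/2 coding of the three census responses is our assumption, and the couples below are toy values invented for illustration, not census data.

```python
def ols_slope(x, y):
    """Slope from a bivariate least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


def sorting_coefficient(couples):
    """Regress spouse 1 health on spouse 2 health for (spouse1, spouse2) pairs."""
    spouse1 = [c[0] for c in couples]
    spouse2 = [c[1] for c in couples]
    return ols_slope(spouse2, spouse1)


# Toy couples (hypothetical), health coded 0, 1, 2.
sorted_couples = [(0, 0), (1, 1), (2, 2), (0, 0)]    # partners share health status
unsorted_couples = [(0, 0), (0, 2), (2, 0), (2, 2)]  # no relation between partners

print(sorting_coefficient(sorted_couples))
print(sorting_coefficient(unsorted_couples))
```

A larger slope indicates stronger sorting on health; the model's prediction is about the relative size of this coefficient across couple types, not its absolute level.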

4 Conclusion and discussion

This paper has exploited two data sets that allow for reliable estimates of demographic characteristics of different sexual orientations, and for some investigation of lifestyle choices and mate matching behavior. The model presented ties a series of different behaviors together and shows how they result from a simple and observable cost difference, which leads to heterosexuals having a stronger expectation of children than lesbians and gays do. Other studies, mostly based on small nonrandom samples, have found similar results for fertility and non-family behaviors, but none to our knowledge have tied these together with the matching problem, nor have they used a data set with the qualities of the CCHS.Footnote 45

An alternative explanation for lower marriage rates among same-sex couples (and, if marriage and children are complementary as we hypothesize, the lower prevalence of children in same-sex households) is that, while same-sex marriage is legal in Canada, it may still be stigmatized, or may simply not have reached a steady state. However, our model additionally explains differences in matching behavior. Thus, while a higher cost of marriage for same-sex couples may contribute to some of this paper’s findings, it is neither necessary nor sufficient for explaining all of them. Our contribution suggests that these differences are likely to persist in the long run, even after transitory factors vanish, because they are based on a biological constraint.

One may also worry that our results about behavior non-complementary with children may be caused by a selection effect: (1) people that are comfortable reporting their homosexuality may also be more comfortable admitting to sensitive behavior, or (2) our results stem from older gays who were unable to marry. While we cannot completely rule out either effect, we make several observations that alleviate these concerns. First, same-sex households with children are not more likely than heterosexual households with children to report such behavior. Second, our matching results show that differences in non-sensitive behavior occur, so it appears plausible that differences in sensitive behavior would occur as well. Finally, we ran our regressions without the older gays and found similar results.

In the end, we have provided a simple and plausible model of household behavior that explains a wide range of behavior differences across couples of different sexual orientation, which are documented by our empirical work. We do not claim that our model is the only explanation for these correlations, as we have not established causality. However, we have demonstrated that a parsimonious model based on a single fundamental difference—in procreation and childrearing costs—can generate the many differences in behavior that are observed. Other factors may well contribute to the magnitude of these effects, but do not tie all these phenomena together. We leave it to future work to compare the importance of various explanations.