1 Introduction

During the last decade, there has been increasing interest in the relationship between marital status, parenthood, and health (see e.g., Schoenborn 2004; Waite and Bachrach 2000; Wood et al. 2007; Koball et al. 2010; Mirowsky 2002; Umberson et al. 2010). This interest is partially motivated by recent changes in family behavior in the United States and many other Western countries, i.e., the increase in cohabitation, the delay in marriage, and the rise of non-marital childbearing (Cherlin 2005; Schoen et al. 2007). Studies on the United States highlight the positive association between marriage and a wide range of health outcomes for both men and women. Married adults are less likely to die in any given period than the unmarried (Lillard and Waite 1993; Dupre et al. 2009; Rendall et al. 2011), appear to have better mental health than their unmarried counterparts (Lamb et al. 2003; Horwitz and White 1998; Soons and Kalmijn 2009; Meadows 2009), and are less likely to engage in unhealthy behaviors (Duncan et al. 2006). At the same time, previous results show a positive association between health and age at first birth. The relationship is parabolic for women, with maximum health predicted for mothers who had a first birth around age 30 (Mirowsky 2002).

Most studies examine health differences by marital status in order to identify the causal effect of marriage. Generally, they compare the health outcomes of married men and women with those of unmarried (or cohabiting) people, or they examine the effect of changes in marital status across the life course (Nock 1981). Fewer studies adopt a complete life course perspective. The life course paradigm assumes that individuals, as human agents, build their future on the basis of the constraints and opportunities experienced in the past (Elder 1994). The process is iterative and cumulative, since initial advantages or disadvantages are often amplified with time (Giele and Elder 1998). In addition, different life domains are strongly interdependent.

Elder (1985) observes that a trajectory can also be envisioned as a sequence of transitions that are enacted over time. A transition is a discrete life change or event within a trajectory (e.g., from single to married), whereas a trajectory is a sequence of linked states within a conceptually defined range of behavior or experience. Using longitudinal or retrospective data, family trajectories can be described by the complete sequence over time of union status, childbearing and, possibly, work status. Life course scholars stress the importance of the long-term effects of trajectories (Soons et al. 2009), together with other characteristics of life history. Rather than investigating the contemporaneous association between marital status and wellbeing, life course analysis looks at the entire development of family history, i.e., the whole trajectory. Under this perspective, characteristics such as the type, number, and duration of unions, or the order of events, may have an effect on later health outcomes (Peters and Liefbroer 1997).

This paper investigates the role of the family trajectory, i.e., the whole sequence of family events during the life course of young adults, in shaping their health outcomes. Union formation and childbearing are jointly considered, since the two life domains are highly connected and their intersections may have an effect on health outcomes. This paper is divided into two parts. In the first part, the analysis focuses on transitions and investigates whether changes in timing (when events happen), quantum (what and how many transitions), and ordering (in what order) (Billari et al. 2006; Billari 2005) have an effect on the health of young women. In the second part, life course trajectories are classified into six groups representing different ideal-types of family trajectories, and the association of these trajectories with health outcomes is explored.

2 Theoretical and Empirical Background

According to the life course health development (LCHD) model, health is the result of a continuous process that develops over an individual’s lifetime (Halfon and Hochstein 2002). In the LCHD model, health is a consequence of multiple factors operating in nested genetic, biological, behavioral, social, and economic contexts. These contexts change during the life course. Therefore, health is seen as an adaptive process, composed of multiple transactions between the contexts mentioned above (e.g., genetic, social) and the biobehavioral regulatory systems (e.g., neurological, endocrine) that define human functions (Halfon and Hochstein 2002). In other words, health is not a static phenomenon. It develops over time and changes as a function of experience. The LCHD model suggests that a person’s health takes on a trajectory that results from the cumulative influence of multiple risk and protective factors over the life course. Health, in turn, is a multidimensional concept that encompasses a large array of measures, including behavioral, physical, and emotional outcomes.

The association between family transitions and health is well documented. Changes in the family structure may affect health in several ways. In particular, Wood et al. (2007) distinguish five different health dimensions: health behaviors, mental health, physical health and longevity, health care access and use, and intergenerational health effects. In this paper, since the focus is on young adulthood, only the first three dimensions will be considered. Using a sample of young women in the United States, the consequences of family trajectories on self-reported health, depression, drinking, and smoking behaviors are studied. A large number of works demonstrate that married people are healthier, happier and less likely to engage in health threatening behaviors (for a review, see Wood et al. 2007; Schoenborn 2004).

In the literature, the benefits associated with marriage are generally called the “protection effects” of marriage (Waldron et al. 1996). In their review, Musick and Bumpass (2006) suggest four possible explanations: institutionalization, social roles, social support, and commitment. Marriage is an institution where spouses have defined social roles both inside and outside the household (Gove 1972; Ferree 1990). Moreover, marriage is a source of social support. Spouses provide intimacy, companionship, and daily interaction. At the same time, married people are connected to a larger network (e.g., friends, kin). This enlarges the social capital from which spouses can draw in case of need. Last, the public nature of marriage strengthens commitment and facilitates joint long-term investments, including financial investments, role specialization, and time spent in the care of young children. Commitment strengthens bonds between partners and serves as a barrier to exit. It is not clear, however, whether these benefits are unique to marriage or whether they extend to other intimate relationships, particularly cohabitation. Evidence is mixed: Wu and Hart (2002) find no health effects of entering into marriage or cohabitation in Canada. Horwitz and White (1998) find differences in happiness, but no disadvantages in terms of depression. Musick and Bumpass (2006) examine several dimensions of wellbeing, including psychological health, social ties, and relationship quality, and do not find significant differences between married people and cohabiters. In comparative research using data from 30 European countries, Soons and Kalmijn (2009) find that the cohabitation gap (with respect to marriage) in wellbeing is associated with the degree of acceptance of non-marital unions in the society.

Although there is an extensive literature on the association between marital status and health outcomes, a number of issues motivate the analysis of entire life course trajectories rather than single transitions in marital status. First, when data on marital status are collected in a longitudinal survey, we often do not consider what happens between the time points that are taken into consideration. Cohabitation and marriage are not mutually exclusive. In the United States, about half of young adults live with a partner before marrying. For some people, cohabitation is a prelude to marriage or a trial marriage. For others, a series of cohabiting relationships may be a long-term substitute for marriage (Cherlin 2005). Although cohabitation has become common in the United States, it rarely lasts long. About half of cohabiting relationships end through marriage or a breakup within a year (Seltzer 2004; Bumpass and Lu 2000). If we consider only the change in marital status between two waves of a longitudinal survey, we may ignore possible variations occurring in between. This may lead to considerable bias if the time between two data collections is sufficiently large. For instance, we may not distinguish between an individual married for the first time and another one who remarried after a separation. Also, since many married people experience cohabitation, it may be difficult to isolate the causal effect of marriage. Does marriage have a different effect if it is preceded by cohabitation? In this case, does the time of exposure to premarital cohabitation matter?

Second, although longitudinal studies control for selection into family formation, this is generally done by looking at single transitions, without taking into consideration simultaneous changes in other life domains. Union status is clearly connected with other events that happen during the life course. Having a child, leaving the parental home, finishing school, and starting to work are closely connected with the probability of entering (or exiting) a union. For example, a couple may decide to marry because of an unplanned pregnancy, or they may decide to postpone marriage until one or both reach economic independence. Since different domains are closely intertwined, it may be difficult to identify the effect of a single event, such as marriage or entering a cohabitation, without taking into consideration other transitions such as childbearing.

Last, standard analyses do not consider variations in the timing, quantum, and ordering of life course trajectories. It is not clear how changes in the structure of trajectories affect health outcomes later in life. Most research does not take into account when transitions occur (timing), how many occur (quantum), and in what order they happen (ordering). Transitions that occur in different periods of life may have a different effect on wellbeing. For instance, age at first union may be associated with health outcomes. Marriages at age 18 and 30 are qualitatively very different. Numerous studies show that individuals who marry at a young age have a higher risk of marital dissolution (Martin and Bumpass 1989; Bumpass et al. 1991; Lehrer 1988; Teachman 2002). At the same time, the sequence of events is relevant to the study of the family life course. Does marriage have the same effect on health if it is preceded by the birth of a child? Evidence shows that unmarried mothers fare worse in the marriage market, because they have greater chances of partnering with poorly educated and unemployed men (Ermisch and Pevalin 2005). However, it is not clear whether this increases the risk of having worse health outcomes. Moreover, trajectories may be very different in terms of complexity. Some individuals may experience a large number of transitions while others may not. Does stability in family trajectories affect health outcomes? Does the number of transitions matter? Some scholars argue that the overall structure of the life course has changed in profound ways, becoming “de-standardized,” “de-institutionalized,” and increasingly “individualized” (Macmillan 2005; Shanahan 2000; Elzinga and Liefbroer 2007). It is not clear, however, what the consequences of a de-standardization of the family life course are.

From a life course perspective, health outcomes are the result of the cumulative influence of multiple risks and protective factors experienced during the life course (Halfon and Hochstein 2002; Ben-Shlomo and Kuh 2002; Oxford et al. 2006; Harris and Eileen 2010). Under this perspective, it is necessary to take into account the whole life course development in order to study effects on health outcomes. On the other hand, taking the whole trajectory as an input in statistical analysis is not straightforward (George 2009; Amato and Kane 2011). In this study, sequence analysis techniques are used to capture characteristics of the family trajectory such as complexity, ordering, and timing. Then, using Optimal Matching (OM) (Abbott 1995; Abbott and Tsay 2000) and clustering techniques, typical pathways of family formation are derived from the data. Rather than identifying a causal effect of single family transitions, the aim of this paper is to explore associations between health outcomes and typologies of family trajectories. It is possible that certain typologies of family formation are associated with poor health outcomes. This is relevant from a policy point of view. The potential benefits of marriage have influenced, at least in part, several US governmental initiatives in recent years that encourage and support marriage (Lichter et al. 2003; Acs 2007). This, in turn, has led to a debate within the scientific community on the effectiveness of pro-marriage policies (McLanahan 2007; Amato 2007; Nock 2005). However, the “de-standardization” of the life course has led to a large variety of patterns of family formation that go beyond marriage. The study of family trajectories may highlight disadvantaged situations and may permit the design of appropriate interventions.

3 Contribution of the Current Study

The aim of this study is to explore the association between wellbeing and family trajectories from a life course perspective. The main goal is to analyze whether particular family trajectories are associated with a reduction in health status. To evaluate wellbeing, the paper focuses on the analysis of four different health outcomes: self-reported health, depression, and two risky behaviors (heavy drinking and smoking). Although the linkage between marital status and health differs by gender (Williams and Umberson 2004; Umberson et al. 2006), the analysis presented in this paper is restricted to women because of substantial data limitations.

A trajectory is defined as the monthly sequence of family states. The state space is defined as follows. For every woman in the sample, information about marriage and cohabiting relationships is collected. Moreover, information about age (in months) at first birth is included. The combination of union status and parenthood gives six states: Single, Single Parent, Cohabiting, Cohabiting Parent, Married, and Married Parent. Union states are reversible, since from cohabitation it is possible to go into marriage or to return to single after a family disruption. Parenthood, instead, is not reversible, i.e., from Single Parent a woman can only go to Cohabiting Parent or Married Parent. The six-state configuration follows the work of Schoen et al. (2007), where the authors examined early family transitions using a multi-state life table framework. The order of transitions is addressed in a precise way using a monthly time scale. Unlike Amato et al. (2008) and Amato and Kane (2011), only family events (i.e., unions and childbearing) are taken into consideration, in order to focus on the relationship between health and family trajectories.
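The state space and the irreversibility of parenthood can be sketched in a few lines. This is only an illustration: the short state codes (S, SP, C, CP, M, MP), the table `STATES`, and the function `is_allowed` are hypothetical names, and the rule encoded here (union status may change freely, parenthood can only be gained) is one reading of the description above rather than the author's code.

```python
# Six family states: union status crossed with parenthood.
# Each code maps to (union status, is_parent).
STATES = {
    "S": ("single", False), "SP": ("single", True),
    "C": ("cohabiting", False), "CP": ("cohabiting", True),
    "M": ("married", False), "MP": ("married", True),
}

def is_allowed(origin: str, destination: str) -> bool:
    """True if moving between two distinct states keeps parenthood irreversible."""
    if origin == destination:
        return False  # remaining in the same state is not a transition
    origin_parent = STATES[origin][1]
    destination_parent = STATES[destination][1]
    # A parent can never move back to a childless state.
    return not (origin_parent and not destination_parent)

# All ordered pairs of states that count as admissible transitions.
allowed = sorted((a, b) for a in STATES for b in STATES if is_allowed(a, b))
```

Under this reading, a woman in Single Parent can only move to Cohabiting Parent or Married Parent, which matches the example given in the text.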

Following a life course perspective, the analysis focuses on the association between different types of family trajectories and self-reported health, depression symptoms, and risky behaviors. In the first part of the empirical analysis, variations in timing, quantum, and ordering of family transitions are treated separately. In the second part, family trajectories are classified into homogeneous groups sharing similar characteristics. The effect of selection and confounding variables is considered using appropriate statistical models. With reference to variation in timing, quantum, and ordering, three different research hypotheses are specified.

H1

Women who have earlier transitions have poorer health outcomes. (Timing hypothesis)

Women who postpone family formation are expected to be more likely to accumulate human capital. Young mothers or young women who enter a union have, in fact, less time to accumulate resources that contribute to avoiding poor health and depression (Miech and Shanahan 2000). Educational attainment is strongly associated with postponement of family formation. Higher education prevents women from engaging in behaviors that can damage their health. Furthermore, less educated women are more likely to partner with less educated men, who have a higher probability of being unemployed and lower incomes. Last, early marriage and early motherhood are associated with a higher probability of marital disruption that, in turn, is associated with major stress (Ermisch and Pevalin 2005; O’Connell and Rogers 1984).

H2

Women with unstable trajectories have lower health outcomes. (Quantum hypothesis)

Women who experience a large number of transitions are more likely to have less stable unions and may experience more traumas that can be dangerous for health development. This dimension refers to the number of distinct sub-patterns of family life trajectories, such as intermittent spells of living single and living with a partner, whether married or unmarried (Elzinga and Liefbroer 2007). The lack of stability in family roles may be associated with more stress and less support from others.

H3

Women who have more non-normative transitions experience lower health outcomes. (Order hypothesis)

Family transitions are not qualitatively equivalent. Family transitions recognized by society as “normative” are not expected to have a negative effect on health. On the other hand, “non-normative” transitions are expected to be associated with poorer outcomes. The concept of “disorder” was introduced for the first time by Rindfuss et al. (1987) in the study of the transition to adulthood and parenthood. Individuals have expectations in terms of the role they assume in society. A disordered life course may decrease the probability of achieving desired social roles and may prevent women from fulfilling their expectations. Individuals have expectations about the order of life course events, even if sanctions are not applied. In fact, many sociological theories build in an expected ordering of events in the transition to family. For example, first marriage is still sometimes equated with the beginning of exposure to the risk of parenthood. The variable ordering of events in the life course is a contingency of some importance in the life cycle (Hogan 1978).

The second part of the empirical analysis focuses on family pathways. Since the number of possible family trajectories is enormous, homogeneous clusters of trajectories are derived from the data. The resulting typologies of family pathways simultaneously describe different combinations of timing, quantum, and ordering. In analogy with Amato et al. (2008), family formation is described using typical patterns of formation derived from empirical observation. The advantage of using classes is to reduce the (almost) unlimited number of combinations to a manageable number of groups that can easily be described. Unlike other studies (e.g., Amato et al. 2008), the interest is not in the precursors of different family pathways, but rather in their consequences. Studying the health outcomes of family typologies may help highlight disadvantages among subgroups of the population.

4 Data and Methods

4.1 Sample

The data used come from Waves I and IV of the National Longitudinal Study of Adolescent Health (Add Health). Add Health is a nationally representative longitudinal sample of US adolescents who were in grades 7–12 in 1994–1995. In the first wave, data were collected through in-home interviews with the adolescent participants and one of their parents. Typically, the parent interview was completed by the biological mother. Adolescents were interviewed again in a second wave 1 year later, in 1996, again in a third wave collected in 2001–2002, and finally in a fourth wave in 2008–2009. At the time of wave IV, respondents ranged in age from 26 to 33 years. Since the goal of this study is to explore the implications of early life course trajectories, the sample is restricted to women who are 30 or older at wave IV. The decision to exclude men from the analysis is due to substantial data limitations. As explained by Schoen et al. (2007) and Amato et al. (2008), there is a systematic misreporting of childbirths in the fertility history modules (refer to the mentioned studies for further details). However, while it is possible to make use of the information in the household roster to adjust omitted fertility data for women (we followed the procedure described by Schoen et al. 2007, p. 810), this was not possible for men, so they were excluded from the study sample. Of this sample (n = 2,358), wave IV weights are missing for 101 women. After dropping these cases, the final sample size is 2,259. At the time of the wave IV data collection, 27 % of the women in the sample were 30 years of age, 54 % were 31 years of age, and 19 % were 32 years of age. Using retrospective questions from wave IV, the family biographies of women were reconstructed from age 15 to their age at wave IV. Attrition is controlled for using the longitudinal weights available from the Add Health study.

4.1.1 Health Outcomes

The following indicators are created to analyze different aspects of health status, with measures available both at wave I and at wave IV. Measures indicate physical health, mental health, drinking, and smoking behaviors.

4.1.1.1 Self-reported Health

Current health status was assessed with one question, “In general, how is your health?” (1 = excellent, 2 = very good, 3 = good, 4 = fair, 5 = poor). Health status is therefore expressed in reverse order: greater values indicate poorer health. Women reporting poor or fair health make up 11 % of the sample.

4.1.1.2 Depression

A measure of depression has been constructed using questions from the Center for Epidemiologic Studies Depression (CESD) Scale (Radloff 1977). In particular, nine questions out of this scale were asked (each based on the frequency of the event during the past seven days): “bothered by things that usually don’t bother you,” “couldn’t shake off the blues,” “felt just as good as other people,” “had trouble keeping your mind on what you were doing,” “felt depressed,” “felt too tired to do things,” “enjoyed life,” “felt sad,” and “felt that people disliked you” (0 = never or rarely, 1 = sometimes, 2 = a lot of the time, and 3 = most of the time or all of the time). When appropriate, the coding was reversed so that high scores reflected high levels of depression. This indicator ranges from 0 to 27.
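The construction of the depression score can be sketched as follows. The item keys and the function `depression_score` are hypothetical names. The text does not state which items were reverse-coded, so the sketch assumes it is the two positively worded ones (“felt just as good as other people” and “enjoyed life”).

```python
# Nine CES-D items in the order listed in the text; each is answered 0-3.
ITEMS = ["bothered", "blues", "as_good", "trouble_focus", "depressed",
         "too_tired", "enjoyed_life", "sad", "disliked"]

# Assumption: the two positively worded items are the reverse-coded ones.
REVERSED = {"as_good", "enjoyed_life"}

def depression_score(answers: dict) -> int:
    """Sum the nine items (0-27), reversing positive items so high = more depressed."""
    total = 0
    for item in ITEMS:
        value = answers[item]
        if value not in (0, 1, 2, 3):
            raise ValueError(f"{item}: response must be 0, 1, 2 or 3")
        total += (3 - value) if item in REVERSED else value
    return total

# A respondent who answers "never or rarely" to every item, positive ones included.
all_never = {item: 0 for item in ITEMS}
```

With this coding, the score reaches its maximum of 27 only when every negative item is answered 3 and both positive items are answered 0.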

4.1.1.3 Smoking

Smoking behavior is measured with a dichotomous variable indicating whether the respondent has smoked at least one cigarette in the last 30 days. The percentage of women who report smoking at wave IV is 27 %.

4.1.1.4 Drinking

Alcohol consumption has been measured using this question: “Within the last 12 months, on how many days did you drink five or more drinks in a row?” Response options were 0 = never, 1 = 1 or 2 days, 2 = once/month or less, 3 = 2 or 3 days/month, 4 = 1 or 2 days/week, 5 = 3–5 days/week, and 6 = every day or almost every day. The original variable has been recoded into a binary variable that takes the value 1 if the respondent consumed five or more drinks in a row at least once during the last year.
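The recoding of the drinking item is a simple threshold on the response code. A minimal sketch, with the hypothetical function name `heavy_drinking`:

```python
def heavy_drinking(response_code: int) -> int:
    """Recode the 0-6 frequency item: 1 if any heavy-drinking day was reported, else 0."""
    if response_code not in range(7):
        raise ValueError("response code must be an integer from 0 to 6")
    return int(response_code >= 1)
```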

4.1.2 Background Characteristics

To control for compositional characteristics, the models include some indicators of demographic and socioeconomic status. There may be, in fact, interactions between family events and background characteristics. For instance, Harris et al. (2010) observed that early marriage among young adults does not have the protective effect for African Americans that is observed for Whites. Race/ethnicity is included as a set of categories: Hispanic, Black, and Asian, with White as the reference group. Parents’ education is taken into account with a dummy variable indicating whether at least one of the parents has a college education. Family composition at wave I is also included: a dummy variable indicates whether the respondent lived with both biological parents at the time of the first interview. Last, continuous values of age and age squared (measured at wave I) are included in the regression models.

4.2 Methods

Life course trajectories are represented by the monthly configuration of union and childbearing status from age 15 to 30. The state space is designed to take six possible values: Single (S), Single Parent (SP), Cohabiting (C), Cohabiting Parent (CP), Married (M), and Married Parent (MP). In sequence analysis, each life course or trajectory is represented as a string of characters, similar to the one used to code DNA molecules in the biological sciences. Thus, every trajectory is composed of a string of 12 × 15 = 180 values. The number of possible combinations is extremely large, so the raw sequences cannot be handled directly with standard statistical techniques. From a statistical point of view, sequences can be thought of as the realization of a stochastic process or, alternatively, as categorical time series. Life course sequences can be represented in several ways. A common approach is to describe the sequence with the state and its duration in time. For instance, an individual who stays single for 24 months, then cohabits for 12 months, and then marries and stays married for 24 months can be represented in this way:

$$ (\hbox{S},24)(\hbox{C},12)(\hbox{M},24) $$

The sequence in the example describes the union status of a person for a period of 5 years.
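The state-duration notation is a run-length encoding of the monthly string. A minimal sketch of the conversion in both directions, using the hypothetical helper names `to_spells` and `from_spells`:

```python
from itertools import groupby

def to_spells(sequence):
    """Collapse a monthly sequence into (state, duration in months) pairs."""
    return [(state, len(list(run))) for state, run in groupby(sequence)]

def from_spells(spells):
    """Expand (state, duration) pairs back into the monthly sequence."""
    return [state for state, months in spells for _ in range(months)]

# The example from the text: 24 months single, 12 cohabiting, 24 married.
monthly = ["S"] * 24 + ["C"] * 12 + ["M"] * 24
spells = to_spells(monthly)
```

Here `spells` holds the three pairs of the example, and their durations add up to the 60 months (5 years) covered by the sequence.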

Sequences differ in three dimensions: timing, quantum, and ordering. Some basic indicators are defined to measure variations in those three dimensions. The proposed indicators are then used in regression analysis to evaluate the association with health outcomes.

4.2.1 Timing

Timing refers to the duration of events, and specifically to the age at which different transitions happen in the life course. Three indicators are proposed to measure timing:

  • Age at first transition (i.e., the earliest between first union and first child).

  • Age at first union.

  • Age at first child.

The three indicators refer to the period from age 15 to 30. Only individuals who experienced the event by age 30 are considered. In the Add Health data, by age 30, 94.4 % of women had exited singlehood, 93.6 % had experienced a union, and 64.6 % had become mothers.
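The three timing indicators can be read directly off a monthly sequence. The sketch below assumes that position 0 of the sequence is the first month of age 15 and uses hypothetical names (`age_at_first`, `timing_indicators`); `None` marks women who had not experienced the event by the end of the sequence.

```python
START_AGE = 15  # assumption: index 0 is the first month of age 15
UNION_STATES = {"C", "CP", "M", "MP"}
PARENT_STATES = {"SP", "CP", "MP"}

def age_at_first(sequence, target_states):
    """Age in years at the first month spent in any target state, or None."""
    for month, state in enumerate(sequence):
        if state in target_states:
            return START_AGE + month / 12
    return None

def timing_indicators(sequence):
    """Ages at first union, first child, and first transition out of Single."""
    return {
        "first_union": age_at_first(sequence, UNION_STATES),
        "first_child": age_at_first(sequence, PARENT_STATES),
        "first_transition": age_at_first(sequence, UNION_STATES | PARENT_STATES),
    }

# Hypothetical woman: cohabits at 20, marries at 22, first child at 25.
example = ["S"] * 60 + ["C"] * 24 + ["M"] * 36 + ["MP"] * 60
timing = timing_indicators(example)
```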

4.2.2 Quantum

Quantum indicates the number of events in a trajectory. Two indicators are proposed to evaluate the quantum of a sequence:

  • Number of events from age 15 to 30.

  • Sequence turbulence.

The first is the number of transitions experienced from age 15 to 30, without distinguishing between types of transitions. The second is an indicator proposed by Elzinga and Liefbroer (2007) that measures the dynamics of a categorical time series. Besides the number of transitions, turbulence takes into account the duration spent in different states. The turbulence index is, in fact, a composite measure of two aspects: variability in the time spent in different states and the number of distinct subsequences that can be extracted from the sequence. It gives an overall measure of the degree of disorder of a life trajectory (see e.g., Elzinga et al. 2008; Elzinga and Liefbroer 2007; Widmer and Ritschard 2009).
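Both quantum indicators can be sketched from the spell representation of a sequence. The turbulence function below follows one commonly cited form of the index, log2 of the number of distinct subsequences of the succession of states multiplied by (s²max + 1)/(s² + 1), where s² is the variance of the spell durations and s²max = (n − 1)(1 − mean duration)² for n spells. The use of the population variance and all function names are assumptions of this sketch, not details taken from the paper.

```python
from itertools import groupby
from math import log2

def spells(sequence):
    """Collapse a monthly sequence into (state, duration) pairs."""
    return [(state, len(list(run))) for state, run in groupby(sequence)]

def n_transitions(sequence):
    """Number of changes of state, whatever their type."""
    return len(spells(sequence)) - 1

def n_distinct_subsequences(states):
    """Count distinct subsequences of a list of states (the empty one included)."""
    total = 1
    seen = {}  # count just before the previous occurrence of each state
    for state in states:
        before = total
        total = 2 * total - seen.get(state, 0)
        seen[state] = before
    return total

def turbulence(sequence):
    """Composite of subsequence count and duration variance (assumed form)."""
    sp = spells(sequence)
    durations = [d for _, d in sp]
    n = len(durations)
    mean = sum(durations) / n
    variance = sum((d - mean) ** 2 for d in durations) / n  # population variance
    variance_max = (n - 1) * (1 - mean) ** 2
    phi = n_distinct_subsequences([s for s, _ in sp])
    return log2(phi * (variance_max + 1) / (variance + 1))
```

Under this form, a woman who stays in a single state for the whole observation window gets the minimum value, and trajectories with more spells of uneven... more spells of even length score higher.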

4.2.3 Ordering

Ordering indicates the order in which events happen during the life course. Two indicators are proposed to evaluate the order in a family sequence:

  • Number of normative transitions from age 15 to 30.

  • Number of non-normative transitions from age 15 to 30.

Transitions are divided into two groups: normative and non-normative transitions. Normative transitions are events in life course that are commonly accepted in the society (Rindfuss et al. 1987). In this study, “normative” refers to the sequence of events with this order: Single–Married–Married Parent. Each variation to this pattern is classified as “non-normative”. It follows that premarital childbearing, cohabitation, and any union disruptions are considered non-normative. The concept of normative is certainly arbitrary and relative to the society in which the study takes place. Since long-term cohabitation in the United States is not very common and marriage is still the primary form of union, cohabitation is included in the list of non-normative transitions. Table 1 illustrates the classification of transitions.

Table 1 Normative and non-normative transitions
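Counting normative and non-normative transitions amounts to classifying each change of state in the sequence. Because the body of Table 1 is not reproduced here, the sketch below assumes that only the two steps along Single–Married–Married Parent are normative and that every other move, including the move from cohabitation to marriage, is non-normative. The names `NORMATIVE`, `transitions`, and `count_by_norm` are hypothetical.

```python
from itertools import groupby

# Assumption: only the steps of Single -> Married -> Married Parent are normative.
NORMATIVE = {("S", "M"), ("M", "MP")}

def transitions(sequence):
    """List the (origin, destination) pairs of successive distinct states."""
    states = [state for state, _ in groupby(sequence)]
    return list(zip(states, states[1:]))

def count_by_norm(sequence):
    """Number of normative and non-normative transitions in a sequence."""
    moves = transitions(sequence)
    normative = sum(move in NORMATIVE for move in moves)
    return {"normative": normative, "non_normative": len(moves) - normative}

# Hypothetical trajectories: direct marriage versus marriage after cohabitation.
direct = ["S"] * 60 + ["M"] * 24 + ["MP"] * 96
via_cohabitation = ["S"] * 60 + ["C"] * 24 + ["M"] * 24 + ["MP"] * 72
```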

4.2.4 Regression Models

To examine the relationship between the above indicators and health outcomes, regression models that take into account the effect of selection and confounding variables are used. The aim is to identify whether the change in the four outcomes between wave I and wave IV is attributable to some characteristics of family transitions. The time span between the two waves in consideration is around 15 years. In wave I, the respondents are teenagers (age 13–16), while in the last wave they are 30–33 years old. This means that the two time points considered represent two qualitatively very different periods in life. Health is a continuous process that develops across time. Health in early adulthood is very likely to be influenced by the level of health experienced in adolescence, childhood, infancy, and during the mother’s pregnancy. Previous health levels, in turn, influence family transitions. To account for selection issues, the model includes previous health outcomes as regressors. To examine the impact of these indicators on health, a change (or lagged dependent variable) model is used. This class of regression models sets health at wave IV as a function of the initial level of adolescent health at wave I (Allison 1990; Johnston 1995). The model includes the characteristics of the trajectory, a set of time-invariant socioeconomic status (SES) indicators, and control variables measured at wave I.

The simple model is depicted in Eq. (1)

$$ g(E[Y_{ij2}])=\gamma_j D_i+ \rho_j Y_{i1}+\beta_j X_{i1} \quad\hbox{for}\;j=1, \ldots,4;\quad i=1, \ldots, n. $$
(1)

Here $Y_{ij2}$ represents the $j$th health indicator measured at wave IV (Time 2) for person $i$, and $Y_{i1}$ represents a vector of identical health measures at wave I (Time 1). $X_{i1}$ is a vector of demographic controls and SES background at wave I. The vector $D_i$ represents the characteristics of the sequence from wave I to wave IV. In line with previous studies, self-reported health and depression are treated as continuous variables and modeled using linear regression (Amato and Kane 2011). Model diagnostics, available from the author upon request, confirm that residuals are normally distributed and thus consistent with a linear regression model. Moreover, to take into account possible heteroskedasticity in the residuals, robust standard errors have been estimated (White 1980). As indicated by Angrist and Pischke (2008, pp. 45–46), these standard errors are said to be robust because, in large enough samples, they provide accurate hypothesis tests and confidence intervals given minimal assumptions about the data and model. Smoking and drinking behavior have been analyzed using logistic regression. The link function $g(\cdot)$ in Eq. (1) depends on the statistical model used for each outcome: identity for self-reported health and depression, logit for smoking and drinking.
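Written out for the two link functions named above, Eq. (1) takes the following forms. This is only a restatement of the same model; the error term $\varepsilon_{ij}$ in the linear case is notation added here for illustration.

$$ Y_{ij2}=\gamma_j D_i+\rho_j Y_{i1}+\beta_j X_{i1}+\varepsilon_{ij} \quad (\hbox{self-reported health, depression}) $$

$$ \log\frac{\Pr(Y_{ij2}=1)}{1-\Pr(Y_{ij2}=1)}=\gamma_j D_i+\rho_j Y_{i1}+\beta_j X_{i1} \quad (\hbox{smoking, drinking}) $$

In both cases, $\rho_j$ captures the dependence of the wave IV outcome on the wave I health measures, so $\gamma_j$ describes the association with trajectory characteristics net of initial health.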

4.2.5 Extracting Typologies of Life Trajectories

The indicators proposed in the previous section are useful to describe some characteristics of the life trajectory. However, they do not give any indication about the “type” of sequence. To describe family trajectories completely, we need to study timing, quantum, and ordering in life course sequences simultaneously (Billari 2005). The complexity of the life course suggests the adoption of a holistic approach, where all the different components of the life course are taken into account. Abbott (1995) was the first to introduce sequence analysis in the social sciences, using the OM algorithm as a method to compare different life sequences. This method has been used for the alignment of biosequences. The basic idea behind OM is to measure the dissimilarity of two sequences by considering how much effort is required to transform one sequence into the other. Transforming sequences entails three elementary operations:

  • insertion

  • deletion

  • substitution

A specific cost can be assigned to each operation, and the total cost of applying a series of elementary operations can be computed as the sum of the costs of the single operations. The distance between two sequences can thus be defined as the minimum cost of transforming one sequence into the other. Computing this distance for every pair of sequences yields a symmetric matrix of pairwise distances that can be used for further statistical analysis, mainly multivariate analysis. OM is a family of dissimilarity measures between sequences derived from the distance originally proposed in the field of information theory and computer science by Levenshtein (1965), with the difference that in OM the three operations have different costs (Lesnard 2006). The choice of the operations’ costs determines the matching procedure and influences the results obtained. This is a major concern about the use of this technique in the social sciences (Wu 2000). A common solution for setting the substitution costs is to use the inverse of the transition probability, in order to assign higher costs to the less common transitions (Piccarreta and Billari 2007). This strategy is adopted in the empirical analysis.
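The minimum-cost definition above can be sketched with the standard dynamic programming recursion for edit distances. This is an illustration under stated assumptions, not the implementation used in the analysis: the function names are hypothetical, a single insertion/deletion cost is assumed, and the substitution cost is passed in as a function so that any cost scheme (for example one based on transition probabilities) can be plugged in.

```python
def om_distance(seq_a, seq_b, indel_cost, sub_cost):
    """Minimum total cost of transforming seq_a into seq_b.

    indel_cost: cost of one insertion or one deletion (assumed equal here).
    sub_cost:   function (state_a, state_b) -> cost of substituting one state
                for another; it is only called for states that differ.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimum cost of turning the first i states of seq_a
    # into the first j states of seq_b.
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * indel_cost  # delete all i states
    for j in range(1, m + 1):
        cost[0][j] = j * indel_cost  # insert all j states
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a, b = seq_a[i - 1], seq_b[j - 1]
            substitute = cost[i - 1][j - 1] + (0.0 if a == b else sub_cost(a, b))
            delete = cost[i - 1][j] + indel_cost
            insert = cost[i][j - 1] + indel_cost
            cost[i][j] = min(substitute, delete, insert)
    return cost[n][m]


def distance_matrix(sequences, indel_cost, sub_cost):
    """Symmetric matrix of pairwise OM distances (symmetric sub_cost assumed)."""
    k = len(sequences)
    matrix = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            d = om_distance(sequences[i], sequences[j], indel_cost, sub_cost)
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

With a constant substitution cost of 2 and an insertion/deletion cost of 1, transforming `S S C M` into `S C M MP` costs 2: one deletion of `S` and one insertion of `MP` are cheaper than three substitutions.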

Sequence analysis has been adopted in demography to study complex phenomena and, in particular, to study multiple demographic transitions simultaneously (see e.g., Billari 2001; Barban and Billari 2012). Once the dissimilarity matrix has been obtained, we can apply standard data reduction techniques to classify trajectories into homogeneous groups. The resulting groups are then used to describe “typical” patterns of transitions. Following the approach of McVicar and Anyadike-Danes (2002), a cluster analysis using Ward’s algorithm identifies six clusters of life sequences. Clusters can be described by choosing a representative sequence. Aassve et al. (2007) suggest that groups be identified using the medoid sequence, that is, the sequence with the minimum distance from all of the other sequences in that cluster. The advantage of using medoid sequences is that each cluster is described by a real sequence that best represents the group.
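The medoid definition can be made concrete with a short sketch. It assumes that a pairwise distance matrix and a cluster label for each sequence are already available, and it interprets "minimum distance from all of the other sequences" as the minimum sum of distances to the other members of the cluster; the function names are hypothetical.

```python
def medoid_index(dist, members):
    """Index of the member with the smallest total distance to the other members."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


def cluster_medoids(dist, labels):
    """Map each cluster label to the index of its medoid sequence.

    dist:   square matrix (list of lists) of pairwise distances.
    labels: cluster label of each sequence, in the same order as dist.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    return {label: medoid_index(dist, members) for label, members in clusters.items()}
```

Because the medoid is always one of the observed sequences, it can be reported directly as a typical trajectory, which is not true of an averaged or synthetic representative.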

This group characterization of life sequences can be used as an input for further analysis, in particular regression analysis, in order to explore the consequences of different life trajectories. For instance, Mouw (2005) uses the output of a clustering procedure as an input for a regression analysis under the heading “Does the sequence matter?” Regression analyses show important differences in the risk of experiencing outcomes such as poverty at age 35. Sequences are also found to influence subsequent happiness and depression status. In this study, the consequences of family trajectories on health outcomes are analyzed. Typical trajectories are derived from the data using cluster analysis of family sequences from age 15 to 30. Only sequences for ages 15–30 are considered, in order to have sequences of the same length for all individuals. The resulting groups are then used as a categorical variable in a regression analysis. Using different “typologies” of trajectory allows analysis of the change in health status among different groups of individuals. This clustering procedure, for instance, identifies the group of single mothers who experience the birth of their first child outside a union and do not experience a stable union after childbearing. It is important, from a policy point of view, to understand whether any particular trajectory is associated with a decrease in health status. However, health status is measured at different ages for different individuals. This creates an asynchrony between the outcome and the time window used to describe the covariate. The ideal situation would be to have individuals interviewed at the same age. To control for age effects, age and age squared are included in the estimation.
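One way to enter cluster membership into a regression is through indicator (dummy) variables with one cluster omitted as the reference category, alongside age and age squared. The sketch below builds such a row of regressors; the function name, the cluster coding 1 to 6, and the choice of reference cluster are assumptions made for illustration only.

```python
def design_row(cluster, age, clusters, reference):
    """Regressors for one individual: cluster dummies, age, and age squared.

    cluster:   cluster label of the individual.
    age:       age at the time the outcome is measured.
    clusters:  ordered list of all cluster labels.
    reference: label of the omitted (reference) cluster.
    """
    dummies = [1.0 if cluster == c else 0.0 for c in clusters if c != reference]
    return dummies + [float(age), float(age) ** 2]
```

For example, with clusters coded 1 to 6 and cluster 2 taken as the reference, a 28-year-old woman in cluster 4 has a single dummy switched on, followed by 28 and 784.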

5 Analysis of Trajectories

It is important to examine events in the initial years of early adulthood because the large-scale changes in cohabitation, marriage, and non-marital fertility have particularly affected women aged 20–30. In terms of family transitions, those years are very “dense” (Rindfuss 1991), with more demographic events occurring than during any other part of the life course. Figure 1 shows the distribution of family states from age 15 to 30. At age 30 very few women are single (because they did not enter a union, or because of a disruption), 55 % are married, and 18 % are cohabiting. Cohabitation is more frequent than marriage until age 23, after which it decreases slightly. Motherhood increases with time, but it is predominant within marriage. Forty-four percent of 30-year-old women are married and have at least one child (MP), while 11 % are cohabiting mothers (CP), and 11 % are single mothers (SP). Only 35 % are childless, and most of them are single.

Fig. 1 Distribution of family states. Women aged 15–30, weighted frequencies

The distribution of family states gives a picture of family states by age, but gives no indication of the dynamics of trajectories. Table 2 shows the most frequent trajectories observed among women aged 15–30. The representation in Table 2 does not take into account the length of time spent in a state, but only the order of events. The most frequently occurring pattern (11 % of the sequences) includes cohabitation before marriage. The normative pattern of transitions is the second most common. Women who follow this pattern do not experience cohabitation. Only the fifth pattern contains individuals who do not experience any transition, while the sixth and the seventh indicate the presence of a union disruption. The first ten patterns cover 52 % of all cases.
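A representation that keeps only the order of events can be obtained by collapsing consecutive repetitions of the same state and then counting how often each collapsed pattern occurs. The sketch below illustrates this idea with hypothetical function names; it produces unweighted shares and is not the procedure used to build Table 2.

```python
from collections import Counter
from itertools import groupby


def distinct_states(sequence):
    """Collapse consecutive repeats, keeping only the order of states."""
    return tuple(state for state, _ in groupby(sequence))


def pattern_shares(sequences):
    """Patterns sorted from most to least frequent, with their share of all sequences."""
    counts = Counter(distinct_states(seq) for seq in sequences)
    total = len(sequences)
    return [(pattern, count / total) for pattern, count in counts.most_common()]
```

Two women who both move from singlehood to cohabitation to marriage fall into the same pattern even if one cohabits for a few months and the other for several years.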

Table 2 Ten most common sequence patterns of transitions in women 15–30

Table 3 reports the mean values of the indicators of timing, quantum, and ordering for women, conditional on their health status. Individuals with poor health status and depression symptoms have their first family transitions earlier than others. They usually experience more transitions, in particular non-normative ones. Similarly, smoking and drinking behavior are associated with an early exit from singlehood, a younger age at first union and first child, and more non-normative transitions.

Table 3 Indicators of timing, quantum, and ordering and health status

5.1 Results

Early transitions have a negative effect on self-reported health and smoking behavior. Table 4 reports the results of the regression analysis. These results indicate that, controlling for previous health and compositional characteristics, transitions under age 18 are associated with poor self-reported health and smoking. The dynamics of family trajectories have a similar effect. The number of transitions is associated with a negative effect on self-reported health and smoking behavior. The more transitions a woman experiences between wave I and wave IV, the more likely she is to smoke and report poor health (see Table 5). It is interesting to notice, however, what happens if we decompose the transitions into normative and non-normative (the distinction between normative and non-normative transitions is defined in Table 1). Results in Table 6 show that the two types of transitions have opposite effects. While non-normative transitions have a negative effect on health outcomes, normative transitions are associated with less unhealthy behavior. Non-normative transitions are associated with a decrease in self-reported health and an increase in depression symptoms. Concerning smoking and drinking behaviors, we observe a protective effect of normative transitions. Traditional family formation is therefore associated with a reduction in risky behaviors. Controlling for other variables, non-normative transitions are associated with an increase in the number of cigarettes smoked and in drinking occasions. A possible explanation is that non-normative transitions constitute major sources of stress. On the other hand, people who follow a normative path may receive greater support from friends and family.

Table 4 Regression estimates
Table 5 Regression estimates
Table 6 Regression estimates

The estimation results in Tables 4, 5, and 6 show similar levels of correlation between health outcomes in wave I and wave IV. The inclusion of previous health outcomes allows selection issues to be taken into account. Results indicate that health status persists over time, since we can observe a temporal dependence in health status. Moreover, it is interesting to note that depression and smoking have cross effects on the other outcomes. Depression at wave I, in fact, is associated with poorer self-reported health, while the previous level of smoking is associated with a higher probability of engaging in drinking behavior during early adulthood. The models also include background variables indicating race composition, socioeconomic status, and family composition at the beginning of the transition. Although previous health outcomes control for health selection, background characteristics can affect the level of health at wave IV net of previous health outcomes. Estimates show that women with college-educated parents have better health outcomes and a lower propensity to smoke. The propensity to engage in risky behavior varies with race. Black and Hispanic girls tend to smoke and drink less than their White counterparts. Moreover, African American women have a general tendency to report lower levels of health. Overall, these results show that women who move away from a traditional pattern have a greater risk of reporting poor health and, above all, of engaging in risky behaviors. These results therefore show that timing, quantum, and sequencing are important factors in the study of family formation.

6 Typologies of Family Trajectories

The analyses presented in the previous section show that women who move away from a normative model (especially in terms of age at first transition and order of events) are the ones who are more likely to experience decline in health status. Poor health outcomes are associated with early transitions, high numbers of changes in family status, and non-normative order of events. Traditional transitions seem instead to have a protective effect, especially on behavior.

However, the previous analyses do not identify which types of family patterns are associated with changes in health status. From a policy point of view, we are interested in detecting which subgroups of the population are at risk of experiencing poor health, for example, single mothers (Furstenberg 2005, 1998, 1976). Previous studies show lower levels of health among single mothers, in particular poorer mental health (Cairney et al. 2003), a higher propensity to smoke (Francesconi et al. 2010), and higher mortality (Mirowsky 2005). Therefore, it is relevant to study the consequences of different patterns of family formation.

The number of possible combinations of sequences in family formation is almost unlimited. A convenient empirical strategy is therefore to reduce all the possible trajectories to a more manageable number. Cluster analysis identifies six groups of trajectories as representative of the entire set of sequences. A description of the sequences in each group is presented below (Fig. 2). Clusters are described using their medoid sequences.

  1. Married mothers (S, 73)(C, 11)(M, 12)(MP, 84); n = 693. This is the largest group in the sample (29 %). It is composed of women who follow a more traditional pattern, e.g., single women who marry and then have children. The medoid sequence represents a woman who enters a cohabitation at age 21, marries after 11 months of cohabitation, and has a child the following year at age 23. Almost all of the women in this group experience both marriage and motherhood. Cohabitation in this group is not rare, but generally short (less than a year on average). Women in this class start family transitions earlier than women in other groups (with the exception of single and cohabiting mothers). Although the number of transitions is comparable with the other groups, the number of non-normative transitions is limited.

  2. Late transitions (S, 168)(C, 12); n = 648. This group consists of women who start family transitions very late or who have not experienced any transition by age 30. They stay single for the majority of the sequence and they eventually experience a transition to cohabitation. Very few of them are married or have a child by age 30. The medoid sequence represents a woman who enters a cohabitation at age 29.

  3. Married women without children (S, 97)(C, 12)(M, 71); n = 315. This group differs from group 1 essentially for two reasons. Women in this group begin family transitions later and they remain married without a child for longer. The average time they spend married without children (M) is two and a half years, compared to 1 year in group 1. The result is that the majority of women in this group postpone childbearing until age 30. The majority of transitions are traditional and cohabitation is generally short. Above all, this group is characterized by a postponement of the traditional pattern.

  4. Single mothers (S, 71)(SP, 90)(CP, 19); n = 302. This group identifies women who became mothers without being in a partnership. The group is characterized by a very early transition to motherhood. Although there are some experiences of cohabitation, most of the time is spent outside a union. Women in this group experience on average more transitions than women in other groups. The majority of transitions are non-traditional. Single mothers are more likely to experience more than one cohabiting union.

  5. Cohabiting mothers (S, 59)(C, 15)(CP, 106); n = 237. Women in this group differ from single mothers mainly in that childbearing occurs during a cohabitation. This group is characterized by early transitions both to union and to motherhood. Like single mothers, they experience a large number of transitions, most of them non-normative.

  6. Cohabiting women (S, 92)(C, 84)(M, 4); n = 163. The last group is characterized by cohabitation. It accounts for roughly 7 % of women in the sample. Trajectories in this class are similar to group 2 (late transitions), with the difference that women in this group are more likely to enter a cohabitation. The number of transitions is relatively low. Childbearing is postponed to later ages.
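The medoid notation used in the list above can be read as a series of (state, duration) spells. The sketch below parses that notation and converts spell durations into ages at each transition. It assumes that durations are expressed in months and that every sequence starts at age 15; both assumptions are inferred from the descriptions of the medoids (each sequence sums to 180 months, i.e., 15 years), and the function names are hypothetical.

```python
import re

# Matches one spell such as "(S, 73)" or "(CP,106)".
SPELL_PATTERN = re.compile(r"\(\s*([A-Z]+)\s*,\s*(\d+)\s*\)")


def parse_spells(text):
    """Parse a medoid written as '(S, 73)(C, 11)...' into (state, months) pairs."""
    return [(state, int(months)) for state, months in SPELL_PATTERN.findall(text)]


def transition_ages(spells, start_age=15):
    """Age (in years) at which each state after the first one is entered."""
    ages = []
    elapsed_months = 0
    for (_, months), (next_state, _) in zip(spells, spells[1:]):
        elapsed_months += months
        ages.append((next_state, start_age + elapsed_months / 12))
    return ages
```

Applied to the medoid of group 1, the sketch gives entry into cohabitation shortly after the 21st birthday, marriage at 22, and a first child at 23, which agrees with the verbal description of that sequence.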

Groups differ in terms of compositional characteristics, in particular race composition and socioeconomic status (see Table 7). Groups 4 and 5 have a higher proportion of African American women. These two groups seem to be the most disadvantaged in terms of family resources. Their family income is noticeably lower, and a high proportion of them were not living with two biological parents at wave I. On the other hand, women in groups 2 and 3 seem to be more advantaged in terms of family income, education, and family composition.

Fig. 2 Distribution of states

Table 7 Descriptive statistics for typical sequences

Single mothers, cohabiting mothers, and cohabiting women (groups 4, 5, and 6) report a lower level of health at wave IV (Table 7). The same groups also have a higher probability of depression symptoms. This is partially explained by selection, since the same groups also have lower levels of health at wave I. Single and cohabiting mothers have a greater propensity to smoke at wave IV. Drinking behavior, instead, is more frequent among cohabiting women and women who experience late transitions. Although we observe a general reduction in smoking from adolescence to adulthood, women who postpone family transitions (group 2) show the largest decrease.

To investigate the relationship between health and family trajectories, the same estimation strategy used in the previous section is applied. Since family trajectories are subject to selection issues and confounding variables, the regression models control for previous health outcomes (wave I) and compositional characteristics. The choice of the family pattern is very likely to be influenced by variables that are omitted from the regression model. The effect of reverse causation may also not be negligible. Moreover, the explanatory variable is only a representation of a variety of trajectories and cannot be thought of as a treatment that is randomly assigned to the population. For these reasons, the estimation results presented in Table 8 indicate only a statistical association and do not have a causal interpretation. Nevertheless, the results show some interesting aspects of the relationship between health and family formation.

Table 8 Regression estimates

First, single mothers, women who have a child at an early age, and those who cohabit without children have lower self-reported health. By contrast, women with a traditional pattern do not differ significantly from women who postpone family transitions. Second, cohabiting mothers are more likely to experience depression compared to other groups. Single mothers do not differ from the reference group concerning depression. A possible explanation is that depression is associated with the cohabiting experience or, in other terms, with union instability. Last, drinking behavior appears to be influenced by family patterns. Traditional trajectories seem to have a protective effect on the drinking behavior of women. Controlling for other variables, women in group 1 have a lower probability of engaging in drinking behavior. This is consistent with other studies that show how marriage has a strong effect in reducing risky behaviors (Duncan et al. 2006). However, it would be interesting to understand whether this protective effect remains constant over time or is only temporary. Overall, parental education has a positive effect on health, both physical and mental, and is associated with a reduction in cigarette smoking. Race has a mixed effect. Black women report lower perceived health levels, but at the same time are less likely to engage in drinking behavior.

7 Discussion

Health is the result of a continuous process that develops over an individual’s lifetime. Health trajectories are the consequence of a multitude of factors coming from genetic, biological, behavioral, social, and economic contexts. Previous studies indicate that health is connected with family events occurring during the life course. Following the approach of Giele and Elder (1998), the analysis distinguishes between transitions (changes in family status) and trajectories (the whole sequence of transitions) in order to study union formation and childbearing jointly. Although the study of the dynamic inter-relationship between health and the life course has recently been an emerging topic, there is no general agreement on how trajectories should be conceptualized and analyzed. In this paper, a sequence analysis approach is used to describe life course trajectories. Describing family biographies as sequences of family states allows analysis of different dimensions of the life course. In particular, the study examines whether there is a direct effect of timing, quantum, and ordering on health outcomes for young women. It emerges that, controlling for selection and background characteristics, differences in these dimensions affect health status. Early transitions have negative repercussions on self-reported health and smoking behavior (Hypothesis 1). Although the experience of a large number of transitions is associated with negative effects (Hypothesis 2), some particular transitions have a protective effect. Normative transitions (i.e., traditional unions, childbearing after marriage) have protective effects on behaviors (Hypothesis 3). Women with numerous normative transitions, in fact, smoke and drink less. Above all, the proposed indicators suggest that sequence characteristics matter. In particular, it seems that moving away from normative family patterns (in terms of age roles and order of events) is associated with a decrease in wellbeing.

In the second part, the paper focuses on the consequences of different typologies of trajectories. Six classes representing typical patterns of family formation are derived from the data. Differences in terms of wellbeing and propensity to engage in risky behaviors are substantial. Once selection and background characteristics are controlled for, these differences are attenuated but still significant. Empirical results show that women with short experiences of cohabitation and women with a traditional pattern do not differ significantly from women who postpone family transitions. On the other hand, early childbearing and long cohabitation are associated with poor health status. Moreover, married women are less likely to consume alcohol. Controlling for previous health outcomes, no significant differences are observed in smoking behavior. These analyses partially confirm previous studies, in particular regarding the “protection effect” of marriage. Although selection and social background account for most of the differences, we still observe negative outcomes for women who experience early childbearing. We do not find much difference, instead, between single mothers and young mothers who have a child during a cohabitation.

Results show that early childbearing is associated with worse health outcomes. It is possible that women who anticipate motherhood have fewer resources (in terms of human and social capital) to tackle the stress of raising a child, especially without a stable partner. A complementary explanation is that early mothers are disadvantaged in the marriage market and have difficulty finding a suitable partner. Our results also show that married women are less likely to drink, confirming a “protection effect” of marriage. Cohabitation seems to have no negative effect if it is short and followed by marriage. On the other hand, it is associated with poor outcomes (especially the propensity to smoke and drink) when it is persistent and accompanied by motherhood. The mechanisms behind these associations, however, are beyond the scope of this work. Nevertheless, these results give evidence that family trajectories matter.

It would be interesting, in the future, to investigate whether differences in health outcomes persist during the life course, to see whether the more disadvantaged groups are able to catch up with the others. Another open issue is the interaction between family transitions and social class. It may be, in fact, that family trajectories have different effects according to the socioeconomic status of the family of origin. For example, the risk associated with non-normative transitions may not affect women coming from higher social classes. Last, this study only deals with young women and ignores men. Comparing the trajectories of partners might help to understand the effect of previous family transitions in the marriage market. In any case, this study represents one of the first attempts to use a complete life course perspective to study the association between health and family formation.