Introduction

Early marriage and motherhood have captured significant attention of researchers over the years. The strong interest is largely a function of the adverse effects that early marriage and motherhood can have on health, educational, and behavioral outcomes of females and their children (Harden et al. 2007; Lloyd and Mensch 2008; Meade et al. 2008; Shaw et al. 2006; Whitley and Kirmayer 2008). An important strand of work in this area has focused on identifying key determinants of early marriage and motherhood. Recent research has shown, for example, that family background, parents’ and the girls’ education, poverty, adverse shocks, and the local behavioral environment are important determining factors (Bongaarts et al. 2017; Glick et al. 2015; Ní Bhrolcháin and Beaujouan 2012; Were 2007). However, no research has yet investigated the impact of school starting age (SSA) on early marriage, and only a few studies have considered SSA effects on teenage motherhood. This paper takes up these central questions.

Children start school at different ages as a function of government regulations that determine the timing of school entry based on date of birth. Children born before the official cut-off date enter school 1 year earlier than those born after the cut-off, presuming those regulations are followed. SSA may affect various child and adult outcomes, including those related to school performance, educational attainment, and post-education earnings (Bedard and Dhuey 2012; Chen 2017; Crawford et al. 2014; Datar 2006; Dobkin and Ferreira 2010; Elder and Lubotsky 2009; Fredriksson and Öckert 2014; Larsen and Solli 2017; Matta et al. 2016; McEwan and Shapiro 2008; Puhani and Weber 2007; Robertson 2011). Most research finds that starting school earlier leads to disadvantageous outcomes, although some investigations find opposing effects while still others find no impact at all.Footnote 1

SSA impact depends on the relative importance of mechanisms through which the effects are transmitted. In this context, there are three main schools of thought. Some analysts stress the importance of relative age or peer effects (Larsen and Solli 2017). In this case, the argument is that children who start school earlier, and who are therefore relatively younger than their classmates, are adversely influenced by older peers and are more likely to engage in risky behavior as a result. Negative peer effects may be especially important for school-aged girls (Argys et al. 2006). On the other hand, children who are relatively older than their fellow students are more likely to have higher test scores and possess relatively more self-esteem and leadership abilities, among other positive effects (Dhuey and Lipscomb 2008; Thompson et al. 2004).Footnote 2

Other authors have focused on the incapacitation (or enrollment) effects of education. Here the argument is that children who start school earlier also finish their studies earlier, or, alternatively, are more likely to drop out of school.Footnote 3 As such, they are therefore more apt to engage in unsafe behavior than their counterparts who are still in school, essentially because they have relatively more time on their hands to do so (Lochner and Moretti 2004).

A third point of view leads to qualitatively different SSA-induced outcomes. The mechanism in this instance relates to children’s years of schooling and the resultant accrued human capital. The argument is that children who start school earlier tend to have more years of schooling than those who start school later, at any given age throughout their school years, and as a result they have accumulated more human capital. Higher levels of human capital lead to positive outcomes, it is argued (Lochner and Moretti 2004).Footnote 4

As seen in this short description of mechanisms, the impact of SSA may be either positive or negative, depending on the relative magnitudes of the effects transmitted through the various competing channels. Ultimately, therefore, the direction of effects can only be determined empirically.

This paper examines the effects of SSA on teenage marriage and motherhood in Vietnam. To the best of the authors’ knowledge, this is the first study to investigate the impact of SSA on female teenage marriage in any country. In addition, we add to a very small literature on the impact of SSA on teenage motherhood; moreover, the studies comprising this literature offer inconsistent results regarding SSA effects. Specifically, only three other recent studies have examined SSA effects on fertility: Tan (2017), McCrary and Royer (2011) and Black et al. (2011). The two former investigations find that SSA has no statistically significant impact on the probability that women will ever become mothers or on the age at which women give birth, in the US.Footnote 5 The latter study, however, determines that girls in Norway who start school earlier are more likely to become pregnant as teenagers than girls who start school at relatively older ages. Our study provides fresh evidence on the controversial effects of SSA on the fertility of young women. Moreover, unlike these previous studies, which examined the impacts of SSA in education systems that have compulsory schooling regulations, this paper investigates the effect of SSA in Vietnam, which does not have compulsory schooling. This latter feature of our analysis makes Vietnam a compelling choice as a case study in this context.

We find that SSA significantly affects teenage marriage and motherhood in Vietnam. Starting school earlier leads to a rise in teenage marriage among 18–19-year-old girls, although it has no effect on the marital status of the 15–17 age cohort. Starting school earlier also causes an increase in teenage motherhood for girls aged 15 to 19. We also find that the impact of SSA is heterogeneous across subgroups defined by girls’ ethnicity, mothers’ level of education, and household wealth. The significant effects of SSA are concentrated among teenage girls who are members of ethnic minorities, whose mothers have relatively less education, and whose households are comparatively poor. Girls in these groups are the most likely to benefit from starting school later.

Background: Education and Early Marriage and Motherhood in Vietnam

Education

Pre-tertiary education in Vietnam comprises three levels of schooling: primary (5 years), lower secondary (4 years), and upper secondary (3 years). In Vietnam, the school year starts in the first week of September (which it has done since 1945 when the nation became independent) and runs until the end of May the following year. Children start school (i.e., enter first grade) in September of the calendar year in which they turn 6 years of age as regulated in the government’s Education Law. This implies that if students were to enter school as government regulations insist and progress without interruption or grade repetition, they would finish grade 12 in the year in which they turn 18 years old. Although school starting age is stipulated in the Education Law, there are no regulations regarding minimum school leaving age, minimum required years of schooling, or minimum grade completion.

Vietnam achieved universal primary education in 2000. By 2013, the net primary school enrollment rate had risen to 97% (MOET 2015). The share of 6-year-old children entering grade one was 94% in 2006 (UNICEF 2010) and 99.3% in 2017.Footnote 6 Having realized universal primary schooling, the government set its sights on increasing enrollments at the lower secondary level. Decree 88/2001/NĐ-CP of 2001 formally set a target of achieving universal access to lower secondary school by 2010. In the school year 2008–2009, all districts and provinces in the country reported that they had already reached the objective (MOET 2015). Net enrollment rates for lower secondary education have increased rapidly, from 30% in 1993 to 79% in 2008 (London 2011) and 88% in 2013 (MOET 2015). Net enrollment rates for upper secondary education have also grown quickly, surging to 54% in 2008 from 7% in 1993 (London 2011). Some provinces have already initiated programs to reach universal upper secondary education (MOET 2015).

Despite these achievements in schooling access, considerable inequalities remain in educational attainment, especially between children belonging to the Kinh majority and non-Kinh minority ethnic groups and across household income levels. In 2010, differences in enrollment rates in primary, lower secondary, and upper secondary schools for Kinh and non-Kinh children were 8, 26, and 35%, respectively (GSO 2010). Educational gaps between children from low- and high-income families exist across all age groups. More than half of children in the first income quintile drop out of school when they are between 15 and 17 years of age, compared to just 16% of those in the fifth income quintile (Quyen 2011).

On a more positive note, Vietnam is moving rapidly toward achieving gender equality in education access. In 1990, the net secondary school enrollment rate for girls was 5% lower than that for boys. However, by 2010, enrollment rates for girls exceeded those for boys: 83% vs. 80% for lower secondary education and 63% vs. 54% for upper secondary school, for females and males, respectively (London 2011).

Early Marriage and Motherhood

Early Vietnamese culture was significantly influenced by Chinese doctrines and feudalism, which predominated in the country until independence in 1945. In this context, arranged early marriage and motherhood were common (Nguyen et al. 2016). After independence, the Vietnamese government attempted to raise the education levels of its people, with a view to breaking out of the feudal mind-set, starting with the implementation of a large literacy campaign (Nguyen and Nguyen 2008). Subsequently, in 1958, government issued a new Law on Marriage and Family, stipulating that the minimum age for marriage would be 18 for women and 20 for men. Rising literacy, the collapse of feudalism, and minimum marriage age requirements together led to significant declines in early marriage and parenthood. By the early 1960s, the number of births per 1000 women aged 15–19 had stabilized at around 19 in Vietnam, substantially lower than 48, the average figure for all Southeast Asia and Pacific countries (WB 2018).

With the 1986 initiation of the “Doi Moi” economic reforms and the attendant rise in globalization, young Vietnamese began to be exposed to a broader set of international norms. This exposure considerably influenced their expectations and expressions regarding sexual life. As a result, the adolescent fertility rate (births per 1000 girls) in Vietnam increased rapidly (Mestechkina et al. 2014; Ngo et al. 2008), reaching 34 in 1992 and peaking at 39 in 2015. The latter is substantially higher than the average rate of about 22 for all countries in the East Asia and Pacific region (WB 2018). Results from national surveys on Vietnamese Youth conducted in 2003 and 2008 show that for every 1000 adolescent girls aged 14–19 years, 40 have experienced pregnancy (Nguyen et al. 2016). Similarly, the share of the young population marrying before the legal age has increased significantly. In 2015, 11% of women aged 20–24 years had married before they were 18 years old, a substantial increase from 5.4% in 2006. Among ethnic minority groups as a whole, the child marriage rate is around 30% (GSO and UNICEF 2015). The comparable figure for the East Asia and Pacific region was 15% in 2016 (UNICEF 2010).

Data and Variables

Data used in this paper derive from a 15% random sample drawn from the 2009 Vietnam Population and Housing Census, as provided by IPUMS International (Minnesota Population Center 2019). The dataset contains information on the marriage status of household members aged 15 years and older and on the extent to which female members of the household between the ages of 15 and 49 have given birth, among others. The census date was April 1, 2009, and therefore the youngest age cohort for which data on marriage and fertility are available is that for girls born in March 1994, who would have been 15 years old at the time of the census.

We use sample information on marriage and fertility as above to construct our main dependent variables of interest in this study: teenage marriage and teenage motherhood. We define a dummy variable for teenage marriage that is equal to one if a female between the ages of 15 and 19 (inclusive) is part of a formal or informal marital union with a male, regardless of the latter’s age, else zero.Footnote 7 Our teenage motherhood variable is also a dummy, set equal to one if a similarly aged female has given birth at least once, otherwise zero. It is important to note that information about age at marriage or age at first birth is not available in the census. All information on age, marriage, and motherhood status is valid at the time of the census, April 1, 2009. This means that information on marriage and motherhood, especially that of younger girls, is censored at the time of the census. We will explain how our methodology resolves this issue in the next section.

We use years of schooling and enrollment as our outcome variables to investigate the mechanisms through which SSA affects teenage marriage and motherhood. Years of schooling is the highest year of education that a girl has attained. Enrollment is a dummy variable equal to one if a girl is enrolled in school, college, or university, else zero.

The sample also includes data on the month and year of birth of all household members. We use this information to define a variable that relates a girl’s month of birth to the school entry cut-off date. As discussed above, the Education Law stipulates that the SSA is 6 years of age and that January 1 is the cut-off for determining a child’s school age. Accordingly, girls born before January 1 enter school 1 year earlier than those born on or after January 1, thus leading to a discontinuity in SSA, which we use to operationalize our regression discontinuity (RD) design. We normalize month of birth as the number of months before and after the January 1 cut-off.

Normalized month of birth—the running variable in the RD analysis—ranges from negative six (for girls born in July) to six (for those born in June). Following Black et al. (2011) we redefine age cohorts to include females born from July to June rather than from January to December. Hereafter, age refers to the redefined age, unless otherwise specified. The final sample comprises more than 408,000 girls aged 15–17 and approximately 282,000 girls aged 18–19.
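The normalization just described can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the absence of a zero value (December mapped to −1, January to 1) is inferred from the stated range of −6 to 6 and from the weighting scheme described later, not taken from the authors' code. Labelling each July-to-June cohort by the calendar year of its January-to-June half is likewise an assumption made here for concreteness.

```python
def normalized_month(birth_month: int) -> int:
    """Map a calendar birth month (1-12) to months relative to the January 1 cut-off.

    July..December -> -6..-1 (born before the cut-off, earlier school start)
    January..June  ->  1..6  (born after the cut-off, later school start)
    """
    if not 1 <= birth_month <= 12:
        raise ValueError("birth_month must be in 1..12")
    return birth_month if birth_month <= 6 else birth_month - 13


def redefined_cohort(birth_year: int, birth_month: int) -> int:
    """Assign a July-to-June cohort label (hypothetical labelling convention).

    Girls born July-December of year t are grouped with girls born
    January-June of year t + 1, and the cohort is labelled t + 1.
    """
    return birth_year + 1 if birth_month >= 7 else birth_year
```

Under this convention a girl born in December 1993 and a girl born in January 1994 fall in the same redefined cohort but on opposite sides of the cut-off, which is the comparison the RD design relies on.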

Table 1 provides summary statistics of outcome variables by cohort and month of birth relative to the school entry cut-off. In the sample, 3.7% of 15–17-year-old girls and 17.6% of 18–19-year-old girls were married at the time of census. The comparable figures for teenage motherhood are 1.2% and 9%. Exploring the data we find that only 2% of teenage mothers had not married at the time of census while 54% of married girls had given birth. The shares of girls who experienced marriage and motherhood are higher for groups of girls born before the school entry cut-off date. Similarly, girls born before the school entry cut-off date have more years of schooling, as would be expected, because they started school 1 year earlier than girls born after the cut-off. However, girls born before the entry cut-off date are less likely to be enrolled in school than those born after.

Table 1 Summary statistics of outcome variables by age cohorts and the relative month of birth to the cut-off date for school entry

Finally, the dataset also contains information on socioeconomic characteristics of girls, their parents, and their households: ethnicity, presence of disabilities, parents’ level of education, number of siblings, and household wealth. Ethnicity is a dummy variable, set equal to one if the girl is non-Kinh and zero if Kinh. Disability is a dummy with value of one if the girl is mentally or physically disabled, and zero otherwise. For parental education, we construct dummy variables for both mothers and fathers that indicate whether the parent had attained less than primary education (= 1) or equal to or more (= 0). Number of siblings is the number of the girls’ older or younger brothers and sisters, whether living in the household at the time of the census or not. We form an index of household wealth by taking the weighted average of ten binary variables: whether the household has access to electricity and piped water (= 0) or not (= 1) and whether the household is in possession of (landline) phone, radio, television, computer, washing machine, refrigerator, air conditioner, and flush toilet (= 0) or not (= 1).Footnote 8 Summary statistics for covariates used in the analysis are presented in Table 2. The table demonstrates that girls in the two different age cohorts are quite similar as regards covariates considered here.

Table 2 Summary statistics of covariates by age cohorts

Identification

We use regression discontinuity (RD) methods to identify the causal effects of SSA on teenage marriage and motherhood in Vietnam. Under the assumption that girls start school at age six and the cut-off date for determining a child’s school age is January 1, we can use a girl’s month and year of birth to establish when she should have entered school relative to the age cut-off. We term this relative age the normalized month of birth. In the RD model, a girl’s normalized month of birth serves as the running variable and the specific date January 1 provides the threshold or cut-off. Girls born just before the cut-off enter school 1 year earlier than girls born immediately after the cut-off. In the language of RD, girls born after the cut-off comprise the treatment group and girls born before the cut-off constitute the control group.Footnote 9

Following Imbens and Lemieux (2008), define \(Y_{i}(0)\) and \(Y_{i}(1)\) to be potential marriage and motherhood outcomes for girl \(i\), where \(Y_{i}(0)\) is the outcome for girls born before the cut-off (i.e., those to the left of the cut-off) and \(Y_{i}(1)\) is the outcome for girls born after the cut-off (i.e., those to the right of the cut-off). In this case, the impact of SSA on marriage and motherhood is given by \(Y_{i}(1) - Y_{i}(0)\). Unfortunately, \(Y_{i}(0)\) and \(Y_{i}(1)\) cannot be observed simultaneously, and so attention turns to the average effects of treatment, \(Y_{i}(1) - Y_{i}(0)\), across girl subgroups. Let \(D_{i} = 0\) if a girl’s birthday is to the left of the cut-off and \(D_{i} = 1\) if her birthday is to the right of the cut-off. Observed outcomes, \(Y_{i}\), are therefore \(Y_{i}(0)\) if \(D_{i} = 0\) and \(Y_{i}(1)\) if \(D_{i} = 1\). The average causal effect of relative age, \(\tau\), at the cut-off, \(c\), is given by:

$$\tau = {\rm E}\left[ {Y_{i} (1) - Y_{i} (0)|X_{i} = c} \right] = {\rm E}\left[ {Y_{i} (1)|X_{i} = c} \right] - {\rm E}\left[ {Y_{i} (0)|X_{i} = c} \right]$$
(1)

The key identifying assumption in this framework is that \({\rm E}[Y_{i}(1)|X_{i}]\) and \({\rm E}[Y_{i}(0)|X_{i}]\) are continuous in \(X\), girls’ normalized month of birth. This implies that all other unobserved determinants of marriage and motherhood, \(Y\), are also continuously related to \(X\) (Imbens and Lemieux 2008). The implication allows one to use outcomes just below the cut-off as valid counterfactuals for those just above the cut-off (Cook and Kang 2016; de la Cuesta and Imai 2016; Imbens and Lemieux 2008). The general form of the estimating equation is:

$$Y_{i} = \tau D_{i} + g(X_{i} ) + \mu_{i}$$
(2)

In Eq. (2), g(X) is a polynomial function of normalized month of birth X; µ is the error term; and τ is the treatment effect, which is to be estimated.

In theory, Eq. (2) can be estimated by either nonparametric or parametric methods. Non-parametric estimation relies on continuously shrinking the bandwidth within which estimates are made and comparing observed outcomes just above the threshold with those just below. However, when the running variable is discrete, as is the case here, there are no observations just above or just below the threshold, and therefore the needed comparison cannot be made in the manner described (Lee and Card 2008). In this case, suggested practice is to estimate a parametric regression of Y on lower-order polynomials of X, where identification is achieved through extrapolation, as based on the estimated relationship between Y and X (Dong 2015).

Recent research argues for the use of lower-order polynomials in regressions of Y on X (Gelman and Imbens 2017), and we employ a polynomial of degree one in our analysis.Footnote 10 We estimate the following equation:

$$Y_{i} = \alpha + \tau D_{i} + \beta_{1} X_{i} + \beta_{2} D_{i} X_{i} + \mu_{i}$$
(3)

where all variables and parameters have been previously defined.

We estimate Eq. (3) using ordinary least squares (OLS) within narrow windows (bandwidths) on each side of the cut-off, as per usual practice. Our main specification uses a bandwidth of two months around the cut-off, the narrowest available. Its use minimizes selection bias in the estimation of treatment effects, especially as related to season of birth (Buckles and Hungerman 2013).Footnote 11 We also use bandwidths of 3 months and 4 months around the cut-off as robustness checks (see Online Appendix). We use an inverted distance weighting scheme, which places more weight on observations close to the cut-off in our regressions (Anderberg and Zhu 2014; Gibbons et al. 2013; Machin et al. 2011). Here, a girl born n months from the cut-off receives a weight of 1/n. Therefore, for example, the weight for a girl born in January or December is one, in February or November it is one-half, and so on. We also control for age cohort fixed effects in the regressions, as is common practice in these types of analyses (Cook and Kang 2016).Footnote 12 We cluster standard errors on girl’s month of birth, as is typical when the running variable is discrete.
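To make the estimation procedure concrete, the following is a minimal sketch of the weighted least squares fit of Eq. (3) within a bandwidth, using 1/n inverse distance weights. It is written in plain Python under simplifying assumptions: the function names are hypothetical, cohort fixed effects and clustered standard errors are omitted, and the running variable is assumed to take nonzero integer values (negative before the cut-off, positive after). It is not the authors' code.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_rd(x, y, bandwidth=2):
    """Weighted least squares for Y = alpha + tau*D + beta1*X + beta2*D*X.

    x: normalized months of birth (nonzero integers)
    y: outcomes (e.g. 0/1 marriage or motherhood indicators)
    Observations with |x| > bandwidth are dropped; weights are 1/|x|.
    Returns (alpha, tau, beta1, beta2).
    """
    rows, ys, ws = [], [], []
    for xi, yi in zip(x, y):
        if 0 < abs(xi) <= bandwidth:
            d = 1.0 if xi > 0 else 0.0  # D = 1 for girls born after the cut-off
            rows.append([1.0, d, float(xi), d * xi])
            ys.append(float(yi))
            ws.append(1.0 / abs(xi))
    k = 4
    # Weighted normal equations: (X'WX) b = X'Wy
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, ws)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(w * r[i] * yv for r, yv, w in zip(rows, ys, ws)) for i in range(k)]
    return tuple(solve_linear_system(xtwx, xtwy))
```

The second returned value corresponds to the treatment effect τ in Eq. (3). In applied work a statistical package would normally replace the hand-written solver; it is spelled out here only so that the sketch has no dependencies.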

Finally, we also adjust our OLS treatment effects estimates to address possible inconsistencies resulting from the use of a discrete and rounded running variable (Dong 2015). The adjusted treatment effect, \(\tau_{adj}\), can be written as follows:

$$\tau_{adj} = \hat{\tau } - \tfrac{1}{2}\hat{\beta }_{2}$$
(4)

where \(\hat{\tau }\) and \(\hat{\beta }_{2}\) are estimated parameters from Eq. (3). Standard errors of the adjusted estimated treatment effect are obtained by bootstrapping (Dong 2015).
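The adjustment in Eq. (4) and a bootstrap standard error can be sketched as follows. The adjustment function follows Eq. (4) directly. The bootstrap helper is an assumption: it resamples individual observations with replacement, which is the simplest variant and may differ from the procedure in Dong (2015) or from one that respects clustering by month of birth. Both function names are hypothetical.

```python
import random
import statistics


def adjusted_treatment_effect(tau_hat: float, beta2_hat: float) -> float:
    """Eq. (4): tau_adj = tau_hat - (1/2) * beta2_hat."""
    return tau_hat - 0.5 * beta2_hat


def bootstrap_se(data, estimator, n_reps=200, seed=0):
    """Standard deviation of `estimator` across bootstrap resamples of `data`.

    data:      list of observations, e.g. (x, y) pairs
    estimator: callable taking a resampled list and returning a float,
               e.g. one that refits Eq. (3) and returns tau_adj
    """
    rng = random.Random(seed)  # fixed seed so results are reproducible
    draws = []
    for _ in range(n_reps):
        sample = [rng.choice(data) for _ in data]  # resample with replacement
        draws.append(estimator(sample))
    return statistics.stdev(draws)
```

In use, `estimator` would wrap whatever routine estimates Eq. (3) on the resampled data and then apply `adjusted_treatment_effect` to the resulting coefficients.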

As mentioned in Sect. 3, all information on marriage and motherhood status, especially that of younger girls, is censored at the time of the census. At the time of the census, younger girls will have had shorter periods of exposure to teenage years, and thus, a lower likelihood of experiencing teenage marriage and motherhood than older girls. The difference in exposure might, in theory, drive differences in derived outcomes. However, our RDD approach accommodates the difference in exposure to teenage years through the running variable. In this context, given that age cohort fixed effects have been controlled for in the model, the estimated coefficient of the running variable shows the change in probability of teenage marriage/motherhood as age increases by 1 month. Thus, estimated \(\tau\) in Eq. (1) represents the causal effect of SSA on outcomes.

The methods described here identify a local average treatment effect (Lee and Lemieux 2010). It is perhaps useful to emphasize the local character of estimated treatment effects. While the internal validity of effects estimated in the described manner is typically argued to be strong, external validity is usually thought to be relatively weak. This suggests that it may be unreasonable to generalize about the impact of SSA on marriage and motherhood at values of the running variable outside a narrow range around the cut-off.

Main Empirical Results and Robustness Tests

Main Results

We begin the treatment effects analysis by examining the standard RD plots. Figures 1 and 2Footnote 13 provide the plots for teenage marriage and motherhood outcomes, both relative to normalized month of birth for different age cohorts. Each dot in the figure represents the average value of the outcome in question for a data-driven selected range (bin) of girls, ordered by month of birth relative to the cut-off. Attention is drawn to variable relationships at the threshold. All plots appear to show a downward break in both teenage marriage and motherhood at the cut-off. This implies that girls who are to the right of the cut-off are less likely to be married and/or have given birth than those who are to the left of the cut-off. The plots are just suggestive of SSA impacts, however; a firm conclusion can only be made after formal estimation of treatment effects.

Fig. 1 RD plots for teenage marriage by age cohort

Fig. 2 RD plots for teenage motherhood by age cohort

We now provide formal empirical estimates of the impact of SSA on teenage marriage and teenage motherhood, by estimating Eq. (3). We provide separate estimation results for 15–17-year-old girls, who are still in school, and 18–19-year-old girls, who should have already completed upper secondary school (or nearly done soFootnote 14). Table 3 provides the treatment effects output for both teenage marriage and teenage motherhood.

Table 3 Effect of SSA on teenage marriage and teenage motherhood (bandwidth is 2 months)

The regression results for 15–17-year-old girls show that SSA has no significant effect on teenage marriage but leads to a significant reduction in teenage motherhood. The results suggest that the probability that a girl born after the cut-off becomes a mother is almost 0.4 percentage points lower than that for a girl born before the cut-off. The estimated effect is significant at the 5% level. In Vietnam, the share of 15–17-year-old girls who become mothers is 1.2% (Table 1). As such, starting school late reduces teenage motherhood by about 33% (0.4/1.2) for 15–17-year-old girls.

The estimated treatment effects for girls aged 18–19 years are highly significant: girls born after the cut-off date, who thus start school one year later, are significantly less likely to be married or to have become mothers than their counterparts born before the cut-off. Specifically, the output indicates that a teenage girl born after the cut-off is about 5.5 percentage points less likely to be married and 3.2 percentage points less likely to have become a mother than her counterparts who start school earlier.Footnote 15 Given that the shares of 18–19-year-old girls in Vietnam who are married or have given birth are 17.6% and 9% (Table 1), respectively, starting school late reduces both teenage marriage (5.5/17.6) and motherhood (3.2/9) by around one third.
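The relative effects quoted for the two cohorts follow from dividing each estimated effect (in percentage points) by the corresponding baseline share in Table 1. As a worked check, using only the rounded figures reported in the text:

$$\frac{0.4}{1.2} \approx 0.33,\qquad \frac{5.5}{17.6} \approx 0.31,\qquad \frac{3.2}{9} \approx 0.36$$

The first ratio refers to motherhood among 15–17-year-old girls, and the second and third to marriage and motherhood among 18–19-year-old girls. Because the inputs are rounded, these ratios are approximate.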

Our estimates of the impact of SSA on teenage marriage and motherhood (possibly) comprise three separate influences: relative age effect, incapacitation effect, and years of schooling effect. As already discussed, a girl born before the cut-off will be relatively younger than her classmates and a girl born after the cut-off will be relatively older. The former is also less likely to be enrolled in school than the latter during school age years. Through these two channels, we would expect that a girl born before the cut-off would have a higher probability of teenage marriage and motherhood than a girl born after the cut-off. However, in any given year, a girl born before the school cut-off date will also have completed one more year of school than a girl born after the cut-off. The greater amount of education—i.e., years of schooling—and the associated higher level of human capital accumulation may reduce risky behavior (Lochner and Moretti 2004). Therefore, through this channel, we would expect an opposing impact: a girl born before the cut-off would have a lower probability of teenage marriage and motherhood than a girl born after the cut-off. Given the overall negative impact of SSA on teenage marriage and motherhood found here, we conclude that for Vietnam the effects created through relative age and enrollment channels outweigh the years of schooling effect.

Unfortunately, due to a lack of data, we are unable to make a judgement about the relative significance of peer effects in the determination of SSA impact on teenage marriage and motherhood. We can, however, quantify the importance of the enrollment and years of schooling channels. We do so by reestimating Eq. (3), using each of the two variables on the left-hand side of the equation, in turn. Table 4 provides the results. The estimation results for 15–17-year-old (i.e., upper secondary school-aged) girls are as expected: girls who start school later are three percentage points more likely to be enrolled and have 0.6 fewer years of schooling than girls who start school early. SSA also significantly increases enrollment (whether upper secondary or post-upper secondary) for 18–19-year-old girls but has no effect on educational attainment for that cohort. The probability that late-starting 18–19-year-old girls are enrolled in school is 13 percentage points higher than that for girls of the same age who start school earlier.Footnote 16 We conclude that both incapacitation and years of schooling effects are important for 15–17-year-old girls, while incapacitation is the only channel through which SSA affects marriage and fertility of 18–19-year-old girls.

Table 4 Effect of SSA on education outcomes (bandwidth is 2 months)

Our results differ from those of Cook and Kang (2016) and Tan (2017), both of which find that children born after the cut-off are more likely to drop out in the last years of high school in North Carolina, USA. Our explanation for this difference relates to school leaving age regulations. In North Carolina, the legally mandated minimum school leaving age is 16, so people born after the cut-off date are exposed to a longer period of required schooling until graduation (grade 12). Late starters, therefore, are more likely to drop out of school in their later years and less likely to finish high school than those born before the cut-off. Both human capital and incapacitation effects favor early starters in North Carolina. The beneficial human capital and incapacitation effects counterbalance the negative influence of older peers, rendering the overall impact of SSA on fertility insignificant.

Vietnam does not regulate school leaving age, so children might well drop out at any time. Among students who intend to finish upper secondary school (grade 12), those born after the cut-off need almost one additional year to do so, making them more likely to be enrolled than those born before the cut-off. In this context, the human capital effect favors early starters and the incapacitation effect benefits late starters.

Black et al. (2011) consider the case of Norwegian children, who face 9 years of compulsory school but no regulations on minimum school leaving age. Norwegian children born after the cut-off complete mandated compulsory education 1 year after children born before the cut-off do. Thus, incapacitation effects would be expected to positively influence teenage fertility for late starters. The significant impact of SSA on fertility found by Black et al. (2011) reflects the fact that the effects of SSA channeled through incapacitation and peer mechanisms outweigh the impact channeled through the human capital mechanism, similar to the case of Vietnam.

Robustness of Results

We investigate the robustness of our empirical results across five dimensions: running variable manipulation, covariate balance, bandwidth size, degree of polynomial, and observation weighting scheme.

For the RD approach to be valid, there must be no precise manipulation of the running variable, child month of birth, near the cut-off point. Manipulation would occur, for example, if parents were to time the birth of their children so that they would be able to enter school either relatively early or late, depending on their preferences. Such timing seems highly unlikely. For example, Dickert-Conlin and Elder (2010) find no evidence of discontinuity in dates of birth around the school entry cut-off in the US. Figure 3 shows the density of child month of birth around the cut-off in Vietnam. The figure does not suggest any apparent, consistent discontinuities in birth month around the threshold, indicated by the vertical line at zero. To examine the possibility of running variable discontinuity more formally, we employ the well-known density test developed by McCrary (2008). Implementation of the test procedure results in an estimated density parameter, \(\hat{\theta }\), of 0.00836 with a standard error of 0.01190, implying that the null hypothesis of no running variable manipulation cannot be rejected. We take this as evidence that the child’s month of birth has not been manipulated by parental preferences regarding the timing of school entry.
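The conclusion drawn from the density test can be checked with a short calculation. The sketch below assumes the estimate is approximately normally distributed, so that the ratio of the estimate to its standard error can be compared with standard normal critical values. The function name is hypothetical, and the normal approximation is an assumption made here, not a detail reported in the text.

```python
import math


def z_and_two_sided_p(estimate: float, std_error: float):
    """Return the z-statistic and two-sided p-value under a normal approximation."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Reported density-test estimate and standard error
z, p = z_and_two_sided_p(0.00836, 0.01190)
```

The resulting z-statistic is roughly 0.7, well below the conventional 1.96 threshold for significance at the 5% level, which is consistent with the failure to reject the null hypothesis of no manipulation.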

Fig. 3 Histogram of month of birth

The treatment effects analysis carried out here also assumes that other predetermined covariates (and/or placebo outcomes) are balanced around the threshold. If they were not balanced, then the validity of our identification strategy would be called into question. We test the balance assumption using several key predetermined covariates on which data are available: ethnicity (non-Kinh versus Kinh), disability (existence or not), mother’s and father’s levels of education (less than primary education or above), number of siblings, and the wealth index.

Table 5 supplies the results for our two cohorts: 15–17 and 18–19 years of age. The variables listed down the first column are the covariates of interest. Each covariate is used in turn as the dependent variable in Eq. (3), where estimation follows the same procedures as described earlier. As can be seen in the table, none of the treatment effects estimates is significantly different from zero. We take this output as evidence that covariates are balanced around the cut-off, which provides further support for the claim that our identification strategy is sound.
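The logic of the balance check can be illustrated with a minimal sketch. It assumes a local linear fit on each side of the cut-off and a triangular kernel as a stand-in for the inverted distance weighting; the exact form of Eq. (3) and of the weights is not reproduced in this section, so both are assumptions. The data and all function names are hypothetical, and standard errors are omitted.

```python
def triangular_weight(distance, bandwidth):
    """Weight that declines linearly with distance from the cut-off (assumed kernel)."""
    return max(0.0, 1.0 - abs(distance) / bandwidth)


def weighted_intercept(xs, ys, ws):
    """Value at x = 0 of the weighted least-squares line through (xs, ys)."""
    total = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    return y_bar - (sxy / sxx) * x_bar


def rd_jump(running, outcome, bandwidth):
    """Estimated discontinuity at the cut-off: right-hand limit minus left-hand limit."""
    left = [(x, y) for x, y in zip(running, outcome) if -bandwidth < x < 0]
    right = [(x, y) for x, y in zip(running, outcome) if 0 <= x < bandwidth]
    limits = []
    for side in (left, right):
        xs = [x for x, _ in side]
        ys = [y for _, y in side]
        ws = [triangular_weight(x, bandwidth) for x in xs]
        limits.append(weighted_intercept(xs, ys, ws))
    return limits[1] - limits[0]


# Hypothetical data: distance of birth date from the cut-off, in months
months = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]
# A covariate that moves smoothly through the cut-off (balanced)
balanced = [0.30 + 0.02 * m for m in months]
# A covariate with an artificial jump of 0.10 at the cut-off (unbalanced)
unbalanced = [0.30 + 0.02 * m + (0.10 if m >= 0 else 0.0) for m in months]

jump_balanced = rd_jump(months, balanced, 2.0)
jump_unbalanced = rd_jump(months, unbalanced, 2.0)
```

A balanced covariate yields an estimated jump close to zero, whereas the artificially shifted covariate recovers the jump of 0.10 that was built into it. In an actual application the estimate would be judged against its standard error.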

Table 5 Covariate balance test (bandwidth is 2 months)

Finally, as further checks on the robustness of our results, we reestimate our various models (i) using bandwidths of three and four (instead of two) months around the cut-off, (ii) using a polynomial of degree two (rather than one),Footnote 17 and (iii) employing a uniform (in lieu of an inverted distance) weighting scheme for observations in our dataset. The relevant empirical output can be found in Online Appendix Tables A3, A4, and A5. The results are generally supportive of the claim that our original estimation results are robust.
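To make the contrast between the two weighting schemes concrete, the following sketch compares the weight given to an observation at a given distance from the cut-off. The triangular form used for the distance-based scheme is an assumption, since the exact weighting formula is not stated in this section, and the function name is hypothetical.

```python
def kernel_weight(distance, bandwidth, scheme):
    """Weight for an observation `distance` months from the cut-off."""
    if abs(distance) >= bandwidth:
        return 0.0  # observations outside the bandwidth are not used
    if scheme == "uniform":
        return 1.0  # every observation inside the bandwidth counts equally
    if scheme == "triangular":
        # assumed stand-in for distance-based weights: closer observations count more
        return 1.0 - abs(distance) / bandwidth
    raise ValueError(f"unknown weighting scheme: {scheme}")


# Weights at one month from the cut-off under each bandwidth considered in the text
for bandwidth in (2, 3, 4):
    for scheme in ("triangular", "uniform"):
        print(bandwidth, scheme, round(kernel_weight(1.0, bandwidth, scheme), 3))
```

Under the uniform scheme, widening the bandwidth only adds observations. Under the distance-based scheme, it also raises the relative weight placed on observations further from the cut-off.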

As regards different bandwidths and polynomial degrees, one dissimilarity is that the impact of SSA on years of schooling for the 18–19 age cohort is now statistically significant for a polynomial of degree one at bandwidths of 3 and 4 months, although only at the 10% level (Table A3). A similar result is found when using the uniform weighting scheme (Table A4). These differences would seem to be of little real importance. In addition, a small number of the regressions at the higher bandwidths now suggest some covariate imbalance (Table A5). Such imbalances might be expected, of course, as bandwidths become larger and differences in girls’ ages increase. Bandwidths of 3 and 4 months imply a difference in girls’ ages (i.e., before and after the cut-off) of between 6 and 8 months, for which some dissimilarities in characteristics might reasonably be anticipated. In any case, all these differences are rather minor, and overall, we find our initial results very robust with regard to changes in bandwidth size and polynomial order.

Heterogeneity of SSA Effects

We test the heterogeneity of our treatment effects estimates across important disadvantaged subgroups of the female population, as defined by: mother’s level of education, household wealth, and ethnicity.

Previous research has shown that SSA effects are strongest among relatively disadvantaged groups. Cook and Kang (2016), for example, find that the impact of SSA on adolescent risky behavior (crime in this case) is significantly larger for those individuals with mothers who have relatively limited education and for families that are comparatively poor. This motivates the examination of SSA effects among subgroups of the population as defined by mothers’ level of education and household wealth here.Footnote 18 We also examine the heterogeneity of SSA impact across subgroups of girls defined by ethnicity. The following paragraph explains our interest in impact heterogeneity across ethnic subgroups.

As mentioned earlier, Kinh is the dominant ethnic group in Vietnam and its members make up approximately 85% of the population. Non-Kinh ethnic groups, which are 53 in number, comprise the rest. Non-Kinh Vietnamese are significantly more disadvantaged than their Kinh counterparts. The non-Kinh poverty rate in 2010 was about 66%, for example, while that for the Kinh majority was about 13% (Badiani et al. 2013). In 2012, the per capita income of non-Kinh households was just 50% of that of their Kinh counterparts (McCaig et al. 2015). Child marriage and pregnancy rates among ethnic minorities are significantly higher than those found among Kinh as well (GSO and UNICEF 2015).

Table 6 presents the results of the heterogeneity analysis for the 15–17 and 18–19-year-old cohorts. We estimate treatment effects for subgroups along the dimensions indicated above in the same manner as previously done. As the table demonstrates, the estimated treatment effects are consistently larger for disadvantaged groups compared to their advantaged counterparts for both teenage marriage and teenage motherhood outcomes.Footnote 19 All cases conform to the described pattern: SSA effects are more pronounced for disadvantaged groups, i.e., among girls from ethnic minorities, whose mothers have comparatively little education, and who live in relatively poorer households. Girls in these disadvantaged subgroups would benefit most from starting school later. For girls in advantaged groups, SSA has moderate or even insignificant effects on the probability that they marry or give birth.

Table 6 Heterogeneous effects of SSA among subgroups of girls (bandwidth is 2 months)

It is reasonable to hypothesize that household wealth may be driving the above treatment effects outcomes for those groups not explicitly defined as a function of wealth. That is, since non-Kinh and low-education households are all relatively economically underprivileged, it might be argued that the larger estimated SSA effects for those groups are merely a reflection of their relatively lower wealth and nothing more. We test this hypothesis below.

In Tables 7 and 8, we estimate the heterogeneous effects of SSA across subgroups defined by ethnicity and mother’s education, but we do so separately for groups of girls whose families are positioned in the lower and upper halves of household wealth (as defined by the median level of our index), respectively. Table 7 shows heterogeneous effects related to teenage motherhood only, for 15–17-year-old girls, while Table 8 provides the heterogeneous effects on both teenage marriage and motherhood for the 18–19-year-old cohort. The results demonstrate that, in general, the patterns previously observed hold across both relatively low and high wealth classifications. That is, the relatively larger SSA impacts for disadvantaged groups as defined by ethnicity and mother’s education obtain regardless of household wealth. We conclude that although household wealth obviously matters in conditioning SSA impacts, other factors unrelated to economic status are important as well.
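The sample split underlying this exercise can be sketched as follows. The records, field names, and function name are hypothetical, and assigning girls at exactly the median to the upper half is an assumption, since the text does not say how ties are treated.

```python
from collections import defaultdict
from statistics import median


def split_by_wealth_and_group(records, group_key):
    """Partition records into (wealth half, subgroup) cells using the median wealth index."""
    cutoff = median(r["wealth"] for r in records)
    cells = defaultdict(list)
    for r in records:
        # assumption: a wealth index equal to the median falls in the upper half
        half = "lower" if r["wealth"] < cutoff else "upper"
        cells[(half, r[group_key])].append(r)
    return dict(cells)


# Hypothetical records, for illustration only
girls = [
    {"wealth": -1.2, "ethnicity": "non-Kinh"},
    {"wealth": -0.4, "ethnicity": "Kinh"},
    {"wealth": 0.3, "ethnicity": "non-Kinh"},
    {"wealth": 1.1, "ethnicity": "Kinh"},
]

cells = split_by_wealth_and_group(girls, "ethnicity")
for cell in sorted(cells):
    print(cell, len(cells[cell]))
```

The treatment effect would then be estimated separately within each cell, so that differences across ethnicity or mother's education can be compared while holding the wealth half fixed.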

Table 7 Heterogeneous effects of SSA on teenage motherhood among subgroups of girls by wealth index, age cohort 15–17 (bandwidth is 2 months)
Table 8 Heterogeneous effects of SSA among subgroups of girls by wealth index, age cohort 18–19 (bandwidth is 2 months)

We conduct a robustness check for heterogeneous effects of SSA among subgroups of girls by using bandwidths of 3 and 4 months around the cut-off date and a polynomial of degree two. The results are presented in Appendix Tables A6–A10. The tables show that although the magnitudes and significance levels of estimated coefficients sometimes differ from those derived using a two-month bandwidth and a polynomial of degree one, the signs (and differential magnitudes, where relevant) are typically consistent with those previously derived. More importantly, the qualitative conclusions are unchanged. For example, in the analysis of treatment effect heterogeneity by wealth, we now find that SSA has a strong and statistically significant negative impact on teenage marriage both for girls whose mothers have no formal education and for girls whose mothers have a basic qualification (Table A9). However, the impact for girls whose mothers have no education is still larger (in absolute magnitude) than that for girls whose mothers have a basic qualification. Our main qualitative conclusions remain valid: SSA effects differ by mother’s education independently of wealth, and those effects are more important for disadvantaged households. We conclude that our original heterogeneity analyses are robust with respect to changes in bandwidth size and degree of polynomial.

Summary and Conclusions

Although the importance of the impact of SSA on child education and labor force outcomes is widely acknowledged, SSA effects on teenage marriage and childbearing are relatively poorly understood, given the limited research heretofore undertaken, and the inconsistent results of that research. This paper examines the impact of SSA on teenage marriage—a first in the literature—and teenage motherhood in Vietnam.

In Vietnam, children start school in the year they turn six, and the cut-off date for school entry is January 1. The regulation leads to a discontinuity in the age at which children start school, e.g., children born in December enter primary school 1 year earlier than those born in January of the following year. We exploit this discontinuity and employ regression discontinuity methods to identify and estimate the causal effects of SSA on teenage marriage and motherhood.

We find that girls born before the cut-off date are more likely to get married and/or give birth between the ages of 15 and 19 than those girls born after the cut-off. Starting school late reduces teenage marriage and motherhood by about a third in Vietnam. These are substantial effects. Although we are unable to make a judgement about the relative significance of peer effects in the determination of SSA impact on teenage marriage and motherhood due to a lack of data, we find strong evidence of incapacitation and human capital effects. We also determine that SSA impacts are quite heterogeneous across subgroups of girls. The harmful SSA effects associated with starting school early are more pronounced for relatively disadvantaged girls, especially those from minority ethnic groups, whose mothers have comparatively limited education, and whose households are relatively poor.

Girls in these subgroups would therefore benefit most from starting school later. This finding suggests that parents of disadvantaged girls may wish to carefully consider the age at which their daughters enter school and that government might consider a more flexible approach to the implementation of its school entry age regulation to allow such girls to start school later if their parents so desire.