Introduction

Over the past decades, as education systems have developed, people around the world have experienced a tremendous increase in educational attainment. The impact of education and related human capital investment decisions is thus a vital topic of study. In the short term, education produces knowledge and skills for the individuals who receive it. In the long term, the return to education is reflected in gains in total years of schooling and in labor market outcomes. In addition, female education has been shown to improve health behaviors and marriage market outcomes, which in turn yield intergenerational spillover effects that benefit women’s children. The underlying mechanism is that education benefits females by increasing the likelihood of long-term labor market participation, improving information acquisition, and promoting assortative mating. With higher income, females may choose to have fewer but higher-quality children (the quality–quantity tradeoff), because the opportunity cost of childbearing, including forgone earnings and time spent away from work, is now larger. Increased health knowledge through education also leads to maternal behaviors that are conducive to better infant health, such as effective use of contraception, and to fewer high-risk behaviors, such as smoking and drinking. Lastly, household education and income levels can also be augmented by assortative mating in the marriage market.

While the theoretical background of the topic seems clear, the causal channels between female education, fertility, and infant health have only recently become the subject of academic inquiry. Quasi-experimental studies have used educational programs or policies to examine the question in different countries. McCrary and Royer (2011) apply a regression discontinuity design to the kindergarten entry policies in California and Texas, which have entry date cutoffs. Their arguments support a causal channel from female education to fertility and infant health, although the effects are not statistically significant. A study by Currie and Moretti (2003) based on college openings in the USA concludes that maternal education has a positive causal effect on infant health. Chou et al. (2010) exploit junior high school openings in Taiwan and document that increasing a mother’s schooling reduces infant mortality. Güneş (2013) uses the Compulsory Education Law in Turkey as an instrumental variable and finds that increasing female schooling reduces teenage fertility and delays marriage.

This paper is the first to explore the causal impact of female school entry decisions and education in the context of China.Footnote 1 It also contributes to the existing literature by examining both fertility-related decisions and infant health, forming a comprehensive view in a developing country setting. Making use of the most important education policy in China, which has a natural cutoff based on birth date, I employ Regression Discontinuity (RD) and Instrumental Variable (IV) identification strategies. The policy I utilize is the Compulsory Education Law of China, which requires children who have reached age 6 by August 31st of a given calendar year to enroll in primary school and complete 9 years of compulsory education.Footnote 2 Specifically, this paper explores the following research questions in sequence. First, does the Compulsory Education Law (CEL) have a first-stage effect on females’ actual school starting age? Second, does the impact on school starting age carry over into a reduced form effect on females’ educational attainment, measured by years of schooling completed? Third, how does female education influence fertility-related decisions, such as the probability of becoming a mother, age at first birth, and labor force participation (LFP)? Fourth, provided there is no selection into having children, what is the causal impact of education on infant health in terms of low birthweight, prematurity, and number of sicknesses before reaching 1 year old? Last but not least, I explore the possibility of assortative mating in the marriage market, as spouse’s age and education are among the mechanisms that augment household health resources and affect fertility-related health outcomes.

Simply estimating the effect of school starting age on the outcomes (years of schooling and onward) by OLS could result in omitted variable bias. For instance, omitted family preferences are likely correlated with the entry age decision as well as with education and health outcomes. To address this issue, RD and IV exploit the exogenous variation in entry age created by the birth date requirement in the CEL. Fuzzy RD uses the exogenous variation around the birthdate cutoff and measures the Local Average Treatment Effect (LATE) for individuals born just before and after the cutoff. The cutoff thus creates a natural instrument for the identification strategy. An important assumption in RD is that individuals around the cutoff are similar to one another. To test this, I perform tests for sorting around the cutoff and covariate balance tests. Furthermore, two different bandwidths are used to check the robustness of the RD results. Additionally, I use the predicted entry age based on the Law as an instrument in the IV strategy and estimate the LATE for compliers across the calendar year, which provides a complementary perspective on the results. Predicted entry age can affect the outcome variables only through actual entry age, which satisfies the exclusion restriction.

The data used in this paper come from the China Family Panel Studies (CFPS, 2015), a nationally representative longitudinal dataset collected by the Institute of Social Science Survey of Peking University, China. The surveys were conducted every 2 years from 2010 to 2018. I use the 2010, 2012, 2014, and 2016 waves, as the child sample is not available in 2018. For each wave, the adult sample is manually merged with the child sample using the mother’s ID, so that mothers’ information and their infants’ outcome variables are linked. As the study mainly focuses on the post-policy effect, after data cleaning the sample of females born after 1980 contains 11,670 individuals, of whom 3,250 became mothers. Key variables provided by the merged dataset include birth year and month of individuals, primary school entry year, years of schooling, employment status, annual income, number of children, birth year of children, infant birthweight, infant gestational age, number of sicknesses before 1 year old, and spouse age and education level. Control variables include the individual’s Hukou status at age 3,Footnote 3 province, ethnicity, and parents’ education levels. The critical instrumental variable, predicted age at school entry, is calculated as the precise age at which an individual is first allowed by the law to enter primary school, based on her birth month. The precise actual age at school entry used in the first-stage RD design is calculated from the individual’s actual school entry year, birth year, and birth month.
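The construction of the two entry-age variables can be sketched as follows. This is a minimal illustration rather than the paper’s actual procedure: it assumes that the school year starts on 1 September, that the entry age is 6 everywhere, and that age is measured in whole months. The function names are hypothetical.

```python
def predicted_entry_age(birth_month: int) -> float:
    """Age in years on 1 September of the first year the child may enroll.

    Assumptions (not taken from the paper): the school year starts on
    1 September, the legal entry age is 6, and age is counted in months.
    """
    if not 1 <= birth_month <= 12:
        raise ValueError("birth_month must be in 1..12")
    if birth_month <= 8:
        # Born January-August: has reached age 6 by 31 August.
        return 6 + (9 - birth_month) / 12
    # Born September-December: must wait until the following September.
    return 7 - (birth_month - 9) / 12


def actual_entry_age(birth_year: int, birth_month: int, entry_year: int) -> float:
    """Age in years on 1 September of the reported school entry year."""
    return (entry_year - birth_year) + (9 - birth_month) / 12


# An August-born child may enter at just over 6; a September-born child at 7.
print(predicted_entry_age(8), predicted_entry_age(9))
```

Under these assumptions, a child who enters in the first eligible year has an actual entry age equal to the predicted one, and any gap between the two reflects early or delayed entry.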

This study documents a first-stage effect of the CEL on females’ actual school starting age in the RD model: females born right after the cutoff date enter school 0.21 years later than their counterparts born just before it. The reduced form effect of the CEL on completed years of schooling is 0.59 years for females aged 23 and above. The education policy has no significant causal effect on the probability of motherhood or on maternal age at first birth, which implies that the education effect is not confounded by selection into motherhood. In other words, we are comparing females on either side of the cutoff who are similar in their fertility decisions. Therefore, there is no sample selection bias when I go on to explore the infant health results. RD reduced form estimates show that the policy reduces the probability of low birthweight by 3.5 percentage points for infants whose mothers were born right after the threshold. It also decreases the number of times an infant is sick before 1 year old by 1.52. No statistically significant effect is found for prematurity. Drawing on evidence from existing research, I argue that the medical cost saved through the reduced incidence of low birthweight due to female education is substantial in the context of China. In exploring the mechanisms contributing to better infant health, I find no significant variation in female and maternal LFP or income level. For the marriage market outcomes, there is no effect on selection into marriage or on spouse age. However, females born after the threshold have spouses with higher education levels, which supports the assortative mating channel.

Compared to RD, the IV method uses the variation across the calendar year. As maternal socioeconomic characteristics have been shown to be correlated with month of birth,Footnote 4 I additionally control for season of birth. Full sample second-stage estimates using Two-Stage Least Squares yield the LATE of the school entry age, rather than of the CEL policy. Comparing these with the LATE estimates from RD, both methods yield significant and sizable increases in female educational attainment and decreases in infant low birthweight. For other outcomes, the results are less salient when IV uses variation across the calendar year: the effect on compliers away from the threshold is smaller. For both methods, the important assumptions of the identification strategy hold, as supported by specification checks. The paper also finds that the results are heterogeneous and strongest for individuals living in the most developed areas and for individuals with non-agricultural Hukou status, which suggests that the impacts may come from an even higher level of education obtained or from significantly higher income earned in the labor market.

The rest of the paper proceeds as follows. In “Literature Review” section, I review the relevant strands of literature. In “Institutional Background” section, I describe the institutional background. In “A Simple Model of Maternal Education and Infant Health” section, I present a simple model linking maternal education and infant health, while the empirical models are presented in “Empirical Models” section. “Data and Descriptive Statistics” section describes the data. Results are presented in “Results” section. I offer some discussion in “Discussion” section and conclude in “Conclusion” section.

Literature Review

This paper contributes to four streams of research. The first is the research on female education and fertility outcomes. Female education may affect fertility decisions through income, preferences, or fertility-related health knowledge. One channel is that education increases females’ participation in the labor market and their income. According to Mincer (1962), women’s wages are negatively related to fertility, as they reflect the opportunity cost of maternal time. Becker (1981) and Schultz (1981) argue similarly about time allocation and the opportunity cost of time-intensive activities. Females may choose to substitute away from highly time-intensive consumption, such as childbearing. The economic theory of fertility is also related to the tradeoff between the quantity and quality of children (Becker & Lewis, 1973). Females with higher education may choose to have fewer children but invest more in their health and human capital. Lastly, Rosenzweig and Schultz (1989) show that education increases females’ knowledge about the use of contraceptive methods, which may reduce fertility. Research on similar topics has also been conducted in different country settings (Subbarao & Raney, 1995; Saleem, 2005; Bbaale, 2011; Gordon et al., 2011; Larsson et al. 2014).

Second, some studies point out the channels through which female education affects children’s health. Grossman (1972), Thomas et al. (1991), and Glewwe (1999) show that education facilitates information acquisition and processing through skills gained at school, which helps shape healthy pregnancy behaviors. Also, as documented in the literature, adults with higher levels of education are less likely to engage in risky behaviors, such as smoking and drinking, that influence infant health (Currie & Moretti, 2003; Zimmerman et al. 2014). In addition, the increase in income through education bears on children’s health by providing children with more health resources. This effect is corroborated in several studies (Conley & Bennett, 2001; Alaimo et al., 2001; Sakai et al., 2011; McCrary & Royer, 2011). Finally, in the marriage market, assortative mating by education or earned income also boosts family income and resources (Behrman & Rosenzweig, 2002), which benefits children’s health. The mechanisms explored in this paper support some of this previous evidence.

Third, an increasing number of recent studies use quasi-experimental methods to draw causal conclusions on this topic. McCrary and Royer (2011) examine the topic using the kindergarten entry policies in Texas and California together with exact dates of birth. Exploiting a regression discontinuity design, they find that the policy affects the education and mating market outcomes of young females who are at risk of dropping out of school. The policy decreases education at motherhood in both Texas and California for those whose education is age dependent and is likely interrupted by having children. However, there is limited evidence of an effect on fertility and infant health. Currie and Moretti (2003) explore the question through college openings in the USA between 1940 and 1996 and conclude that higher maternal education improves infant health, increases the probability of marriage, and reduces maternal risk behaviors. Breierova and Duflo (2004) examine the topic using time- and region-varying exposure to a school construction program in Indonesia as an instrumental variable and show that female and male education are equally important in reducing child mortality. Chou et al. (2010) exploit junior high school openings in Taiwan and construct the treatment and control groups separately. They find that parents’ schooling, especially mothers’ schooling, reduces infant mortality. Güneş (2013) uses exposure to the Compulsory Education Law across cohorts in Turkey as an instrumental variable and concludes that female schooling reduces teenage fertility and delays marriage. Nguyen and Lewis (2020) study the effect of school starting age on teenage motherhood in Vietnam in an RD setting using the school entry cutoff required by the Education Law. They show that starting school earlier leads to a rise in teenage marriage and motherhood. Johansen (2021) uses Danish data and the threshold date in the administrative school entry rule and concludes that being young for grade leads to a higher probability of abortion and alcohol poisoning for young women before age 20.

Finally, the topic of female education and fertility has been broadly studied in the USA. However, the existing literature on the same topic in China is sparse. Zhang (1990), Chen and Shi (2002), and Chen and Deng (2007) explore several factors influencing the fertility rate using multivariate analysis, where female education is one of the factors. More recently, Cui et al. (2019) look at mothers’ level of exposure to the Compulsory Education Law (CEL) by age and find that mothers’ improved education increases their adolescent offspring’s academic achievement and mental health. However, there are several major differences between Cui et al. (2019) and this paper. The most important one is that Cui et al. (2019) focus on the 9-year compulsory education requirement and use a maternal sample whose exposure to the CEL differs by age group. Thus, the variation in maternal years of schooling comes from the contrast between mothers who received the compulsory 9 years or more of education and those who did not because, at their age, they were not fully exposed to the CEL in 1986. By contrast, this paper emphasizes the cutoff date requirement, which generates differences in school starting age and is the starting point and the core of this study.Footnote 5 In a working paper, Huang et al. (2018) also utilize maternal exposure to the CEL by age group, plus regional variation in the intensity of CEL implementation, in a DID setting. The paper concludes that the CEL increases abortion after a prenatal test, leads women to avoid teen pregnancy, and decreases birth defects. It also uses an earlier cohort of females who were either exposed or not exposed to the CEL in 1986 as a source of variation. The paper additionally uses previous province-level middle school incompletion rates as a proxy for CEL implementation intensity in the identification strategy. In my view, however, this is essentially invalid, as there is no quantitative evidence that provinces with suboptimal historical levels of education implemented the CEL more intensively.Footnote 6 Chen and Guo (2022) similarly use the level of exposure to the CEL as an IV and show that female education reduces the number of births. The paper likewise uses the level of exposure by age group and a sample of mothers aged 35 and up by 2010, which means that their sample is composed of mothers born between 1961 and 1975.Footnote 7 As described in depth in footnotes 6 to 8, these existing papers are fundamentally different from this paper, and there has been no preceding literature on the causal relationship between female school starting age, female education, and fertility-related outcomes in China. Moreover, much of the literature on related topics in various country settings treats the effects on female education, fertility-related decisions, and children’s health separately. Therefore, to my knowledge, the distinctive contribution of this paper is that it is the first quasi-experimental study in China on female school entry age, and that it offers a comprehensive view by addressing the effects on long-term educational attainment, fertility decisions, and infant health jointly.

Institutional Background

Education System in China and the Compulsory Education Law

The education system in China is composed of Primary, Secondary, and Post-Secondary levels. Individuals start primary school at age 6 (or 7 in some areas). Primary education usually lasts 6 years. Secondary education lasts between 6 and 7 years and combines a 3-year junior high school following primary school with either a 3-year academic senior high school or 3 to 4 years of vocational high school. The completion of junior high school marks the end of the 9-year compulsory education required by China’s CEL. At the end of compulsory education, students can choose to go to a vocational secondary school or take the high school entrance examination to enter an academic senior high school. As of 2016, enrollment in academic senior high schools comprises 63.3% of total high school-level enrollment.Footnote 8 Post-secondary or Tertiary Education in China is accessible after taking the National Higher Education Entrance Examination. It covers post-secondary vocational education and bachelor’s-level, master’s-level, and doctorate-level education.

The CEL is the most important education law in China. It was established in 1986 and took effect on July 1st of that year. The law creates a national 9-year compulsory education framework and requires individuals to complete primary school and junior high school to finish the 9-year education. This compulsory education is provided with no tuition or miscellaneous fees.Footnote 9 Article 11 of the Law states that, “Any child who has attained to [sic] the age of 6, his/her parents or other statutory guardians shall have him/her enrolled in school to finish compulsory education. For the children in those areas where the conditions are not satisfied, the initial time of the schooling may be postponed to 7 years old.” The threshold date of August 31st is documented in the interpretation of Article 11 by the Legislative Affairs Commission of the National People’s Congress Standing Committee.Footnote 10 Individuals are first allowed to enter primary school upon reaching age 6 by the threshold date of a given year. Both the Ministry of Education of China (the national education department) and the Department of Education (Jiaoyu Ting, for municipalities and provinces) were also contacted to confirm that August 31st has been the threshold date for primary school entry since the CEL was implemented across the country.Footnote 11 Footnote 12 The timing of CEL implementation varied by province and municipality, as shown in Appendix Table 9. Since the majority of the provinces implemented the Law between 1986 and 1988, the first birth cohort affected by the Law is the group born in or after 1980. Using the sample from this study, Fig. 1 illustrates that the policy had an immediate effect of decreasing primary school entry age, which means that the compulsory requirement led the vast majority of school-age children in China to enroll in primary school on time. The average primary school entry age by birth year falls from 7.4 years old before the policy to around 6.7 years old after the implementation of the CEL. A noticeable jump in the entry age for the birth year of 1981 corresponds to the first birth cohort influenced by the CEL.Footnote 13

Table 1 Summary statistics of mother sample, CFPS
Fig. 1 Average Primary School Entry Age of Female Adults by Birth Year, CFPS. The figure is generated using Stata and shows the average primary school entry age plotted against the birth year of individuals. The data are from the female sample in the China Family Panel Studies (CFPS), waves 2010, 2012, 2014, and 2016

The Hukou System and the One-child Policy in China

Another important institutional concept used in this study, both as a control and as a criterion for heterogeneity analysis, is Hukou status. Hukou is the household registration system, a system of household and social identity that originated in ancient China and functions as a system of population management. Contemporary Hukou records each resident’s identifying information from birth, such as name, date of birth, names of immediate family, and address. There are two types of Hukou status: agricultural and non-agricultural. The status is determined by the place of residence (rural or urban) and the parents’ Hukou status at the birth of the individual. For instance, if an individual was born in an urban area and both of her parents have agricultural Hukou status, then she will have an agricultural Hukou. Differences in Hukou status define the social benefits an individual has, ranging from education to health insurance (maternity and new-born health insurance), retirement pension, and land-use rights. For education, schools usually give enrollment priority to children with local Hukou. As a result, the majority of children with agricultural Hukou status reside in rural areas for their local education, although their parents could be migrant workers who live and work in cities. As for differences in insurance, rural residents who work on self-owned farmland and are not otherwise employed are not eligible for maternity insurance, which needs to be processed through an employer. However, self-employed farmers are eligible for the New Rural Cooperative Medical Insurance. This insurance is only open to individuals with agricultural Hukou status and includes maternity and new-born insurance. Because inequalities in both educational options and fertility-related insurance exist between rural and urban areas, it is important to look at the heterogeneous effects between agricultural and non-agricultural Hukou individuals in later estimations.

The other policy that might raise concern for this paper’s topic is the One-child policy of China, which was introduced in 1979 and required Han Ethnic families to have one child.Footnote 14 However, the policy will not interfere with the identification strategies used in this paper. The Regression Discontinuity design identifies the treatment effect for the population born after 1980 and around the August cutoff, who are equally regulated and affected by the One-child policy. Therefore, the policy does not invalidate the assumption that individuals on either side of the cutoff are similar. In addition, it is irrelevant to the assumptions behind IV, such as the Relevance Condition and Exclusion Restrictions. Consequently, the institutional background regarding the One-child policy will not have any impact on this paper’s estimations.

A Simple Model of Maternal Education and Infant Health

As outlined in previous theories and in this paper, infant health is determined by a Nature element, the maternal genetic endowment, and a Nurture element that covers maternal education and fertility-related choices. Based on Grossman’s (1972) concept of health capital and the model developed by McCrary and Royer (2011), a simple model of the infant health production function is as follows:

$$Y = f(\ell (S,I(S),H_{0} ),H_{0} ),$$

where \(Y\) is the measure of infant health, \(S\) is maternal education, \(I\) is the resources that result from maternal choices in the labor market, mating, and health behavior or health knowledge, and \({H}_{0}\) is the infant’s initial health endowment inherited from the mother. In this health production function, the first term \(\ell\left(S, I\left(S\right), {H}_{0}\right)\) is the function for maternal demand for health inputs, the Nurture element. Note that \(I\) is also a function of maternal education, as schooling affects the fertility-related choices listed above. This paper focuses specifically on the first term. The hypotheses implied by the model are that education directly and positively affects maternal health inputs through an increase in maternal health knowledge, and that there is also an indirect positive effect from improved health resources for infants through labor market outcomes, such as increased income or childcare time (reduced LFP), and marriage market outcomes, such as a spouse with higher education. These latter factors stem from education as well.
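The direct and indirect channels can be read off the total derivative of the production function with respect to \(S\). The following is only a sketch: it assumes that \(f\), \(\ell\), and \(I\) are differentiable and holds \({H}_{0}\) fixed, neither of which is stated explicitly in the model.

$$\frac{dY}{dS} = \frac{\partial f}{\partial \ell}\left[\frac{\partial \ell}{\partial S} + \frac{\partial \ell}{\partial I}\,\frac{dI}{dS}\right].$$

The first term in brackets is the direct effect of schooling on health inputs (for example, through health knowledge), and the second is the indirect effect operating through the resources \(I(S)\). Under the hypotheses above, both terms are positive whenever health inputs improve infant health, that is, whenever \(\partial f/\partial \ell > 0\).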

Empirical Models

In this section, I discuss the identification strategies I use to identify the relationship between females’ school entry age and long-term outcomes in educational attainment, fertility decisions, and their children’s health in infancy. The variation in the endogenous variable, school entry age, is induced by the birth months of individuals and the policy requirement. First, I consider the baseline OLS regression:

$$Y_{it} = \delta_{0} + \delta_{1} Actual\_age\_entry_{i} + \phi \chi_{i} + C_{t} + \mu_{it},$$
(1)

where \(i\) indexes female individuals, \(t\) indexes survey year, \({Y}_{it}\) denotes the long-term outcome variables, \({Actual\_age\_entry}_{i}\) is the actual school entry age, \({\chi}_{i}\) represents a set of time-invariant control variables, such as birth year, ethnicity, province, Hukou status at age 3, and parental education levels, \({C}_{t}\) is the survey year fixed effect, and \({\mu }_{it}\) represents unobserved determinants of the outcome variables. \({\delta }_{1}\) is the coefficient of interest. OLS identifies the treatment on the treated effect. However, identification using OLS is problematic because variation in the actual school entry age is endogenous. There could be omitted factors that are correlated with both the treatment variable and the outcome variable. For instance, a family’s preference over school entry age, which depends on family background, may also affect educational attainment or fertility decisions. According to the literature, families with low socioeconomic backgrounds are likely to comply with school entry policies (Elder & Lubotsky, 2009). Individuals with such a background may have limited resources for education or health knowledge. Although parental education levels are controlled for in the regression, it is hard to find a proxy for family preferences. Another example is ability, which has been shown to be correlated with both female education and fertility decisions (Griliches, 1977). Therefore, the OLS estimates are likely biased. Nevertheless, the OLS results are presented in the tables for comparison.
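The direction of this bias can be sketched with the textbook omitted-variable formula. As a simplification that is not part of the specification above, suppose the controls are dropped and a single unobserved factor \({W}_{i}\) (say, family preference) enters the outcome equation with coefficient \(\lambda\) and is uncorrelated with the remaining error. The symbols \({W}_{i}\) and \(\lambda\) are introduced here for illustration only. The OLS estimate then converges to

$$\operatorname{plim}\,\hat{\delta}_{1} = \delta_{1} + \lambda\,\frac{\operatorname{Cov}(Actual\_age\_entry_{i}, W_{i})}{\operatorname{Var}(Actual\_age\_entry_{i})},$$

so the bias vanishes only if the omitted factor is unrelated to the entry age (zero covariance) or to the outcome (\(\lambda = 0\)). Its sign is the sign of \(\lambda\) times the sign of the covariance.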

To isolate clean variation in treatment, I first use the Regression Discontinuity (RD) design. The CEL sets a natural cutoff date, August 31st: individuals born just after the cutoff date, who have not reached age 6 (or 7 in some provinces) by then, are required to postpone primary school entry for one year. The cutoff can therefore be used as a source of exogenous variation in the timing of school entry, with the running variable being the individual’s birth month. As birth month does not perfectly determine the actual entry age, this is a fuzzy RD design. The Indirect Least Squares estimation equations are as follows:

$$Actual\_age\_entry_{i} = \alpha_{0} + \alpha_{1} Above_{i} + \alpha_{2} Month_{i} + \alpha_{3} Month_{i} \times Above_{i} + \theta \chi_{i} + C_{t} + \varepsilon_{it},$$
(2)
$$Y_{it} = \beta_{0} + \beta_{1} Above_{i} + \beta_{2} Month_{i} + \beta_{3} Month_{i} \times Above_{i} + \gamma \chi_{i} + C_{t} + \nu_{it},$$
(3)

where \(i\) indexes females, \(t\) indexes survey year, \({Above}_{i}\) is a dummy variable that indicates whether an individual was born after the cutoff date, \({Month}_{i}\) represents the birth month of the individual relative to August,Footnote 15 \({\chi}_{i}\) is the same set of control variables as in OLS, \({C}_{t}\) is the survey year fixed effect, and \({\varepsilon}_{it}\) and \({\nu}_{it}\) are the error terms. Errors are clustered by individual.Footnote 16 Here, Eq. (2) is the first stage, which captures the effect of the primary school entry policy on females’ actual school entry age, with \({\alpha}_{1}\) as the coefficient of interest. Equation (3) is the reduced form for the impact of the policy on the long-term outcomes, with \({\beta}_{1}\) as the coefficient of interest. The RD comparison of means is the difference between the two sides of the cutoff. Dividing the reduced form coefficient \({\beta}_{1}\) by the first-stage coefficient \({\alpha}_{1}\) gives the LATE for females around the cutoff, which is the causal effect of school entry age, rather than of the policy, on the outcome variables. The first-stage and reduced form coefficients are presented in the results below, and the LATE is also discussed. An important assumption behind RD is that individuals around the cutoff are essentially similar except for their treatment status, so we should expect to see continuity at the cutoff in other characteristics. The assumption would fail if there were a jump in these characteristics, another policy using the same cutoff, or manipulation of the running variable. Later in the results section, covariate tests and a manipulation test of the running variable are performed to show that the assumption holds. In addition, as discussed in the institutional background above, no other policy using the same cutoff date interferes with the identification.Footnote 17
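The mechanics of dividing the reduced form by the first stage can be illustrated on artificial data. The sketch below is not the paper’s estimation code: it omits the controls and survey year fixed effects, codes the running variable as months relative to August (an assumed coding), and uses made-up parameter values, including a true effect of 2.0 and an unobserved factor `u` that biases naive OLS. Without controls, the coefficient on \({Above}_{i}\) in Eqs. (2) and (3) equals the difference between the intercepts of separate linear fits on each side of the cutoff, which is what `jump_at_cutoff` computes.

```python
def fit_line(x, y):
    """Return (intercept, slope) of a univariate OLS fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def jump_at_cutoff(month_rel, above, outcome):
    """Coefficient on Above in: outcome ~ Above + Month + Month x Above."""
    below_pts = [(m, v) for m, a, v in zip(month_rel, above, outcome) if a == 0]
    above_pts = [(m, v) for m, a, v in zip(month_rel, above, outcome) if a == 1]
    intercept_below, _ = fit_line(*zip(*below_pts))
    intercept_above, _ = fit_line(*zip(*above_pts))
    return intercept_above - intercept_below


# Artificial data: months -5..0 are March-August, 1..6 are September-February.
month_rel, above, entry_age, schooling = [], [], [], []
for m in range(-5, 7):
    for k in range(50):
        u = 1 if k % 2 == 0 else -1   # unobserved factor, balanced within month
        a = 1 if m > 0 else 0         # born after the cutoff
        e = 6.5 - 0.02 * m + 0.25 * a + 0.3 * u   # partial jump in entry age
        y = 10 + 2.0 * e + 0.05 * m - 1.5 * u     # true effect of entry age: 2.0
        month_rel.append(m)
        above.append(a)
        entry_age.append(e)
        schooling.append(y)

first_stage = jump_at_cutoff(month_rel, above, entry_age)    # alpha_1
reduced_form = jump_at_cutoff(month_rel, above, schooling)   # beta_1
late = reduced_form / first_stage                            # beta_1 / alpha_1
naive_ols = fit_line(entry_age, schooling)[1]                # biased by u

print(first_stage, reduced_form, late, naive_ols)
```

By construction the ratio recovers the true effect of 2.0, while the naive OLS slope is pulled away from it because `u` moves entry age and the outcome in opposite directions.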

The RD strategy gives unbiased estimates of the treatment effect at the discontinuity. It also exploits a known rule of the kind that is common in the design of social policy. One limitation, however, is that because it measures the LATE close to the cutoff, the results may not always be generalizable. As an alternative and supplemental identification strategy, I therefore also use Instrumental Variables (IV). The instrument is the predicted primary school entry age, which is the precise age at which an individual is first allowed by the CEL to enter primary school. The Two-Stage Least Squares (2SLS) equations are given by the system:

$$Actual\_age\_entry_{i} = \theta_{0} + \theta_{1} Predicted\_age\_entry_{i} + \theta_{2} Seasonality_{i} + \pi X_{i} + C_{t} + \tau_{it},$$
(4)
$$Y_{it} = \rho_{0} + \rho_{1} Actual\_age\_entry_{i} + \rho_{2} Seasonality_{i} + \varphi X_{i} + C_{t} + \omega_{it},$$
(5)

where i again indexes female individuals and t indexes survey year. \({Predicted\_age\_entry}_{i}\) is the predicted primary school entry age; its calculation is illustrated in the data section below. \({X}_{i}\) represents the same set of covariates, \({C}_{t}\) is the survey year fixed effect, and \({\tau }_{it}\) and \({\omega }_{it}\) are the error terms. Errors are clustered by individual. Additionally, a factor variable \({Seasonality}_{i}\) is controlled for, because IV uses the variation across the whole calendar year. Season of birth has been shown in the previous literature to be correlated with family background and maternal characteristics (Bound & Jaeger, 2000; Buckles & Hungerman, 2013). Fuzzy RD sidesteps the seasonal patterns, as it can be seen as a local IV around the cutoff. The IV identification, on the other hand, uses more of the data across the calendar year. Equations (4) and (5) are the first and second stages, respectively, and yield the estimate of \({\rho }_{1}\), which measures the LATE for the compliers based on the predicted school entry age. That is, compliers are those whose actual school entry age is affected by the predicted age.Footnote 18 OLS would be sufficient if there were either full compliance with the law or random non-compliance. However, as discussed above, omitted variable bias exists in the OLS estimation, and IV helps to address this issue. In terms of the IV assumptions, the correlation between predicted and actual school entry age satisfies the relevance condition, as a significant first-stage effect exists. Furthermore, the exclusion restriction is assumed to hold: predicted age can affect the outcome variables only through the actual entry age. In the results section, the estimates of \({\rho }_{1}\) and the F-statistics from the first stage are reported.
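The two stages can be illustrated with a small Python sketch on synthetic data. It is a deliberately stripped-down version of Eqs. (4) and (5), offered as an assumption-laden illustration rather than the paper’s specification: it has one instrument and one endogenous regressor, and it omits \({Seasonality}_{i}\), the covariates, the survey year fixed effects, and clustered errors. The data-generating numbers (a first-stage slope of 0.8, a true effect of 1.5, and an unobserved "ability" confounder) are invented for the example.

```python
import random


def simple_ols(x, y):
    """Return (intercept, slope) from a bivariate least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(z, x, y):
    """Stage 1: fit x on z. Stage 2: fit y on the stage-1 fitted values of x."""
    a0, a1 = simple_ols(z, x)
    x_hat = [a0 + a1 * zi for zi in z]
    _, rho1 = simple_ols(x_hat, y)
    return a1, rho1


# Synthetic data (illustrative assumption, not CFPS).
rng = random.Random(0)
z, x, y = [], [], []
for _ in range(20000):
    predicted = 6 + rng.randint(1, 12) / 12   # instrument: roughly 6.08 to 7.00
    ability = rng.gauss(0, 1)                 # unobserved confounder
    actual = 1.5 + 0.8 * predicted - 0.3 * ability + rng.gauss(0, 0.3)
    outcome = 1.5 * actual + 1.0 * ability + rng.gauss(0, 0.5)
    z.append(predicted)
    x.append(actual)
    y.append(outcome)

first_stage, iv_estimate = two_stage_least_squares(z, x, y)
_, ols_estimate = simple_ols(x, y)
```

Because the confounder lowers entry age and raises the outcome in this toy setup, the naive OLS slope falls well below the true effect, while the two-stage estimate recovers it up to sampling noise. A full implementation would add the controls to both stages and compute standard errors that account for the first stage.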

Data and Descriptive Statistics

The main dataset used in this paper comes from the China Family Panel Studies (CFPS), a nationally representative longitudinal survey launched by the Institute of Social Science Survey (ISSS) of Peking University, China. Data from baseline year 2010 and follow-up years 2012, 2014, and 2016 are applied to the analysis. The baseline sample covers 25 provinces and municipalities, representing 95% of the Chinese population. Surveys were conducted at the community, family, and individual level. A total of 14,960 households and 42,590 individuals were interviewed in 2010.Footnote 19 In this paper, only individual level data are used.

The advantage of CFPS is that it provides critical data on individuals’ birth information (birth month and birth year) and primary school entry year. Birth month is used as the running variable in RD.Footnote 20 Together with birth year and primary school entry year, it allows me to deduce the precise primary school entry age and the predicted entry age for IV, as illustrated below. In addition, individual-level data in CFPS include an adult sample and a child sample,Footnote 21 and both are used in this paper to identify maternal and infant information. Maternal years of schooling, LFP, annual income, number of children, marital status, and spouse characteristics are the key dependent variables provided by the adult sample. Infant age (for calculating a mother’s age when giving birth to the first child), birthweight, gestational age, and number of sicknesses before 1 year old are the key dependent variables provided by the child sample. The child sample and adult sample are merged through maternal individual IDs. Other predetermined or time-invariant variables were also collected at the time of survey, such as Hukou status at age 3, province of residence, parental education, and ethnicity. These variables are used as controls. Although questions on individuals’ risky health behaviors such as drinking and smoking were also asked in the survey, the nonresponse rate is high, leaving sample sizes too small to draw any causal conclusions.Footnote 22

As the actual (precise) primary school entry age is not available in the data, it is calculated using school entry year, individual’s birth month, and birth year by the following equation:

$$Actual\,Entry\,Age_{i} = \left( {Entry\,Year_{i} - Birth\,Year_{i} } \right) + (9 - Birth\,Month_{i} )/12.$$

Each variable in the equation corresponds to either individual-level primary school entry information or birth information in CFPS. For example, if an individual was born in August 1982 and entered primary school in 1989, her precise actual primary school entry age is (1989 − 1982) + (9 − 8)/12 ≈ 7.08. The instrumental variable, predicted primary school entry age, is calculated using similar logic. It reflects the precise age in September of the calendar year in which the individual is first allowed to enter primary school under the law. The equation is as follows:

$$Predicted\,Entry\,Age_{i} = \begin{cases} 7 + (9 - Birth\,Month_{i})/12, & \text{if born from September to December,} \\ 6 + (9 - Birth\,Month_{i})/12, & \text{if born from January to August.} \end{cases}$$

Using the same example as above, although the individual actually entered primary school at age 7.08, her predicted entry age is 6 + (9 − 8)/12 ≈ 6.08. That is, had she complied with the law, 6.08 would have been her age when she was first allowed to attend school.
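The two calculations can be written as a short Python sketch. The function names are hypothetical. One assumption should be flagged: for September to December births the sketch uses 7 + (9 − birth month)/12, which is what the actual-entry-age formula gives for a complier who enters seven calendar years after birth (for example, 6.75 for a December birth) and which produces the 11-month jump between August and September births.

```python
def actual_entry_age(entry_year, birth_year, birth_month):
    """Age in September of the entry year, in years (months expressed as twelfths)."""
    return (entry_year - birth_year) + (9 - birth_month) / 12


def predicted_entry_age(birth_month):
    """Age in the first September in which the law allows entry.

    Assumption: January-August births may enter in the year they turn 6,
    September-December births in the year they turn 7.
    """
    if not 1 <= birth_month <= 12:
        raise ValueError("birth_month must be between 1 and 12")
    years_after_birth = 6 if birth_month <= 8 else 7
    return years_after_birth + (9 - birth_month) / 12


# The worked example from the text: born in August 1982, entered in 1989.
example_actual = actual_entry_age(1989, 1982, 8)   # about 7.08
example_predicted = predicted_entry_age(8)         # about 6.08
```

Under this reading, predicted entry age falls steadily from 7.00 for September births to about 6.08 for August births of the following year.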

Combining four waves of data, the sample size of uniquely identified mothers surveyed in CFPS is 12,174. Selection criteria for samples in the analysis are based on individuals’ birth year and elimination of measurement errors. Since the CEL only affects birth cohorts after the year 1980, those born before 1981 are dropped. To eliminate reporting errors for actual primary school entry age, individuals with an outlier actual entry age are excluded from the sample, which are those who either entered school younger than 4 years old or older than 9 years old within this sample. As this paper studies the intergenerational spillover effects, we also want to avoid selection bias due to the different preferences for the number of children to have or the varying distribution of household health resources for children. Therefore, only first-time mothers are selected.Footnote 23 The final sample consists of 3,001 mothers. Table 1 shows the summary statistics of the sample. The average actual entry age is 7.39 years old. The mean completed years of schooling is 9.32,Footnote 24 which means that, on average, mothers approximately completed the 9-year compulsory education. And mothers’ average age at first birth is 22.56. The maternal LFP rate is 59%. And individual’s log annual income has a mean of 9.54. 4% of the infants suffer from low birthweight and 5% are born premature. On average, infants have 3.59 times of sickness before reaching 1 year old. Spouses of the individual mothers have a mean age of 29.45 and 7.06 years of schooling. In terms of season of birth, about 28% of the sample was born in the fall season, which makes up a slightly larger proportion of the sample compared with other birth seasons. A later manipulation test of birth month shows that this is a trend that existed before 1980 as well. Thus, it will not influence the RD estimations. About 40% of the sample are urban residents. 89% belong to the Han Ethnicity. 
93% of individuals in the sample have Agricultural Hukou Status at age 3 and the proportion decreases to 84% for the status at the time of survey. Average age of the sample is 28.17 years old.

In addition to CFPS, in the discussion section, a different dataset, the China Labor-force Dynamics Survey (CLDS), is used to provide reference estimates for the regression discontinuity results on female education and fertility outcomes.

Results

The results of this paper are arranged as follows. First, I present the first-stage effect of the policy on females’ actual school entry age. Second, conditional on the significant first stage, I continue to explore the reduced form effect on years of schooling, and I conclude that the discontinuity in education only exists for females older than 23 who likely have completed their education. Third, I examine the reduced form impact of school entry policy on fertility- and health resource-related outcomes and I compare with OLS and IV estimates. This comparison shows that for none of the estimates is there a statistically significant effect on the maternal fertility decisions or labor market outcomes that relate to family’s health resources. Fourth, since there is no concern regarding selection into motherhood, I then present both the RD and IV estimates of the effect of female education on infant health in terms of low birthweight, prematurity, and number of sicknesses before 1 year old. For both RD and IV estimates, it is concluded that female education reduces the probability of infant low birthweight and infant illness. Fifth, I further explore the results on the marriage market outcomes and find no statistically significant effect on spouse age. But there exists a minor effect around the threshold in more educated female matching with a spouse who has higher educational attainment. Sixth, I show that there are heterogeneous effects based on Hukou status and economic region. Lastly, I present the robustness check and specification checks for RD.

School Entry Policy and Actual School Entry Age

Figure 2 is a graphical representation of the unconditional predicted and actual first-stage effects. It shows the relationship between birth month and predicted or actual entry age. The dashed line of predicted entry age shows that, under perfect compliance with the law, there would be an 11-month jump in entry age for the cohorts born in September. This is because, following the cutoff on August 31st, individuals are required to wait until the following fall to enroll in primary school. Actual compliance, on the other hand, is reflected by the connected line of the actual entry age, which displays a declining trend on either side of the cutoff. However, a discontinuity of less than 0.3 years (roughly 3 months) is found at the cutoff. This is a visual display of the first-stage effect of the CEL on actual entry age for the female sample. Figure 3 demonstrates the conditional first-stage RD effects more specifically. Estimates shown on the lower left-hand side of the graphs are the coefficients and standard errors from the equation. For the full female sample, the discontinuity in entry age is 0.21 years with a standard error of 0.08. The discontinuity is similar by age group for females older than 23 years old and those who are younger. The statistically significant results suggest that females who are born after the cutoff date are on average 0.21 years older at school entry than their counterparts born before the cutoff. It also means that the compliance rate with the law around the cutoff is about 22.9%.Footnote 25, Footnote 26
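One way to arrive at the 22.9% figure, offered here as a plausible reading rather than the documented derivation, is to divide the estimated jump in entry age by the 11-month jump that perfect compliance would produce. A short Python check of that arithmetic:

```python
# Jump in entry age at the cutoff under perfect compliance: 11 months, in years.
FULL_COMPLIANCE_JUMP = 11 / 12

# Reported full-sample first-stage discontinuity, in years.
first_stage_jump = 0.21

# Share of the full-compliance jump that is actually observed.
compliance_rate = first_stage_jump / FULL_COMPLIANCE_JUMP
print(f"{compliance_rate:.1%}")  # prints 22.9%
```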

Fig. 2

Predicted vs Actual Primary School Entry Age: The Visual Impression of First Stage, CFPS. This figure presents the unconditional first stage of entry age plotted against the birth month of individuals. Data are from the female sample in CFPS, wave 2010 to 2016

Fig. 3

The RD First-Stage Effect of School Entry Law on Actual School Entry Age, CFPS. This figure shows the first-stage effect of school entry law on female’s actual school entry age, for the full-sample and by age groups. Data are from the female sample in CFPS, wave 2010 to 2016. Relative month is the difference between the actual birth month and August (8). Therefore, “-7” means the January birth month and “4” means the December birth month

School Entry Policy and Years of Schooling

Because significant effect exists for the first stage, I proceed to look at the impact on years of schooling. School entry policy could potentially affect education for females who are still enrolled in school and for those who have already completed schooling. Compared to the educational system in the USA or some other countries, there is no minimum dropout age requirement in China, but rather a 9-year compulsory education requirement. Therefore, in the context of China, for those who are still enrolled in school to finish the 9-year education, starting school at a later age will not exert an impact on the years of schooling during this time, even for those who stay at school beyond the common age for junior high completion. Because individuals are not incentivized for early dropout due to starting school late and reaching the minimum dropout age early but with lesser years of schooling completed. Instead, variation in school starting age only affects the age of junior high completion. After the 9-year compulsory period, an individual’s decision on schooling could be age dependent or schooling dependent. Under the age-dependent school-leaving decisions, a female of a given age who starts school later will have fewer years of schooling compared to those who start months earlier. Examples of such age-dependent reasons could be either starting to work once reaching the minimum working age,Footnote 27 or giving birth to a child by certain age, which increases a female’s risk of dropping out of school. Empirically in this paper, however, the effect on years of schooling is the strongest for females older than 23 who likely have completed their full education. Table 2 presents the RD reduced form estimates by age group. On one hand, for both females and mothers under the age of 23, positive discontinuities are documented with large standard errors with or without controls. 
This means that the school entry law does not influence the sample individuals who are likely still enrolled in school, whether they are first-time mothers or females. This is possibly due to the fact that there is not much variation in schooling, as all individuals who are enrolled in school are finishing or have finished the 9-year compulsory education. In addition, the average age of females giving birth to their first child is around 22.57 as shown in the descriptive statistics, which is not an age that would cause females to drop out of school early. On the other hand, individuals above 23 years old are likely to have completed their years of schooling. As displayed in Fig. 4, using the data from CFPS, the fraction of females enrolled in school declined to 10% at the age of 23. In China, age 22 or 23 is also the average time that individuals complete their college education. Therefore, the right panel of Table 2 indicates the effect on the completed years of schooling. For the female sample, the estimates controlling for covariates is 0.59, which is statistically significant at the 1% level and suggests that females born right after the cutoff completed 0.59 more years of schooling.Footnote 28 The graphical result is presented in Fig. 5. The estimates are similar for the mother sample in magnitude and significance level. First-time mothers who entered school 11 months later following the policy requirement also obtained 0.52 more completed years of schooling.Footnote 29 The results suggest that, stratifying by age, the causal effect of school entry policy on educational attainment is most relevant for women who are older than 23 and who likely have already completed their education in China. Thus, the causal analysis of later outcomes focuses on the sample of women who are older than 23, for whom the effect on schooling is the strongest.Footnote 30, Footnote 31

Table 2 Effects of School Entry Law on Years of Schooling by Age Group and Regression Discontinuity (RD)
Fig. 4

Fraction of Females Enrolled in School by Age, CFPS. This figure shows the fraction of females enrolled in school by age. Data are from the female sample in CFPS, wave 2010 to 2016

Fig. 5

Effects of School Entry Law: Conditional Scatter Plots, CFPS. This figure shows the reduced form conditional scatter plots of separate outcomes. Means, linear fitted values, and 95% confidence intervals are presented for each plot. Discontinuities and standard errors are displayed on the lower left corner of each plot. Data are from CFPS, wave 2010 to 2016

Table 3 reports the complete results of the RD reduced form, RD LATE, IV LATE, and OLS estimates on the completed years of schooling.Footnote 32 Again, the difference between the RD reduced form and the RD LATE estimate is that the RD reduced form measures the effect of the law itself on the outcome(s), whereas the RD LATE measures the effect of entering school one year later on the outcome(s). The RD LATE (or RD estimate) is calculated by dividing the reduced form coefficient by the first-stage coefficient (0.21, from the “School Entry Policy and Actual School Entry Age” section). While OLS is subject to bias due to omitted variables that relate to both dependent and independent variables, IV estimates using predicted school entry age avoid these potential problems. Row 5 of Table 3 shows the causal effect of school entry age on mothers’ completed years of schooling, controlling for covariates. It implies that compliers with the law completed 1.58 more years of schooling, significant at the 1% level. Combined with the RD estimate of 2.80,Footnote 33 this suggests that entering school one year later results in over 1.5 more years of schooling for the compliers. The result is robust to using either the full sample across the year or the discontinuity-based estimation in RD. Note that the results in this paper are contrary to Dobkin and Ferreira (2009), who find that early entry into school increases educational attainment in the USA, a result that intuitively fits the minimum dropout age story. The estimates are also larger in magnitude than those in the existing literature in other country settings.Footnote 34 The results here could imply that females who enter school later complete the same or more years of schooling through better in-school academic performance or grade progression.Footnote 35 They also suggest that the CEL does not disrupt the short-term education decisions of younger females.
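As a rough check of the ratio, using the rounded coefficients quoted in the text (a reduced form of 0.59 for the female sample and a first stage of 0.21), the arithmetic is approximately:

$$\widehat{LATE}_{RD} = \frac{\hat{\beta }_{1}}{\hat{\alpha }_{1}} \approx \frac{0.59}{0.21} \approx 2.81.$$

This is close to the reported RD estimate of 2.80; the small gap presumably reflects rounding of the displayed coefficients.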

Table 3 Female Completed Years of Schooling, RD, IV, and OLS

School Entry Policy, Fertility Decisions, and Relevant Labor Market Outcomes

The effect of education on fertility decisions manifests itself in this paper through estimates of the probability of motherhood and the maternal age at first birth. Probability of motherhood is measured by the fraction of mothers within the same birth month and age cohort of females. As with the analysis on education, the following results are restricted to the sample of mothers aged 23 and above. As mentioned in the literature review, the assumption is that females with higher educational attainment will have lower fertility rates or delay the age of childbearing. This is because education increases the opportunity cost of having children through higher income, as well as greater female health knowledge, such as the usage of contraception. However, the overall estimates in this study give no indication that female fertility decisions are affected by the timing of her school entry. Specifically, in Table 4 Row 1, the RD reduced form estimates show a 0.01 decrease in probability of motherhood for the female sample and a 0.11 increase in the maternal age at first birth when regressing fertility outcomes on females’ birth month. Dividing the reduced form results by the first-stage estimate, 0.21, we get a LATE effect of actual school entry age on probability of motherhood of -0.05 and maternal age at first birth of 0.57 (Row 4). However, because the standard errors are large, the effects are not significant. Similarly, for IV estimates, the effects of school entry age on the two outcomes are insignificant results of -0.19 and 1.12. The magnitude of the IV estimates is slightly larger than RD, as IV uses variation caused by the instrument across the year for the estimations. Nevertheless, both methods suggest that the difference in school entry age has a limited causal role on females’ fertility planning. This result is corroborated by the supplemental dataset—China Labor Force Dynamics Survey (CLDS) in Table 13. 
In the context of China, societal pressures on fertility could exist. Families tend to prefer that females have children and give birth to their first child at an earlier age, so as to carry on the family line. This societal pressure could be the reason for the insignificant effect on fertility, consistent with the evidence shown in a few province-level studies in China (Loke et al., 2012; Luo & Mao, 2014, etc.). Additionally, the results for the probability of becoming a mother indicate that there is no selection into having children for females born around the cutoff date. Therefore, I can safely continue to explore the effect on infant health without sample selection correction. Other than the direct fertility decisions, labor market returns to schooling provide indirect measures of maternal time for childcare and available household health resources. Table 5 shows the complete results. No economically or statistically significant effect is detected for the maternal LFP rate or log income, although the sign of the discontinuity in income around the cutoff is in line with the well-documented positive effects of education on earnings. A null effect on LFP is also found for females who do not yet have children. A similarly insignificant effect is found using CLDS in Table 13. The insignificant LFP effect does present a puzzle, as individuals with higher levels of education are much more likely to participate in the labor market and have better labor market performance because of human capital accumulation (Becker, 1962; Mincer, 1974). I explore whether there is variation in females’ childcare responsibility that would crowd out the time for LFP. Using the age of the youngest child and the probability of having a child under the age of 3 does not yield any economically or statistically significant effect. 
There is not much heterogeneous effect by Hukou status or areas (Table 8), although females with non-agricultural Hukou status experience an increase in income which may be the human capital effect. The well-cited paper of Dobkin and Ferreira (2010) on the US school entry age and long-term labor market outcomes also does not find any significant results for wages and probability of employment. The result may just imply that gain in educational attainment induces nearly zero effect to the long-term labor market outcomes. The results potentially rule out the mechanism of differences in maternal time spent on childcare provision or in income resources having spillover effects toward the health of the next generation.Footnote 36 It also shows alongside that there is no induced discontinuity in females returning to the labor market after having a child.

Table 4 Fertility decisions, RD, IV, and OLS
Table 5 Labor market outcomes, RD, IV, and OLS
Table 6 Infant health, RD, IV, and OLS

School Entry Policy and Infant Health

Proceeding from the above analysis on maternal fertility decisions, I continue to examine the effect on infant health in terms of low birthweight, prematurity, and infants’ number of sicknesses before 1 year old. Infant low birthweight is measured by the World Health Organization’s standard of 2499 g or less. We see significant decreases in both the RD reduced form estimates and IV estimates with controls. Reported in Table 6, the RD reduced form indicates a 3.5-percentage point decrease in infant low birthweight induced by the school entry policy. This discontinuity is also corroborated by the conditional scatter plot in Fig. 5, where there is a clean and significant jump around the cutoff. The IV estimate of − 0.033 indicates a 3.3-percentage point decrease induced by mothers starting school one year later. Comparing females born in the 4 months after the cutoff and enter school later to their counterparts, the percentage increase is 54%. The effect shown is very strong but not uncommon. Two meta-analyses (Godah et al., 2021, Silvestrin 2013) of maternal education and infant low birthweight reveal very similar conclusions—high maternal education (in terms of educational attainment) conferred an over 30% protective effect against LBW compared to low education in various country settings. Based on the theoretical background, mothers may have gained more health knowledge through more years of schooling, therefore opting for better and healthier pregnancy planning. As emphasized in Grossman (1972), Glewwe (1999), and Lleras-Muney (2002), the causal path exists in that both temporary and permanent changes in schooling may affect learning and the ability to process (health) information. The second outcome, infant prematurity, is defined by being born at fewer than 37 weeks of gestational age. The scatter plot in Fig. 5 shows no break in behavior, which is substantiated by the RD reduced form result in Column 2 Row 1. 
The estimate is −0.014 with a standard error of 0.02. The IV estimate is likewise small and insignificant, as the data points of prematurity are noisy across the calendar year. Another outcome I explore is infants’ number of sicknesses before 1 year old. In the questionnaire, the number of sicknesses counts the number of times the infant went to hospital, based on the number of completed courses of treatment. The RD reduced form estimate shows that the policy significantly reduces the number of sicknesses by 1.52. The LATE from the IV estimate is −5.84. However, IV captures the noisy distribution across the calendar year, and the estimate is thus not significant. Compared to the existing literature in the USA, the effect on infant low birthweight in China shown in this paper is larger. Currie and Moretti (2003) find that increased female education induced by college openings in the USA reduced low birthweight and preterm delivery by 2 and 1%, respectively. McCrary and Royer’s (2011) research utilizing entry age policies in the USA does not find any significant effects on the same two outcomes. According to Tang et al. (2017), although China has a lower low birthweight rate than many other countries, because of its large population size, the number of infants with low birthweight contributes significantly to the global number. In a study conducted at 26 Neonatal Intensive Care Units (NICUs) (Lin et al., 2015), the cost of a NICU stay for non-surviving infants with extremely low birthweight averaged about 10,244 CNY (1,580 USD), and it was about 62,206 CNY (9,610 USD) for surviving infants. Therefore, the medical cost saved by the reduction in infant low birthweight from female education should be substantial.Footnote 37, Footnote 38

Table 7 Marriage market outcomes, RD, IV, and OLS

School Entry Policy and Marriage Market Outcomes

Under assortative mating, a female’s age and education are causally related to her spouse’s (Behrman & Rosenzweig, 2002). Spouse education levels may affect the household income level and influence female fertility decisions and the health outcomes of infants. I thus explore the causal effects of the policy and delayed school entry age on the probability of marriage as well as spouse age and education. For the first outcome, the effect of the school entry policy and the corresponding increase in educational attainment on the probability of marriage is around zero. In Table 7, the RD estimate is −0.005 (Column 1, Row 4), and the IV estimate is 0.01 (Column 1, Row 5). The small or close-to-zero causal effect of women’s education on the probability of marriage is supported by previous literature (e.g., Lefgren & McIntyre, 2006; Yang, 2022). The insignificant results for the probability of marriage also rule out selection into marriage, so I can continue to look at the spouse characteristics. For spouse age, the estimates are negative but again insignificant, with an RD estimate of −0.03 and an IV estimate of −2.29 in Column 2. Intuitively, negative results are in accordance with the fact that females born right before the cutoff have older peers within their grade level, and females born right after the cutoff have younger peers within their classes. As to the education level of the spouse, being born just after the cutoff results in a significant 1.15 more years of spouse education in the RD reduced form estimate (Column 3). The estimate is consistent with the assortative mating theory of the pairing of education levels between men and women, since women born just after the cutoff achieve a higher level of education. Higher spouse education may indirectly affect infant health through augmented family resources and health knowledge. 
For the variation across the year, entering school according to the law leads to 0.41 more years of spouse education in the IV estimate. However, due to large standard errors, the IV estimate in Row 5 is not statistically significant.

Table 8 Heterogeneous Effects by Hukou Status and Region: RD Reduced Form Estimates

Heterogeneous Effect

The effect of school entry policy on female education, fertility decisions, and infant health plausibly varies with the socioeconomic and family background of females. For example, Currie and Moretti (2007) find that the intergenerational transmission of low birthweight is stronger for mothers in high-poverty areas in the US. As described in the institutional background, Hukou status is related to the education and social welfare that individuals receive. In addition, because the income level varies across provinces in China, separating the analysis by regional economic status helps in understanding the results. Here, regional economic status is determined by disposable per capita income. According to the 2014 income statistics from the National Bureau of Statistics of China, the most developed areas cover 10 provinces and municipalities along the east coast and Inner Mongolia, with disposable per capita income of over 3,000 USD. Less-developed areas are mainly distributed in the west and southwest of China, with per capita income of less than 2,000 USD. The remaining areas are modestly developed, defined by an income level from 2,000 USD to 3,000 USD.Footnote 39 Table 8 shows the RD heterogeneous effects by female Hukou status at age 3 and by region. First of all, the effects by group are generally consistent with the overall effects. Both indicate that the school entry policy induces delayed school entry age and significantly more years of schooling for females older than 23 and born just after the cutoff date. Variation in educational attainment does not causally affect fertility decisions or mating market outcomes. However, educational attainment works as a channel toward significantly less infant low birthweight. 
More specifically, by groups, females with non-agricultural Hukou status have higher compliance rate with the policy (0.97) as indicated by the effect on actual school entry age. Similarly, individuals born in the most developed areas also have a higher compliance rate (0.50) compared with those from less-developed areas or poor areas. The two results are comparable, as the spatial distribution of the urbanization rate of the registered population resembles the spatial distribution of the economic regions (Qi et al., 2017). A potential reason behind this is that individuals with non-agricultural Hukou status who live in the most developed areas are subject to more well-developed educational policies regarding the compulsory education at the local level and are thus more likely to comply with the law. The strongest effect on female education attainment is found in the “most developed area” group, which is an increase of 1.01 years (Table 8, Column 3). The group also yields larger effects in infant low birthweight (4 percentage points), infant sickness (-2.19 times), and spouse education (2.35 years) compared to the full-sample average effects. This further indicates the impact of education in generating positive spillover health effects toward the next generation. In the analysis by Hukou status, both non-agricultural and agricultural groups experience comparable increase in educational attainment as in the full sample. Females with non-agricultural Hukou status (Table 8, Column 2) also have a significant increase in log annual income (38%) and delayed the timing of childbearing by 2.09 years. These lead to more salient effects in infant health, with a 3.56 decrease in infant number of sickness. 
Compared to results for females with agricultural Hukou origin, these coefficients potentially imply an augmented health effect coming from higher income, ceteris paribus.Footnote 40 The third interesting finding is that for modestly- and less-developed areas, where the first stage and the impact on completed schooling are limited and insignificant, there is also no noticeable effect on the other consequential outcomes. To some extent, these results may be viewed as a placebo test of the causal estimation. In all, the heterogeneity analysis indicates that the strongest intergenerational health effects are found for the non-agricultural Hukou origin group and the most developed areas group. On top of strong compliance rates, these two groups also have increased income or more completed years of schooling (for both females and their spouses), respectively, which contribute to the larger reductions in infant low birthweight and number of sickness. The results may inform future policy work on enforcing the CEL in comparatively less-developed areas, so that individuals may benefit from education in the long run.
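As an illustration of the regional grouping described above, the following is a minimal sketch that assigns a development tier from disposable per capita income in USD. The function name, the handling of incomes exactly at 2,000 or 3,000 USD, and the example incomes are assumptions made for illustration; they are not taken from the paper or from the 2014 statistics.

```python
def classify_region(income_usd: float) -> str:
    """Assign a development tier from disposable per capita income (USD).

    Thresholds follow the text: over 3,000 is most developed, under 2,000 is
    less developed, and 2,000 to 3,000 is modestly developed. Treating the
    boundary values as "modestly developed" is an assumption.
    """
    if income_usd > 3000:
        return "most developed"
    if income_usd >= 2000:
        return "modestly developed"
    return "less developed"


# Hypothetical incomes, for illustration only (not actual provincial data).
examples = {"province_a": 4200.0, "province_b": 2500.0, "province_c": 1600.0}
tiers = {name: classify_region(income) for name, income in examples.items()}
```

The heterogeneity analysis would then be run separately within each tier.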

Robustness Check

The robustness of the RD results is checked by narrowing the bandwidths. Originally, the full sample covers the entire calendar year. Restricting the bandwidth to the months from May to December leaves four dots on each side of the cutoff. The bandwidth from July to October covers two dots on each side. Tables 6 and 7 each present the robustness of the RD reduced form estimates to the variation in bandwidths. Compared with the full-sample results, none of the narrowed samples returns substantially different results for any of the outcomes in terms of magnitude or significance level. Thus, I can safely conclude that the RD results are robust.
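The bandwidth check can be sketched as follows, assuming a simple linear RD with separate slopes on each side of a cutoff placed between August and September (month 8.5). This is not the paper's estimation code: it omits covariates and standard errors, the helper names are hypothetical, and the data are simulated with an arbitrary jump of 0.6 rather than drawn from CFPS.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_jump(months, outcomes, first_month, last_month, cutoff=8.5):
    """Gap between the right and left fitted lines at the cutoff,
    using only birth months in [first_month, last_month]."""
    pairs = list(zip(months, outcomes))
    left = [(m - cutoff, y) for m, y in pairs if first_month <= m < cutoff]
    right = [(m - cutoff, y) for m, y in pairs if cutoff < m <= last_month]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


# Simulated data (assumption): a linear trend in birth month plus a jump of 0.6.
rng = random.Random(0)
months = [rng.randint(1, 12) for _ in range(6000)]
outcomes = [9.0 + 0.05 * (m - 8.5) + 0.6 * (m > 8.5) + rng.gauss(0, 1)
            for m in months]

# Full calendar year, May to December, and July to October.
windows = {"Jan-Dec": (1, 12), "May-Dec": (5, 12), "Jul-Oct": (7, 10)}
estimates = {label: rd_jump(months, outcomes, lo, hi)
             for label, (lo, hi) in windows.items()}
```

If the estimates in `estimates` stay close to one another as the window narrows, the result is not driven by observations far from the cutoff, which is the logic of the check above.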

The other important aspect of robustness is testing the assumptions of the RD design. Individuals on either side of the cutoff are assumed to be similar except for the treatment status. Hence, a manipulation test of the running variable and covariate tests are performed to examine the assumptions. Figure 6 shows the manipulation test for the running variable, birth month. Although there seem to be more females born right after the cutoff month in the sample born after 1980, this is not evidence of manipulation, as the birth pattern is the same for the sample born before 1980.Footnote 41 Figure 7 gives the covariate tests for predetermined or time-invariant characteristics of individuals: birth year, ethnicity, Hukou status, and parental education levels. There is little evidence of discontinuity at the cutoff for the covariates, which means that females born around the cutoff are essentially similar to each other.Footnote 42
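The reasoning behind the manipulation check can be illustrated with a rough sketch that compares the ratio of births just after the cutoff to births just before it across the two cohorts. It is a descriptive comparison, not a formal density test, and it is not the procedure behind Figure 6. The function names, the assignment of birth year 1980 to the later cohort, and the toy counts are all assumptions.

```python
from collections import Counter


def cutoff_ratio(birth_months, before=8, after=9):
    """Births in the month just after the cutoff divided by births in the
    month just before it (assumes at least one birth in the earlier month)."""
    counts = Counter(birth_months)
    return counts[after] / counts[before]


def ratios_by_cohort(records, split_year=1980):
    """records: iterable of (birth_year, birth_month) pairs.
    Returns the cutoff ratio for the earlier and the later cohort."""
    records = list(records)
    early = [month for year, month in records if year < split_year]
    late = [month for year, month in records if year >= split_year]
    return cutoff_ratio(early), cutoff_ratio(late)


# Toy counts, for illustration only (not CFPS data).
records = ([(1975, 8)] * 100 + [(1975, 9)] * 120
           + [(1985, 8)] * 100 + [(1985, 9)] * 120)
early_ratio, late_ratio = ratios_by_cohort(records)
```

A similar ratio in both cohorts suggests that the excess of births after the cutoff month reflects an ordinary seasonal birth pattern rather than a response to the policy.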

Fig. 6

Manipulation Test, CFPS. This figure shows the density of individuals born in each month of the calendar year. The left panel uses individuals born before 1980, and the right panel shows the density distribution of those born after 1980. The data are from the female sample in CFPS, wave 2010 to 2016

Fig. 7

Covariates Test, CFPS. This figure shows the covariates or smoothness test of predetermined individual characteristics. Means, linear fitted values, and 95% confidence intervals are presented for each plot. Discontinuities and standard errors are displayed on the lower left corner of each plot. Data are from CFPS, wave 2010 to 2016

Discussion

Absolute Age Effect and Relative Age Effect

The causal mechanism of school entrance age on education outcomes can work through the absolute age effect, the relative age effect, or both. The absolute age effect is a birth date effect that measures an individual’s school readiness and maturity based on her own age. The relative age effect measures her age relative to the cohort based on the selection period. Older students may outperform their younger peers academically. For policymakers, changes in the education policy to institute a cutoff requirement will influence the absolute age of all individuals and the relative age of some individuals. A few studies have distinguished the absolute and relative age effects and analyzed their causal effects on outcomes such as test scores, grade retention, and adulthood outcomes (Elder & Lubotsky, 2009; Peña, 2017; Kivinen, 2018). For this study, as there is only one national cutoff date in the policy, I am not able to measure the relative age effect using a difference in cutoffs within the same calendar year.Footnote 43 However, the absolute age effect is likely measurable, as some provinces with limited resources are allowed to postpone the school entry age to 7 years old instead of 6 years old. One possible way to explore this question is to distinguish the provinces with different age requirements and extract the differences in the outcomes between the two groups. This is left as a future step of the study.

Estimation from China Labor-Force Dynamics Survey

The main estimation results in this paper are additionally backed up by estimations using a different dataset, the China Labor-force Dynamics Survey (CLDS). It is a nationally representative biennial cross-sectional survey launched by Sun Yat-Sen University. National surveys were conducted in 2012, 2014, and 2016. In this study, data from both 2012 and 2014 are used. Individual-level surveys cover key variables and information on individual birth month, educational attainment, and fertility outcomes. However, data on infant health are not available. As presented in appendix Table 12, linear regression discontinuity returns a significant discontinuity of 0.79 in completed years of schooling for females older than 23 years old, after controlling for the covariates. There is no evidence of a jump at the cutoff for younger females. This result is consistent with that from CFPS. Furthermore, appendix Table 13 presents estimations on fertility outcomes. Whether for mothers of all age groups or mothers older than 23, the school entry policy induces no significant effect on probability of motherhood, age at first birth, or mothers’ ideal number of children. The sign and significance of the estimates match the RD estimation with CFPS as well, which supports the validity of my results.

Pre-Policy Estimation

Pre-policy estimations serve as support for the policy's effectiveness. The assumption is that, using the same RD method, there is no effect before the policy. I performed the analysis using the sample of females born prior to 1981. Figure 8 shows that there is no discontinuity around the threshold for either the first stage or the reduced form estimate on years of schooling. That is, the birth months of individuals predicted different school entry ages and had a causal effect on educational attainment only after the CEL took effect.

Fig. 8

Pre-policy RD Estimations, CFPS. This figure shows the pre-policy test of the first stage and reduced form effect on completed years of schooling. Means, linear fitted values, and 95% confidence intervals are presented for each plot. Discontinuities and standard errors are displayed on the lower left corner of each plot. Data are from CFPS, wave 2010 to 2016. The sample is restricted to individuals with a birth year earlier than 1980

Conclusion

This paper explores the causal effect of female education on fertility outcomes and infant health through two methods, providing the LATE with variations around the policy cutoff and the LATE for the compliers across the calendar year. Estimates using these two different methods are close to each other in both magnitude and significance level. Overall, I find that (1) the entry cutoff date creates a 0.21-year difference in age at primary school entry: females born after the cutoff enter school at a later age. (2) The reduced form effect on years of schooling is only significant for females or mothers older than 23 years old; according to the estimate, the policy induces 0.59 more completed years of schooling on average. It also means that the CEL does not disrupt the short-term education decisions of younger females. (3) In either fuzzy RD or IV, the results indicate that female education plays a limited role in fertility decisions, such as probability of motherhood and maternal age at first birth, or in labor market outcomes. (4) For infant health, the educational policy is beneficial, as it reduces infant low birthweight by about 3.5 percentage points and reduces infant number of sickness by 1.52 times. (5) In assortative matching, more highly educated females are also shown to pair with more highly educated males, whose education is 1.15 years higher. (6) In the heterogeneity analysis by area and by Hukou status, the effects are most salient for individuals from the most developed areas or with non-agricultural origin. The larger effects potentially come from the higher educational attainment in the most developed areas or the augmented income channel for females with non-agricultural Hukou. The two other by-area groups that do not experience the intergenerational spillover effects also do not have a strong or significant first stage or effect on female education. This can alternatively be viewed as a placebo test showing that the health effect does not exist when there is not enough evidence of variation in school entry age or female education. (7) The RD is robust to changing bandwidths and passes the specification checks.

The paper adds to the existing literature insofar as it is the first quasi-experimental study on this topic in the context of China, and it uses two different approaches. The conclusions could provide some insights regarding female education and human capital for policymakers and parents. When females enter school at a later age, they attain more education, which suggests that starting school when individuals are intellectually more mature is beneficial in the long term. Furthermore, this study shows an evident decrease in infant low birthweight and infant illness. The reasons behind this could be the increase in either health knowledge or income through education. Recently, there have been debates about policy changes regarding the strict cutoff in the CEL of China, which policymakers have made more flexible.Footnote 44 In light of this, school starting age is becoming a vital decision that parents need to make, for both the short- and long-run gains of their children. Additionally, as no intergenerational health effect is found in modestly-developed and less-developed areas, where the compliance rate and the change in educational attainment are limited, female education in these areas should be strengthened and be an area of focus in future policies. Last but not least, as China has been introducing two- and three-child policies, as recently as May 2021, because of its aging population, the paper also sheds light on potential near-future policies to improve female education and newborn health, which will be beneficial to society as a whole.