1 Introduction

Countries worldwide have devoted much effort and resources to family planning programs (Bongaarts 2009). Most of these programs have been voluntary, but some have left little choice to parents, such as China’s one-child program and India’s sterilization camps. A major assumption underlying these programs is that “a small family is a happy family”,Footnote 1 or that a reduction in family size enables families to raise investments in human capital per child, leading in turn to a stronger economy (Bongaarts 2009). Intuitively, the assumed causality between small family sizes and high schooling attainments makes sense: dividing scarce resources among fewer children leaves each child with more resources.

This intuition found support in well-known social science research. Blake (1989), studying US families, famously concluded that children from one- and two-child families are better educated and more successful than children in larger families because their parents have more time and money to invest in them. This “resource dilution model” was backed up by economic theory, in particular in a pioneering paper written by Becker (1960), in which the quantity and quality of children are modelled as substitutes from the parents’ point of view.

However, there also exist theories that support a positive causal effect of family size on children’s schooling. These theories break with Blake’s and Becker’s assumptions that children only imply a cost to parents, and that more children imply higher costs. As such, the quantity-quality trade-off needs no longer hold when allowing for part-time child work (Mueller 1984a; Marteleto and de Souza 2013), or when older children work to provide for the younger ones (the so-called ‘chain arrangement’), or when there are economies of scale in raising children, with children sharing clothes, text books, transport to school or knowledge and skills (Guo and Van Wey 1999; Rosenzweig and Zhang 2009; Steelman et al. 2002; Qian 2009). Economies of scale can also be present in household chores, such that the time each child spends on chores reduces with the number of siblings, thus freeing up time for school.

Despite the diversity of theoretical predictions, it is hard to move away from the idea of a negative causal relation. An important reason for its stickiness lies in the strong negative correlation between family size and children’s schooling, and the difficulty to empirically distinguish this correlation from the actual causal effect of family size on schooling. To make this distinction, one needs to purge the correlation of confounding factors. Most importantly, parents’ characteristics determine preferences for both the number of children and their years of schooling. For instance, mothers who enjoyed more years of schooling generally prefer smaller families and, at the same time, are likely to give more importance to their children’s schooling. Other confounding factors include wealth, social norms regarding fertility and child labour, labour market opportunities for adults and children and the availability and quality of old-age security schemes and education policies (Rosenzweig and Wolpin 1980; Angrist et al. 2010; Black et al. 2010). To the extent that these confounding factors are not perfectly observed and controlled for, the estimated relation between family size and children’s schooling is plagued by endogeneity issues.

In this paper, we aim to remedy this endogeneity problem in a sample of children from 208,729 households across 34 sub-Saharan African countries. In 3844 of these households, twins were born, causing a quasi-exogenous increase in household size. Provided that one controls for certain mother characteristics (Smits and Monden 2011; Bhalotra and Clarke 2016)Footnote 2 and for the health condition of twins (Rosenzweig and Zhang 2009),Footnote 3 twin birth can be used as a plausibly exogenous instrumental variable to isolate the causal effect of family size on the educational outcomes of children born prior to the twin birth (Angrist et al. 2010; Conley et al. 2012). The same does not apply to children born after the twins, because their birth can be the result of parental choice. Concretely, in our instrumental variable (IV) approach, we look at the outcomes of firstborn children in families of two or more children, of first- and second-born children in families of three or more children, and of first-, second- and third-born children in families of four or more children, using the birth of twins at the second, third and fourth birth, respectively, as the instrumental variable.

Our empirical investigation adds to the body of literature that has tried to empirically unearth the quantity-quality trade-off by relying on a number of techniques, such as tracing children’s intellectual abilities in a longitudinal analysis of families (Guo and Van Wey 1999) and exploiting the gradual roll-out and subsequent relaxation of family planning programs (Liu 2014; Qian 2009), a randomized controlled trial in family planning (Sinha 2005; Joshi and Schultz 2007) and instrumental variable approaches based on reported miscarriages (Maralani 2008), siblings’ sex composition (Angrist et al. 2010; Conley and Glauber 2006; Black et al. 2010; Fitzsimons and Malde 2014; Lee 2008) and twin birth (Rosenzweig and Wolpin 1980; Rosenzweig and Zhang 2009; Angrist et al. 2010; Black et al. 2005, 2010; Marteleto and de Souza 2012, 2013; Mogstad and Wiswall 2016; Bhalotra and Clarke 2016). These exercises in causal identification have not uniformly yielded negative estimates of the effect of sibship on schooling. Instead, the effect turns out to vary over time, across regions and subpopulations, across birth order and across the exact outcome of interest studied (e.g. private schooling, school enrolment, educational attainment or IQ).Footnote 4

None of these exercises in causal identification has, however, looked specifically at sub-Saharan African countries.Footnote 5 Our paper fills this gap. There are several reasons why sub-Saharan Africa (SSA) provides an interesting setting for such analysis. First, in SSA, the majority of households face tight budget constraints, schooling is barely compulsory and children’s participation in the labour market and in time-consuming household chores is still largely socially accepted (Bass 2004). Combined, these features make it very likely that a household’s decision to invest in children’s formal education involves important trade-offs. Second, in most African cultures, family members are bound to act for the benefit of the collective, be it the nuclear family or the extended family, the clan or ethnic group (Lloyd and Blanc 1996). Regarding the decision to invest in schooling, this implies that the benefits of schooling are expected to be shared, giving way, for instance, to the chain arrangement in which earlier-born children are sent to school and use their wage earnings to invest in their younger siblings later on, rendering a quantity-quality trade-off superfluous (Baland et al. 2016; Mueller 1984b). Third, in many of the least developed regions in SSA, the quality of schooling may be low (e.g. Chaudhury et al. 2006), or labour market opportunities may be lacking (e.g. Garcia and Fares 2008), both of which depress the returns to education. Hence, additional schooling may not be a way to invest in children’s quality. Finally, SSA is still the region with the highest fertility and lowest educational enrolments and attainments,Footnote 6 increasing the relevance of research on these issues.

Our use of Demographic and Health Survey (DHS) data comes with both pros and cons. Among the pros, we first count the availability of demographic and health data of mothers, allowing us to explicitly control for factors such as ethnicity and mothers’ height and health that are likely to affect twinning. In doing so, we further purge this instrument of potential sources of endogeneity. Second, the detailed information on children’s health allows us to verify that parents do not allocate resources away from twins (who may suffer from poorer health at birth) towards older singleton-birth children, thus providing further confidence in our instrument. Third, the DHS data allow us to distinguish between three distinct proximate causes that underlie differences in educational attainment: school enrolment, school starting age and dropout. Among the cons, we face the constraint that, beyond the fourth birth order, there are insufficient observations on twin births in the DHS to provide enough power in the first stage. Our analysis therefore only focuses on the effect of family size on outcomes of children of the first, second and third birth order, and its findings cannot readily be generalized to siblings at higher birth orders (Booth and Kee 2009; Qian 2009; Mogstad and Wiswall 2016). Another point of attention is that the DHS focuses on mothers of childbearing age (15–49 years), such that the observed number of children may be below the eventual number of children and the reported level of schooling may be below the eventual schooling attainment. Consequently, the relation that we observe between sibship and schooling captures a snapshot in time of a process in motion, not its final stage. Furthermore, the DHS provides no systematic information on health for children above 5 years old, nor information on test scores or other proxies for the quality of schooling. We therefore limit our analysis to the quantity of schooling, measured by the number of years of schooling. Finally, as will be explained further on, we need to address the complexity of family structure in SSA, which includes polygamy and a non-negligible number of children living outside the household, often with extended family.

In the next section, we explain the empirical strategy. Then, we describe our data and present the results. In our results section, we find overall no evidence of a quantity-quality trade-off. In particular, we cannot reject the null hypothesis of no relation between family size and schooling in the subsamples of families with two or more, or four or more children, while we find a significantly positive effect of family size on schooling in the sample of families with three or more children.

2 Empirical strategy

In our empirical analysis, we use an age-standardized z-scoreFootnote 7 for the years of schooling as our main outcome variable. We consider three subsamples: firstborn from families with at least two children (2+ sample), first and second born from families with at least three children (3+ sample) and first, second and third born from families with at least four children (4+ sample). We first examine the relation between schooling and family size using ordinary least squares (OLS). Then, we respectively use twinning at the second, third and fourth birth to instrument the number of children in the ‘2+ sample’, ‘3+ sample’ and ‘4+ sample’. Focusing on the schooling outcomes of n − 1 children born prior to twins at the nth birth avoids selection problems that arise because “families who choose to have another child after a twin birth may differ from families who choose to have another child after a singleton birth” (Black et al. 2005). To ensure the validity of our instrument, we duly control for a battery of mother-level characteristics that may affect twinning and, at the same time, correlate with children’s schooling.

Concretely, the OLS specification takes the following form:

$$ {\mathrm{Education}}_{hm fi}={\beta}_0+{\beta}_1\ \mathrm{Number}\ {\mathrm{of}\ \mathrm{children}}_h+{\beta}_2{X}_{hm fi}+{\beta}_3{X}_{hm}+{\beta}_4{X}_{hf}+{\beta}_5{X}_h+{\varepsilon}_{hm fi} $$
(1)

where h indicates household, m mother, f father and i the individual child. $\mathrm{Education}_{hmfi}$ is equal to the child’s z-score. $\mathrm{Number\ of\ children}_h$ is a count variable that is equal to the total number of sons and daughters of the household head residing in the household. $X_{hmfi}$ is a vector of child-level characteristics that includes an indicator variable for sex, indicator variables for each age in the range 6 to 18, the child’s birth year and its month of birth. $X_{hm}$ is the set of mother-level characteristics including her years of schooling, age, age squared, height, religion, ethnicity,Footnote 8 the total number of her children who have died and whether a child of hers has died before its first birthday.Footnote 9 $X_{hf}$ is the set of father-level characteristics comprising his age and years of schooling. $X_h$ includes the household’s residence area (urban/rural) and wealth quintile.Footnote 10 To account for correlation of the residuals within households and communities, we cluster all error terms ($\varepsilon_{hmfi}$) at the DHS cluster level; DHS clusters correspond to villages in rural areas and city blocks in urban areas.Footnote 11

The second stage of the IV specification is captured by the following equation:

$$ {\mathrm{Education}}_{hm fi}={\delta}_0+{\delta}_1\ {\widehat{\mathrm{Number}\ \mathrm{of}\ \mathrm{children}}}_h+{\delta}_2{X}_{hm fi}+{\delta}_3{X}_{hm}+{\delta}_4{X}_{hf}+{\delta}_5{X}_h+{\omega}_{hm fi} $$
(2)

in which family size is instrumented in the first-stage equation:

$$ \mathrm{Number}\ {\mathrm{of}\ \mathrm{children}}_h={\alpha}_0+{\alpha}_1\ {\mathrm{Twin}}_h+{\alpha}_2{X}_{hm fi}+{\alpha}_3{X}_{hm}+{\alpha}_4{X}_{hf}+{\alpha}_5{X}_h+{\vartheta}_h $$
(3)

In the n+ sample, the indicator variable $\mathrm{Twin}_h$ is equal to 1 if the nth birth is a multiple birth and 0 otherwise.Footnote 12 The X vectors include control variables as previously defined. $\varepsilon_{hmfi}$, $\vartheta_h$ and $\omega_{hmfi}$ are the error terms.
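The logic of Eqs. (2) and (3) can be illustrated with a minimal sketch. It is not the estimation used in this paper: it omits all control vectors and the clustering of standard errors, uses a constructed toy dataset, and relies on the fact that with a single binary instrument the two-stage estimate equals the ratio of the reduced-form to the first-stage coefficient. All variable names and numbers are hypothetical.

```python
from statistics import mean

def cov(a, b):
    """Population covariance of two equally long lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

# Balanced toy design: twin status z is unrelated to the confounder c.
z = [0, 0, 1, 1] * 25   # twin at the nth birth (the instrument)
c = [0, 1, 0, 1] * 25   # unobserved parental preference for large families

# Family size rises with twinning and with the confounder.
size = [2 + zi + ci for zi, ci in zip(z, c)]
# Schooling: true effect of family size is +0.1; the confounder lowers schooling.
educ = [0.1 * s - 0.5 * ci for s, ci in zip(size, c)]

ols = cov(educ, size) / cov(size, size)   # naive slope, contaminated by c
first_stage = cov(size, z) / cov(z, z)    # effect of twinning on family size
reduced_form = cov(educ, z) / cov(z, z)   # effect of twinning on schooling
iv = reduced_form / first_stage           # IV (Wald) estimate of the size effect

print(f"OLS: {ols:.3f}  first stage: {first_stage:.3f}  IV: {iv:.3f}")
```

In this constructed example the OLS slope is negative even though the true effect is positive, because the confounder raises family size and lowers schooling at the same time; the instrument recovers the true effect only because it is, by construction, unrelated to the confounder.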

In our results section, we scrutinize the exclusion restriction of our instrument, relying on insights of Altonji et al. (2005), Conley et al. (2012) and Bhalotra and Clarke (2016). In a series of ten robustness checks (cf. infra), we modify this empirical framework in several ways, using alternative samples, changing the decision unit (from household head to wife/wives) and modifying the definition of our key variables.

In all cases, we report heteroscedasticity-robust statistics and the usual post-estimation tests (under-identification test, weak identification test and overidentification test).

3 Data

In our empirical analysis, we rely on all DHS rounds implemented in sub-Saharan African countries in the period 1990–2014, for which we could construct the main variables.Footnote 13 In our baseline approach, we consider 59 survey rounds for which information on the ethnicity of the mother is available (Appendix Table 11 gives an overview of these survey rounds by country and year) and restrict the sample to children whose siblings of schooling age (6–18) all reside within the household.Footnote 14 This gives a dataset of 456,068 siblings of schooling age (6–18), to which we will refer as ‘sample I’.

We focus on the educational attainments of 64,339 firstborn (2+ sample), 99,875 first- and second-born children (3+ sample) and 101,848 first-, second- and third-born children (4+ sample) in the age group 6 to 18, who live with their parents. The lower limit of 6 is the age at which many children in SSA start primary schooling. The upper limit of 18 is the age of secondary schooling completion, provided a swift grade progression. We do not extend the upper limit beyond 18 because post-secondary education in SSA still faces important supply side constraints, and because a considerable proportion of children above 18 live outside the household such that their schooling attainment goes unrecorded in the DHS rounds.

To capture the educational attainment of these children, we look at their completed years of schooling at the time of the survey and then construct an age-standardized education z-score, with children of the same country and birth cohort as the reference group. Our explanatory variable of interest is the number of children in a household. In our baseline specification, we define this variable as the total number of the household head’s sons and daughters residing in the household.
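As an illustration of the standardization just described, the following sketch computes z-scores of years of schooling within country-by-birth-cohort groups. The field names, the toy records and the use of the population standard deviation are assumptions made for the example; the exact procedure used in this paper may differ.

```python
from collections import defaultdict
from statistics import mean, pstdev

def age_standardized_z(children):
    """Return z-scores of years of schooling within (country, birth_year) groups.

    `children` is a list of dicts with the hypothetical keys
    'country', 'birth_year' and 'years_schooling'.
    """
    groups = defaultdict(list)
    for child in children:
        key = (child["country"], child["birth_year"])
        groups[key].append(child["years_schooling"])

    # Mean and (population) standard deviation of schooling per reference group.
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}

    z_scores = []
    for child in children:
        mu, sigma = stats[(child["country"], child["birth_year"])]
        # A group without variation is assigned 0 here, purely by convention.
        z = (child["years_schooling"] - mu) / sigma if sigma > 0 else 0.0
        z_scores.append(z)
    return z_scores

sample = [
    {"country": "A", "birth_year": 2000, "years_schooling": 2},
    {"country": "A", "birth_year": 2000, "years_schooling": 4},
    {"country": "A", "birth_year": 2000, "years_schooling": 6},
    {"country": "B", "birth_year": 2000, "years_schooling": 1},
    {"country": "B", "birth_year": 2000, "years_schooling": 3},
]
z = age_standardized_z(sample)
print([round(v, 3) for v in z])
```

A child with the group-average schooling gets a z-score of zero, so the outcome measures schooling relative to peers of the same country and cohort rather than in absolute years.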

Our instrumental variables are the birth of twins at the second, third and fourth birth order. In the DHS birth records, there is a specific variable indicating whether a child is part of a twinship or not. To determine birth order, we consider all children of the household head including those who do not have their mother in the household.
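The construction of the instrument and of the analytical samples can be sketched as follows. The data layout (a list of deliveries per family, each holding one or more child identifiers) and the function name are hypothetical, and the exclusion of families with a multiple birth before the nth delivery is an assumption of this sketch rather than a rule stated in the text.

```python
def twin_instrument(births, n):
    """Build the n+ sample entry for one family.

    `births` lists deliveries in birth order; each delivery is a list of
    child ids (one id for a singleton, two or more for a multiple birth).
    Returns (twin_at_nth, older_children), or None if the family is not
    part of the n+ sample.
    """
    if len(births) < n:
        return None  # the family never reached an nth birth
    earlier = births[: n - 1]
    if any(len(delivery) > 1 for delivery in earlier):
        return None  # assumption: earlier births must all be singletons
    twin_at_nth = 1 if len(births[n - 1]) > 1 else 0
    # Outcomes are studied only for the n - 1 children born before the nth birth.
    older_children = [delivery[0] for delivery in earlier]
    return twin_at_nth, older_children

family = [["c1"], ["c2"], ["c3a", "c3b"]]  # twins at the third birth
print(twin_instrument(family, 3))
print(twin_instrument(family, 2))
print(twin_instrument(family, 4))
```

The same family can thus enter the 2+ sample with the instrument switched off and the 3+ sample with the instrument switched on, which is why the three samples are analysed separately.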

The summary statistics of the principal variables in our analysis can be consulted in Table 1. The summary statistics of the other variables can be consulted in Table 12 of the Appendix.

Table 1 Sample means and proportions of main variables

4 Baseline results

4.1 Baseline OLS and IV estimation

The estimation of the OLS model (Eq. (1)) yields a negative and significant relation between family size and children’s schooling. As shown in Table 2, the estimated coefficient is rather small in magnitude: one additional child is associated with a reduction of about 0.023 units of the z-score, which corresponds to 0.057 years of schooling for a child of 10 years of age.Footnote 15 As this result does not isolate a causal link between our variables of interest, we turn to our IV estimations.

Table 2 OLS estimates

The estimates of the first stage (Eq. (3)) are shown in the second panel of Table 3. Unsurprisingly, they indicate that twinning increases average family size. The effect of twinning ranges from an additional family size of 0.356 (in the 4+ sample) to 0.509 (in the 3+ sample). The coefficients are precisely estimated, significant at 1% and similar to the range of coefficients found in previous research.Footnote 16 In all cases, the twin instrument has reassuring first-stage post-estimation statistics, with the Cragg-Donald Wald F statistic well above 100.

Table 3 IV estimates of the effect of family size on children schooling

In contrast to the OLS estimates, which are uniformly negative in all three subsamples, the second-stage IV estimates (first panel of Table 3) indicate either no impact (in the 2+ sample and 4+ sample) or a positive and significant effect (at 5%) of family size on education z-scores (in the 3+ sample). In the 3+ sample, a 1-unit increase in predicted family size increases the z-score by 0.097 units on average, which is equivalent to 0.240 years of schooling for a child of 10 years of age. We demonstrate the robustness of these IV results in Section 5 and tentatively explore plausible explanations in Section 6.
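The conversions from z-score units to years of schooling reported for the OLS and IV estimates can be cross-checked against each other. The following back-of-the-envelope calculation is not part of the original analysis; it assumes that both conversions use the same standard deviation of years of schooling among 10-year-olds, denoted here by $\sigma_{10}$ (a symbol introduced only for this illustration):

$$ {\sigma}_{10}\approx \frac{0.057}{0.023}\approx 2.48\ \mathrm{years},\kern2em 0.097\times 2.48\approx 0.240\ \mathrm{years} $$

Under that assumption, the two reported conversions are mutually consistent and imply a standard deviation of roughly two and a half years of schooling at age 10.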

4.2 On instrument validity

A concern when implementing the IV estimation is the violation of the exclusion restriction. The exclusion restriction may be threatened because of the presence of confounding factors that affect both twinning and children’s schooling. In particular, besides a mother’s age, ethnicity and height, less easily measurable characteristics such as her general health condition may also affect the probability of twinning (Smits and Monden 2011). In theory, a mother’s health behaviour, in particular smoking and multiple-birth-enhancing fertility treatments, could pose another threat to our instrument’s validity, but in practice, this threat is neutralized in SSA due to their prohibitively high (social and monetary) cost (Inhorn 2003).

Table 4 explores the determinants of twinning in our sample. The first column shows the results of a regression of twinning on the mother-level characteristics included in our baseline model: her years of schooling, age, age squared, height, ethnicity, religion, the total number of her children who have died and a dummy variable capturing whether these children died before their first birthday. Out of these eight characteristics, only mother’s religion does not significantly affect the probability of twinning at third birth. However, the overall explanatory power of the regression is very low (R² of 0.009). The second column adds four mother-level regressors: mothers’ body mass index (BMI), access to prenatal health care, access to a doctor and access to a nurse.Footnote 17 All four additional regressors turn out to be significant, while mother’s education becomes non-significant.

Table 4 Determinants of twinning

Based on subsamples of individuals for which all four additional control variables are available, Table 5 shows how the estimated coefficient of interest changes when these four control variables are added to our baseline controls. Using the baseline specification, the estimates are 0.003 in the 2+ sample (panel A, column I), 0.137** in the 3+ sample (panel B, column I) and −0.043 in the 4+ sample (panel C, column I). When adding the four additional mother-level controls, our coefficient of interest only slightly decreases in all three panels, to −0.005, 0.126** and −0.061, respectively. Thus, without the additional controls, our estimates are (slightly) biased upward, which is expected if the included mother characteristics positively correlate with both twinning and children’s schooling. In the spirit of Altonji et al. (2005), however, we argue that, if observed additional controls change the value of our estimated coefficients only to such a small extent, it is unlikely that there exist unobserved controls that would turn our results upside down.Footnote 18
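To make the size of these changes explicit, the short sketch below computes the shift in each coefficient when the four extra controls are added. It uses only the point estimates quoted in the text; it is a descriptive check, not the formal procedure of Altonji et al. (2005), and the variable names are chosen for illustration.

```python
# Coefficients on family size quoted in the text for Table 5.
baseline = {"2+": 0.003, "3+": 0.137, "4+": -0.043}   # baseline controls
extended = {"2+": -0.005, "3+": 0.126, "4+": -0.061}  # plus four extra controls

# Change in the coefficient when the extra mother-level controls are added.
shift = {s: round(extended[s] - baseline[s], 3) for s in baseline}
print(shift)
```

All three shifts are negative and below 0.02 z-score units in absolute value, which is the sense in which the estimates are described as only slightly biased upward.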

Table 5 Stability of the estimated coefficient of interest and post-estimation statistics when expanding the set of mother-level control variables

To further safeguard our results, we follow Conley et al. (2012) and Bhalotra and Clarke (2016) in deriving bounds for our coefficient of interest using the ‘plausexog’ command in Stata. To do so, we first acquire insight into the direct effect of twinning on children’s schooling (γ) by simply comparing education z-scores of children from twin mothers to those of children from non-twin mothers, controlling for the set of variables in our baseline specification (see Table 13 in the Appendix). We then take the upper value of the estimated 95% confidence interval, which is 0.005.Footnote 19 The standard deviation of γ (sd = 0.007) is obtained from a 100-replication bootstrap estimation of γ. Table 6 shows our bounds’ estimates of the family size effect on children’s schooling using Conley’s union of confidence intervals (UCI) and local to zero (LTZ) approaches. The results point to a rejection of a quantity-quality trade-off in all three samples and a confirmation of the positive effect of family size on children’s schooling in the 3+ sample.
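The idea behind the UCI approach can be sketched for the simplest case of one endogenous regressor and one instrument. If the instrument has a direct effect γ on the outcome, the IV estimate shifts to the original estimate minus γ divided by the first-stage coefficient; the bounds are the union of the confidence intervals over the assumed range of γ. The sketch below is not the ‘plausexog’ implementation: it holds the standard error fixed across values of γ (a simplification), and the standard error of 0.04 and the lower limit of 0 for γ are hypothetical placeholders. Only 0.097, 0.509 and 0.005 are taken from the text.

```python
def uci_bounds(beta_iv, se, first_stage, gamma_min, gamma_max,
               steps=50, crit=1.96):
    """Union of confidence intervals for a just-identified IV model.

    For a direct effect gamma of the instrument on the outcome, the IV
    estimate becomes beta_iv - gamma / first_stage. The standard error
    is held fixed across gamma values, which is a simplification.
    """
    lows, highs = [], []
    for k in range(steps + 1):
        gamma = gamma_min + (gamma_max - gamma_min) * k / steps
        beta_gamma = beta_iv - gamma / first_stage
        lows.append(beta_gamma - crit * se)
        highs.append(beta_gamma + crit * se)
    # The union of overlapping intervals runs from the smallest lower
    # limit to the largest upper limit.
    return min(lows), max(highs)

# beta_iv and first_stage: 3+ sample estimates quoted in the text.
# gamma_max: reported upper value for gamma.
# se and gamma_min: hypothetical placeholders for illustration only.
low, high = uci_bounds(beta_iv=0.097, se=0.04, first_stage=0.509,
                       gamma_min=0.0, gamma_max=0.005)
print(f"UCI bounds: [{low:.4f}, {high:.4f}]")
```

Because the first stage is strong, even the largest assumed direct effect moves the point estimate by only about 0.01, so the lower bound stays above zero under these illustrative inputs.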

Table 6 Bounds’ estimates of family size effect on children schooling using Conley’s UCI and LTZ approaches

Another potential violation of the exclusion restriction stems from parental behaviour towards twins due to twins’ lower average birth weight and the potential consequences for their health or cognitive achievements later on (Rosenzweig and Zhang 2009). If the future earning potential of twins is thought to be lower, parents may divert resources from twins to singletons. Rosenzweig and Zhang (2009) suggest including twins in the analytical sample and including birth weight in the regression in order to control for this potential bias. This would indeed be a straightforward solution, were it not that the DHS only includes birth weight for children under 5 years old. As a second best, we verify whether this ‘diversion of resources’ hypothesis is supported in the sample of under-5-year-old children.

Using information on 70,902 under-5-year-old children in our 59 DHS waves, we regress birth weight (in log terms) on a twin indicator and find that twins indeed have a significantly lower average birth weight than singletons (see column I of Table 14 in the Appendix).Footnote 20 When looking at the weight (column II) and body mass index percentile (column III) of under-5-year-old children at the time of the survey, we still find that twins have significantly lower weight and BMI than singletons.Footnote 21 However, the estimated coefficient is smaller, indicating that the gap has become smaller over time. In fact, when looking at the BMI, twins and singletons belong to the same decile on average.Footnote 22 The closing of the gap suggests that parents do not divert resources away from twins. Furthermore, controlling for the entire set of age dummies, wealth quintiles, region of residence, child sex, parental education and ethnicity fixed effect, we find that twins enjoy as much education as singletons (column IV of Table 14 in the Appendix).

4.3 Heterogeneity

As mentioned in the introduction, the studies that have set out to identify the causal relationship between family size and schooling have produced mixed results, suggesting that the relation is context dependent. Hence, we explore the heterogeneity of our result, by running separate regressions across subsamples with respect to gender, poverty status and region.

To explore regional variation in the estimated relation, we contrast West and Central Africa with East and Southern Africa. This division is inspired by the regional clustering of the total fertility rate (TFR), which is, on average, relatively high in West and Central Africa (5.09) and lower in East and Southern Africa (4.45). To compare across poor and non-poor, we define households in the first and the second asset ownership quintiles as poor and those in the fourth and the fifth quintiles as non-poor. We discard the third quintile to achieve a sharper contrast between poor and non-poor. The results are lined up in Table 7. In our discussion below, we highlight the various significant coefficient estimates.

Table 7 Subsample analyses with respect to gender, asset wealth and region

When only looking at the schooling of boys, we find a sizeable and significantly positive coefficient estimate in the 3+ sample (0.122**). For girls, the estimated coefficient is found to be negative and (slightly) significant (− 0.183*) in the 4+ sample. In the subsample of poor households, the estimated coefficient is insignificant across the board. For the subsample of non-poor, we observe a significantly positive estimate in the 3+ sample (0.139***). In the regional subsamples, the estimated family size coefficient appears to be only slightly significant (and positive) in the 3+ sample in West and Central Africa (0.108*). It remains non-significant elsewhere.

When comparing the effect of family size on schooling across countries with persistently high fertilityFootnote 23 and countries with either a low fertility or a downward fertility trend,Footnote 24 we find a non-significant effect in low-fertility countries in all of the 2+, 3+ and 4+ samples, but a positive and marginally significant effect (0.128*) in the 3+ sample of high-fertility countries.

Finally, we explore whether the relation between family size and schooling has changed over time. To do so, we focus on the year 2000, which marked the adoption of the Education For All initiative by 189 countries.Footnote 25 We compare the effect of family size on schooling across children born prior to 2000 and children born from 2000 onwards. In none of the samples do we find a differential effect of family size over time, but our results indicate that the association of parents’ education, gender and residence area with children’s educational outcomes has weakened considerably over time, suggesting a democratization of schooling (see Table 8).

Table 8 Effect of family size across older and younger cohorts

In sum, in our various subsample analyses, the zero result is confirmed for the 2+ sample and the 4+ sample (with the notable exception of the subsample of girls in the 4+ sampleFootnote 26), while the positive result for the 3+ sample is shown to be mainly driven by children from non-poor families, living in high-fertility countries. We will come back to this latter result in Section 6. Now, we first discuss a series of robustness checks.

5 Robustness checks

We check the robustness of our results in ten different ways. Table 9 gives a line-up of the estimated coefficients on family size for each check in the 2+, 3+ and 4+ samples. The full results are reported in Tables 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 of the Appendix, for the total sample, as well as for the poor and non-poor subsamples.

Table 9 Summary of robustness checks

In the first robustness check, we follow Angrist et al. (2010) in allowing for heterogeneity across subsamples in the predictive power of twin birth in the first stage. We do so by including interaction terms in our first-stage regressions between twin birth and a set of indicator variables (rural, mother is Muslim, rural West and Central Africa), thus sequentially adding the following regressors: $\mathrm{Twin}_h\times \mathrm{Rural}_h$, $\mathrm{Twin}_h\times \mathrm{Mother\_Islam}_m$ and $\mathrm{Twin}_h\times \mathrm{Rural\ West\ and\ Central\ Africa}_h$. The rationale for including these regressors is that fertility is higher in rural areas (compared to urban areas), in Muslim families and in West and Central Africa (compared to East and Southern Africa), and the twin instrument tends to perform less well in larger families (Angrist et al. 2010).Footnote 27 When including all three interaction terms, the zero result remains in the 2+ and 4+ samples, while we still find a positive and significant effect in the 3+ sample.

In the second robustness check, we use an unrestricted specification with the ‘partially missing instruments’ method as described in Mogstad and Wiswall (2016, pp. 174–175). The partially missing instruments are constructed based on a polynomial function of mother’s age, mother’s education, father’s age and father’s education, controlling for child characteristics, household characteristics, mother’s ethnicity and her health conditions. This change in specification does not alter our results: family size remains insignificant in the 2+ and 4+ samples and significantly positive in the 3+ sample, in particular when measured by the indicator variable ‘more than 3 children’, which captures the marginal effect of moving from three to four children.

Third, instead of restricting the sample to those children that are part of households where all children reside within the household, we expand the sample to include also children that live in households where one or more school-aged siblings reside outside the household. This approach yields a 2+ sample of 68,259 children, a 3+ sample of 112,285 children and a 4+ sample of 119,544 children. We find that the coefficient on family size is still positive and slightly significant in the 3+ sample and positive and non-significant in the 2+ sample but turns negative and slightly significant in the 4+ sample.

Fourth, recognizing the complexity of SSA households, we change the decision unit. Among the 525,646 children in our sample I, we count 165,418 living in polygamous households. While our baseline approach considers the household head as the unit of decision making, in this robustness check, we assume decisions to be taken at the level of each mother. In this decentralized approach, the number of children is defined for each of the household head’s wives as her total number of sons and daughters living in the household. Birth order is also defined at the level of the mothers. Doing so, we find that family size loses its significance in the 3+ sample and remains non-significant in the 2+ and 4+ samples.

In checks 5, 6 and 7, we use alternative definitions of our key variables. Instead of defining the number of children as the sons and daughters of the household head, we define it as the total number of births to the household head’s wives (unless the household head is female). To reduce measurement error in our schooling variable (for instance, parents reporting years in kindergarten as schooling), we base education z-scores on censored years of education.Footnote 28 Finally, robustness check 7 provides estimates using completed grades (years of schooling) rather than education z-scores. Our findings remain similar in all three cases: no significant effect in the 2+ and 4+ samples and a positive and significant effect in the 3+ sample.

In robustness checks 8 and 9, we use region of residence of the household and country-by-urban/rural fixed effects, respectively, instead of mother’s ethnicity fixed effects, in order to include DHS rounds in which ethnicity is not recorded (e.g. Rwanda, Burundi).Footnote 29 In the former case, the estimated coefficient loses significance in the 3+ sample. In the latter case, it remains positive and significant in the 3+ sample. In both cases, the coefficients in the 2+ sample and the 4+ sample remain non-significant.

Finally, we use an alternative definition to discriminate between poor and non-poor families, defining the poverty line as the average value of the wealth index in each DHS round. This last approach confirms the positive effect observed only in non-poor families of the 3+ sample and the absence of effect in both poor and non-poor families across the 2+ and 4+ samples.
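The round-specific poverty line can be illustrated with a short sketch. The keys `id`, `dhs_round` and `wealth_index` are hypothetical names, and two details are assumptions rather than statements from the text: the average is taken over the households listed, and a household exactly at the average is classified as non-poor.

```python
from collections import defaultdict
from statistics import mean

def classify_poor(households):
    """Flag households below their DHS round's mean wealth index.

    `households` is a list of dicts with hypothetical keys:
    'id', 'dhs_round', 'wealth_index'.
    """
    by_round = defaultdict(list)
    for hh in households:
        by_round[hh["dhs_round"]].append(hh["wealth_index"])
    # One poverty line per round: the round-specific average wealth index.
    poverty_line = {rnd: mean(values) for rnd, values in by_round.items()}
    # Assumption: strictly below the line counts as poor.
    return {
        hh["id"]: hh["wealth_index"] < poverty_line[hh["dhs_round"]]
        for hh in households
    }

# Made-up wealth index values for two hypothetical DHS rounds.
toy_households = [
    {"id": "h1", "dhs_round": "A", "wealth_index": -1.0},
    {"id": "h2", "dhs_round": "A", "wealth_index": 0.0},
    {"id": "h3", "dhs_round": "A", "wealth_index": 2.0},
    {"id": "h4", "dhs_round": "B", "wealth_index": 0.5},
    {"id": "h5", "dhs_round": "B", "wealth_index": 1.5},
]
flags = classify_poor(toy_households)
```

Because the line is computed separately for each round, the same wealth index value can fall on different sides of the poverty line in different rounds.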

Overall, our results remain fairly robust in all three subsamples. The zero result in the 2+ sample holds across all ten robustness checks, while the coefficient in the 4+ sample becomes significantly negative in only one case (inclusion of households in which some school-aged children reside outside the household). In the 3+ sample, the positive coefficient is no longer significant in only two cases (the decentralized approach and the use of region-of-residence instead of mother’s ethnicity fixed effects). When running the robustness checks on the non-poor sample, the estimated coefficient on family size in the 3+ sample remains positive and significant across the board.Footnote 30

Taken together, these results bolster the case against a quantity-quality trade-off in SSA, when quality is measured as educational attainment. At the same time, however, we note the heterogeneity of coefficient estimates, not only across gender, asset wealth and region but also across the 2+, 3+ and 4+ samples. Whether or not this heterogeneity is a statistical artefact needs to be determined by future work.

6 Mechanisms

To further guide future work, we exploit the DHS data to provide some clues about the possible mechanisms underlying our results and their heterogeneity.

First, we use the DHS data to distinguish between three proximate causes that underlie differences in educational attainment: school enrolment, school starting age and dropout. We explore these proximate causes in all three analytical samples, by estimating Eqs. (2) and (3) with school enrolment, school starting age and dropout as dependent variables. Table 10 lines up the coefficients of interest. The full results are given in Tables 25–31 of the Appendix.

Table 10 Exploring the proximate cause underlying the positive effect of family size on schooling

When using the total sample and the subsample of poor households, we find a zero effect of family size on enrolment, dropout and school starting age in the 2+, 3+ and 4+ samples, with one exception (in the total 3+ sample, dropout among the second born is slightly reduced). For the subsample of non-poor households, we find several significant coefficients. In the 2+ sample, the firstborn’s school starting age is reduced (− 0.499*) with an exogenous increase in family size. In the 3+ sample, it is the second born who seem to start school earlier. A closer examination of this effect reveals that it is largest and significant when the second born is relatively close in age to the firstborn (3 years apart or lessFootnote 31) (see column III of Table 10). In the 4+ sample, we find that family size reduces the probability of enrolment of the second born (− 0.072*). For the third born, results show a reduction in the probability of dropout (− 0.076*).Footnote 32

Overall, this tentative exploration of the proximate causes suggests that, in response to an exogenous increase in family size at birth order n, relatively small and wealthy households tend to send the (n − 1)th child to school earlier, a finding that is not replicated in poor households. This could indicate that, when financially possible, some households opt to speed up the schooling of earlier-born children upon a twin birth. Whether this finding can be broadly replicated, and whether it is explained by an attempt on the part of parents to maximize economies of scaleFootnote 33 or simply to relieve the caregiver so he/she can focus on younger siblings, is a question left for future research. The non-linearity of economies of scale (see e.g. Holmes and Tiefenthaler 1997; Tiefenthaler 1997) together with the negative effect of reduced care time on children’s cognitive abilities (Lehmann et al. 2018) may account for the heterogeneity across the 2+, 3+ and 4+ samples.

Besides economies of scale, the introduction of the article mentioned three other mechanisms that could explain the absence of a quantity-quality trade-off or even a positive effect of family size on schooling: child labour, the chain arrangement and support from the extended family. While we do not have the data to explore the likelihood of the latter two mechanisms, we tentatively discuss (and dismiss) the role of child labour.

Child labour, both at home and in the labour market,Footnote 34 is still common in many sub-Saharan African countries, but the group of working children is increasingly made up of children who combine part-time employment and schooling (Guarcello et al. 2015). The combination of work and schooling may allow child labour to contribute to schooling rather than crowd it out, by providing resources for school fees, the children’s own as well as those of their siblings. Should child labour explain the positive effect of family size in the 3+ sample, we would however expect the effect to be larger in poor households and smaller in non-poor ones, given that the latter rely less on resources provided by children. Instead, we find the reverse. To test more formally for the child labour mechanism, we exploit the child labour information available in a subset of the DHS rounds.Footnote 35 If child labour contributes to schooling and explains the positive effect (or zero effect) of family size, the estimated coefficient on family size would be reduced after controlling for child labour in our model. Results in Table 32 of the Appendix show that, rather than attenuating the positive effect of family size, the inclusion of child labour (both the child’s own labour and that of his/her siblings) slightly reinforces the positive effect in the 3+ sample, while it leaves the estimated coefficients in the 2+ sample and the 4+ sample almost unaltered.

7 Conclusion

The aim of this study was to test the quantity-quality trade-off in SSA, focusing on children’s schooling. To do so, we investigated how an increase in family size affects schooling using twin birth as an instrumental variable to deal with endogeneity issues. Overall, we find no significant effect of family size on children’s schooling, thus casting doubt on the generally assumed negative relation between family size and schooling.
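The logic of the twin-birth instrument can be illustrated with the Wald estimator, which is the instrumental-variable estimate for a binary instrument when there are no covariates. This is a simplified sketch and not the specification estimated in this study: it omits the fixed effects discussed above and any other covariates, and the function name and the numbers are made up for illustration only.

```python
from statistics import mean

def wald_iv(outcome, family_size, twin):
    """Wald (binary-instrument IV) estimate of the effect of family size.

    Ratio of the reduced-form difference in mean outcomes to the
    first-stage difference in mean family size, twin vs. non-twin.
    """
    y1 = mean([y for y, z in zip(outcome, twin) if z == 1])
    y0 = mean([y for y, z in zip(outcome, twin) if z == 0])
    x1 = mean([x for x, z in zip(family_size, twin) if z == 1])
    x0 = mean([x for x, z in zip(family_size, twin) if z == 0])
    first_stage = x1 - x0
    if first_stage == 0:
        raise ValueError("instrument does not shift family size")
    return (y1 - y0) / first_stage

# Made-up toy data: schooling of an earlier-born child, family size,
# and whether a later birth in the family was a twin birth.
schooling = [8, 6, 7, 5]
size = [4, 4, 3, 4]
twin_birth = [1, 1, 0, 0]
estimate = wald_iv(schooling, size, twin_birth)
```

The estimate is only informative when the first stage is non-zero, that is, when twin births actually raise family size; the function raises an error otherwise.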

In the subsample of first and second born from relatively rich households with three or more children, we find a positive effect of family size on schooling, and this effect survives various robustness checks. To rule out that this result is a statistical artefact, its replication in other samples is required.

In a tentative exploration of the underlying mechanisms, it emerges that upon a fertility shock at the nth birth, relatively small and rich families tend to send their (n − 1)th born to school relatively early. Doing so may optimize economies of scale in schooling and/or relieve the caregiver, although both the replicability of the finding and its explanation need further study.

Exploring the heterogeneity of our results, we find that the effect of family size does not vary substantially across time. This is in sharp contrast to the role of other factors such as parental education, gender and residence area (urban/rural), which decreased significantly over time, in line with the ‘democratization’ of education. Only in the 3+ sample do we find a regional difference: family size positively impacts children’s schooling in countries with persistently high fertility, while we find no effect in countries with low or declining fertility.

Our research suffers from a number of limitations. First, the positive impact of family size on schooling of first and second born children in the 3+ sample cannot readily be generalized to higher-parity children, as is clear from our results in the 4+ sample. Second, the DHS provides only a snapshot in time of children whose mothers are of childbearing age (15–49 years). The number of children observed as well as their schooling attainment therefore reflect only an intermediate situation, not the final one. This leaves open the possibility that, in the longer run, the positive effect of family size on schooling in non-poor households with three or more children (driven by early enrolment of the second born) fades away. Third, the available data are not well suited to distinguish between competing mechanisms underlying the differential effect of family size across samples. The short-term horizon does not allow for explicit testing of the chain arrangement. And, lacking more detailed data on household consumption and transfers, we cannot thoroughly test for the economies of scale and extended family mechanisms. Finally, the number of years of schooling is only one way in which parents can invest in their children. Important omitted dimensions include the quality of schooling and health care.

These gaps should be filled by future research, relying on other types of data such as pooled census data in which families and their split-offs are traced over time and micro-economic surveys that provide detailed information on household members’ consumption of schooling inputs and their time allocation, as well as surveys that include more information on health and the quality of schooling of children in various age cohorts. A more open line of questioning, in qualitative research, could also reveal the reasoning underlying parents’ decision making.