This chapter presents a quantitative summary of research on the effects of school size on student achievement and noncognitive outcomes (such as involvement, participation, social cohesion, safety, and attendance). Noncognitive outcomes are widely considered desirable in themselves, but are also often assumed to be conducive to high academic performance.

4.1 General Approach

The approach applied in this chapter yields an overall estimate of expected outcomes at a given school size. As such the approach can be considered a type of meta-analysis. However, common meta-analysis methods cannot be applied when dealing with research on the effects of school size. The main reason for this is that the relation between school size and outcomes is not always modeled as a clear-cut difference between small and large schools or as a straightforward linear relationship in studies that treat school size as a continuous variable.

Standard methods for conducting meta-analysis either assume a comparison between an experimental group and a comparison group or an effect measure that expresses a linear relationship. Outcomes from several studies are then standardized so that a weighted average effect can be computed (taking into account differences in sample sizes). The outcomes per study may be a standardized difference between groups (e.g., Cohen’s d) or a statistic that describes the linear relation between an explanatory and a dependent variable (e.g., Fisher Z). Both kinds of measures can be converted to a common metric.
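The conversion to a common metric can be sketched as follows. This is a minimal illustration of the standard approach, not of the method used in this chapter. It assumes two groups of equal size for the conversion of Cohen's d to a correlation, and it uses the conventional weights n − 3 for averaging Fisher Z values. The function names and the two input studies are hypothetical.

```python
import math

def d_to_r(d):
    """Convert Cohen's d to a correlation r (assumes two equally sized groups)."""
    return d / math.sqrt(d * d + 4.0)

def r_to_fisher_z(r):
    """Fisher's Z transformation of a correlation."""
    return math.atanh(r)

def fisher_z_to_r(z):
    """Back-transform Fisher's Z to a correlation."""
    return math.tanh(z)

def pooled_r(effects):
    """Weighted mean correlation; effects is a list of (r, n) pairs.

    Each Fisher Z is weighted by n - 3, the inverse of its sampling variance.
    """
    numerator = sum((n - 3) * r_to_fisher_z(r) for r, n in effects)
    denominator = sum(n - 3 for _, n in effects)
    return fisher_z_to_r(numerator / denominator)

# Hypothetical inputs, not taken from any reviewed study:
# one study reporting d = 0.20 (n = 400), one reporting r = 0.05 (n = 1200).
studies = [(d_to_r(0.20), 400), (0.05, 1200)]
print(round(pooled_r(studies), 3))
```

The sketch only works because each study is reduced to a single number. Results reported as category contrasts or as quadratic functions cannot be reduced in this way.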

Many different forms besides a straightforward linear relationship (i.e., the smaller/larger the better) are hypothesized and reported in research on effects of school size, e.g., quadratic and log-linear. In a considerable number of studies, several different size categories are compared. The reason for this is that researchers want to take into account the possibility that it may be more appropriate to look for an optimal school size rather than to estimate a linear relationship between size and outcomes. Such a linear relationship would imply that the best results occur if schools are either as large as possible (e.g., one school for an entire district) or as small as possible (e.g., single-class schools).

With regard to school size research, providing a quantitative summary of research findings is therefore quite complicated. Often more than just two school size categories are compared. In addition, the categories used vary between studies. In other cases the relation between school size and outcomes is modeled as a mathematical function (mostly linear, log-linear, or quadratic). The findings from these studies are not only difficult to compare to those that relate to comparisons between different school categories, but also the distinct mathematical functions cannot be converted to a common metric. When the effect of school size is modeled as a quadratic function, two distinct coefficients must be estimated (linear and quadratic), which precludes by definition converting the findings to a single metric.

As standard meta-analysis methods are not suitable when it comes to drawing up a quantitative summary of the research findings, another approach will be used. Based on the findings reported in the reviewed studies the “predicted” outcomes given a certain school size are calculated. To achieve comparability of the results only the scores on the outcome variables have been standardized to z-scores. There is no need to standardize the explanatory variable as well, because only studies that use the same operationalization for school size (i.e., total number of students enrolled) have been included. Standardization of both the explanatory and the dependent variable is often applied in meta-analysis when the focus is on the relationship between two numerical variables. Often this is the only option available to render findings from different studies comparable, as the operationalizations of both dependent and independent variables tend to vary across studies. In such cases standardized regression coefficients may be the “raw material” processed in the meta-analysis. In the present case standardizing the independent variable is not required, but standardization of the outcomes is unavoidable, as the raw scores are incomparable across studies. Whatever the outcome variable relates to (student achievement, involvement, safety), the operationalization is bound to differ from one study to the next. The approach applied here reports for specific school sizes the average standardized outcomes over a number of studies. More details on this method are provided below as we illustrate more specifically how the “predicted” outcomes have been calculated for a couple of studies.

A potential risk of the approach relates to samples with strongly diverging ranges on the explanatory variable. Suppose that one is dealing with two samples. In the first sample, the school size ranges from very small (single-class schools) to a total enrolment of 500 students and the average school size is 250. The second sample consists of schools with enrolments ranging from 500 to 1,000 students and the average school size is 750. If the effect of school size on achievement is identical (e.g., achievement decreases by one tenth of a standard deviation with a school size increase of 100 students enrolled), one would conclude that both in schools with 50 students and in schools with 550 students achievement is two tenths of a standard deviation above average. This interpretation might be correct, but it might also be mistaken. It is conceivable that the average achievement is much higher in the sample with smaller schools. In that case, the previous interpretation is clearly a mistake. It is therefore very important to be cautious in drawing conclusions from summaries based on studies that vary strongly with regard to the ranges in school size. Note that similar risks apply to more commonly applied methods of meta-analysis.
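The two-sample example can be traced numerically. The sketch below uses the slope and sample means from the example. The gap of half a standard deviation between the two sample means is a purely hypothetical value chosen for illustration, and the shift to a common scale is deliberately simplified (equal sample sizes, no change in the standard deviation after pooling).

```python
SLOPE_PER_100 = -0.1  # change in z-score per 100 additional students (from the example)

def predicted_z(size, sample_mean_size, slope_per_100=SLOPE_PER_100):
    """Predicted standardized achievement relative to the sample's own mean."""
    return slope_per_100 * (size - sample_mean_size) / 100.0

# Within each sample the prediction is identical.
z_sample_1 = predicted_z(50, 250)    # sample with schools up to 500 students
z_sample_2 = predicted_z(550, 750)   # sample with schools from 500 to 1,000 students

# Hypothetical assumption: the mean of sample 1 lies 0.5 SD above the mean of sample 2.
HYPOTHETICAL_GAP = 0.5

def on_common_scale(z_within_sample, sample_offset):
    """Shift a within-sample z-score by the sample's offset from the pooled mean."""
    return z_within_sample + sample_offset

common_1 = on_common_scale(z_sample_1, +HYPOTHETICAL_GAP / 2)
common_2 = on_common_scale(z_sample_2, -HYPOTHETICAL_GAP / 2)

print(z_sample_1, z_sample_2)  # identical within-sample predictions
print(common_1, common_2)      # no longer identical on a common scale
```

Under this assumption, schools with 50 students would score well above schools with 550 students, although both are predicted to lie two tenths of a standard deviation above their own sample average.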

4.2 Summarizing the Research Findings

Separate analyses are reported for student achievement and noncognitive outcomes. If an effect of school size for more than one measure of student achievement is reported (e.g., both language and mathematics), the average of these effects is reported in the summary. The same goes for noncognitive outcomes. In some studies, the effect of school size on a wide range of noncognitive outcomes (involvement, attendance, and safety) may be covered. Also in these cases the average effect is reported in the summary.

Findings will be reported separately for primary and secondary education. The main focus will be on the effect of school size on individual students. The key question addressed is to what extent student scores (cognitive or noncognitive) turn out to be relatively high or low given a certain school size. Student scores are standardized according to the well-known z-score transformation. First the mean is subtracted from each score and next the resulting difference is divided by the standard deviation. In another approach that is frequently applied, the result after subtraction is divided not by the standard deviation in student scores, but by the standard deviation in school averages. The main argument for this approach is that school size, being a school level characteristic, can only have an impact on school means. Also, when an analysis is based on data that are aggregated at the school level, it is hardly ever possible to estimate the effect of school size at the student level (unless information is available on the variation among student scores within schools). One highly important consequence of this approach is that it will inevitably yield larger estimates of school size effects. Only in the extreme situation where all variation in student scores is situated at the school level (which would imply a complete absence of differences among students within schools) will this approach yield the same estimate of a school size effect. However, as long as there is some variance between students within schools (which is always the case in real life), the school level variance (and therefore the standard deviation) is less than the total variance among students.

The argument outlined above may be illustrated with a simple example. Suppose that the standard deviation on a student achievement score equals 10 (therefore the variance is 100) and that the percentage of school level variance is 16 (see Footnote 1). This implies a school level variance equal to 16 and thus a standard deviation of 4 (square root of 16). Now let us assume that in large schools achievement scores are on average about 2 points lower than in small schools (for the moment, we will not deal with the question of what counts as small and large school size). At the student level, this is a modest effect at best (one-fifth of a standard deviation), but if we compare the difference to the standard deviation among school means, the effect looks fairly impressive. In that case the difference between large and small schools equals half a standard deviation. Note that this inflation of the school size effect becomes even stronger as the percentage of school level variance decreases. In that case the standard deviation among school means gets smaller as well, which makes the effect of school size appear larger. Especially for noncognitive outcomes differences between schools have often been reported to be quite modest.
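The arithmetic of this example can be written out as follows. The symbols (σ for standard deviations, ρ for the proportion of school level variance, d for the standardized difference) are notation introduced here for illustration only.

```latex
\begin{align*}
\sigma_{\text{total}} &= 10, &
\sigma_{\text{school}} &= \sqrt{0.16 \times 10^{2}} = \sqrt{16} = 4 \\
d_{\text{student}} &= \frac{2}{10} = 0.20, &
d_{\text{school}} &= \frac{2}{4} = 0.50 \\
d_{\text{school}} &= \frac{d_{\text{student}}}{\sqrt{\rho}}, &
\rho &= \frac{\sigma^{2}_{\text{school}}}{\sigma^{2}_{\text{total}}} = 0.16
\end{align*}
```

The last line shows why the effect appears larger as ρ decreases: the same raw difference is divided by a smaller standard deviation.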

In the authors’ opinion, the most appropriate basis for expressing the school size effect is the total amount of variation (i.e., the standard deviation) among student scores. This puts the impact of school size in the right perspective. The impact is limited because it only affects school means. Most of the variation in student scores (both cognitive and noncognitive) is situated within schools. This variation cannot be affected by changes in school size unless school size interacts with a student level variable (e.g., some studies have reported that the effect of school size is relatively strong for socioeconomically disadvantaged students). This natural limitation of the impact of school level characteristics should be clearly expressed in an assessment of the effects of school size. However, findings that are standardized by means of the school level standard deviation will be reported as well. Otherwise a substantial part of the available research would be discarded.

If one study covers two or more distinct samples (e.g., primary school students and secondary school students; or samples from different countries or regions) the outcomes per sample will be treated separately when the findings are reviewed. Thus, it is possible that a single study contributes more than one result when summarizing the findings.

Findings on the effect of school size are included in the summary if they meet the following two preconditions. First of all, sufficient information needs to be provided for calculating the “predicted” outcomes at a given school size. Some authors report only unstandardized regression coefficients without providing information on the mean and standard deviation of the outcome variables. In such cases it is impossible to determine what the standardized outcome will be according to the regression model. In other cases only standardized regression coefficients are reported. In such cases one needs information on the mean and standard deviation of the explanatory variable (i.e., school size) in order to determine what the standardized outcomes will be for a given school size. The second precondition is that findings are included only if prior achievement has been controlled for. This is the case if the analysis is based on growth scores or if prior achievement has been included as a control variable. Note that controlling for cognitive aptitude (e.g., IQ measures) has only been counted as controlling for prior achievement if students took the test at an earlier point in time. Some studies did control for cognitive aptitude that was measured during the same period that the outcome measures were collected. Findings from these studies have not been included in the quantitative summary.
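The first precondition can be made concrete with a short sketch for a linear model. It shows which descriptive statistics are needed to turn an unstandardized coefficient into a predicted z-score, and how that coefficient relates to a standardized one. It assumes that the prediction at the mean school size equals the mean outcome. The function names and all numerical values are hypothetical.

```python
def predicted_z_from_unstandardized(b, size, mean_size, sd_outcome):
    """Predicted z-score from an unstandardized slope b (outcome points per student).

    Requires the standard deviation of the outcome; without it the raw
    prediction cannot be expressed in standard deviation units.
    """
    return b * (size - mean_size) / sd_outcome

def to_standardized_beta(b, sd_size, sd_outcome):
    """Standardized coefficient implied by an unstandardized slope."""
    return b * sd_size / sd_outcome

# Hypothetical values for illustration only.
b = -0.01          # outcome points per additional student
mean_size = 400.0  # mean enrolment in the sample
sd_size = 200.0    # standard deviation of enrolment
sd_outcome = 10.0  # standard deviation of the outcome among students

z_at_600 = predicted_z_from_unstandardized(b, 600, mean_size, sd_outcome)
beta = to_standardized_beta(b, sd_size, sd_outcome)
print(z_at_600, beta)
```

With these values a school of 600 students lies one standard deviation above the mean size, so the predicted z-score coincides with the standardized coefficient.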

4.3 School Size and Student Achievement in Primary Education

Out of the total number of studies on school size reviewed, five relate to its effect on individual student achievement in primary education and also meet the preconditions specified above. All five studies were conducted in the United States. Basic details about these studies are provided in Table 4.1.

Table 4.1 Studies on the effect of school size on individual achievement scores in primary education

In the studies by Archibald (2006), Holas and Huston (2012), and Maerten-Rivera et al. (2010) the relation between school size and student achievement is modeled as a linear function. Of these only Archibald (2006) reports a significant (and negative) effect of school size for both reading and mathematics. Maerten-Rivera et al. (2010) also report a negative effect, but a nonsignificant one. Holas and Huston (2012) only report that their analyses failed to reveal a significant relationship. For the quantitative summary of the research findings, it is assumed that they found a zero relationship. In the studies by Lee and Loeb (2000) and by Ready and Lee (2007) different categories of schools are compared. Lee and Loeb (2000) distinguish three categories (less than 400 students; 400–750 students; and over 750 students). Ready and Lee (2007) distinguish five categories (less than 275 students; 275–400 students; 400–600 students; 600–800 students; and over 800 students). Lee and Loeb (2000) report significantly lower performance in the medium category (400–750 students) in comparison to the small category. Ready and Lee (2007) report significantly lower performance in the large schools category (>800 students) in comparison to the medium category (400–600) for reading in the first grade. For mathematics in the first grade they report a significantly higher performance in the small schools category (<275 students) in comparison to the medium category. No significant effects of school size were found in Kindergarten.

A closer look at the findings reported by Archibald (2006) makes their implications more concrete. The reported standardized regression coefficients equal −0.03 and −0.07 for reading and mathematics, respectively. As the mean and standard deviation for school size are reported as well (see Table 4.1), any school size can be transformed into a z-score. The z-scores corresponding with school sizes ranging from 150 to 850 are displayed in Table 4.2. After that one only needs to multiply the z-scores by either −0.03 or −0.07 to arrive at the predicted z-scores for reading or mathematics. Table 4.2 shows that the Archibald findings imply that in a primary school with 800 students the reading scores are on average 0.055 of a standard deviation below average. For mathematics this will be 0.128 of a standard deviation. The table also reports the average results across both subjects.
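The calculation behind a table of this kind can be sketched as follows. The two coefficients are those reported by Archibald (2006). The mean and standard deviation of school size are hypothetical placeholders, because the values from Table 4.1 are not reproduced here. The output therefore illustrates the procedure and does not reproduce Table 4.2.

```python
BETA_READING = -0.03  # standardized coefficient reported by Archibald (2006)
BETA_MATH = -0.07     # standardized coefficient reported by Archibald (2006)

# Hypothetical placeholders; substitute the values reported in Table 4.1.
MEAN_SIZE = 500.0
SD_SIZE = 150.0

def size_to_z(size, mean_size=MEAN_SIZE, sd_size=SD_SIZE):
    """Transform a school size into a z-score."""
    return (size - mean_size) / sd_size

def predicted_row(size):
    """Predicted z-scores for reading, mathematics, and their average."""
    z_size = size_to_z(size)
    reading = BETA_READING * z_size
    mathematics = BETA_MATH * z_size
    return {
        "size": size,
        "z_size": z_size,
        "reading": reading,
        "mathematics": mathematics,
        "average": (reading + mathematics) / 2.0,
    }

for size in range(150, 851, 100):
    row = predicted_row(size)
    print(f"{row['size']:>4}  {row['z_size']:6.3f}  {row['reading']:7.3f}  "
          f"{row['mathematics']:7.3f}  {row['average']:7.3f}")
```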

Table 4.2 Predicted z-scores for reading and mathematics per school size; based on findings reported by Archibald (2006)

Lee and Loeb (2000) report differences in mathematics achievement between various school size categories after controlling for numerous confounding variables including prior achievement. The differences reported are standardized by dividing by the standard deviation among school averages. As both within school and between school variances are reported (Lee and Loeb 2000, p. 18), it is possible to rescale the reported differences relative to the total standard deviation in student achievement scores. Lee and Loeb (2000, p. 21) report that the math scores are on average 0.073 of a standard deviation higher in small schools vs. medium schools (less than 400 students versus 400–750 students). The advantage of small over large schools is more modest (0.041 and statistically not significant). Given the information provided in Lee and Loeb (2000) and assuming that the standardized average score must be equal to zero, it is possible to compute for each school size category the “predicted” average. Table 4.3 reports two types of standardized scores: first the scores standardized relative to the standard deviation among school means and next the scores standardized relative to the total standard deviation in math scores (i.e., taking into account variation within and between schools). The table shows that the highest scores were found in the smallest schools. However, the differences are clearly more modest when they are standardized relative to the standard deviation based on variation both within and between schools. The findings clearly suggest a curvilinear relationship between school size and achievement. Based on the standardized averages per category, a quadratic function has been estimated. This approach has also been applied to the findings reported by Ready and Lee (2007) and further on to findings from other studies that focus on differences between three or more school size categories.
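The three steps described above (centering the category averages on zero, rescaling to the total standard deviation, and estimating a quadratic function) can be sketched as follows. Only the two contrasts (0.073 and 0.041) come from the text. The category weights, the variance components, and the representative size per category are hypothetical, and the least-squares fit is one plausible way to estimate the quadratic function, not necessarily the procedure used for Table 4.3.

```python
import math

def centre_on_zero(category_means, weights):
    """Shift category means so that their weighted average equals zero."""
    grand = sum(m * w for m, w in zip(category_means, weights)) / sum(weights)
    return [m - grand for m in category_means]

def rescale_to_total_sd(effect, var_between, var_within):
    """Convert an effect in school-mean SD units to total (student level) SD units."""
    return effect * math.sqrt(var_between / (var_between + var_within))

def _det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 (normal equations, Cramer's rule)."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    normal = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    det = _det3(normal)
    coefs = []
    for col in range(3):
        m = [row[:] for row in normal]
        for r in range(3):
            m[r][col] = t[r]
        coefs.append(_det3(m) / det)
    return tuple(coefs)

# Contrasts from the text, small schools as reference (school-mean SD units).
relative_means = [0.0, -0.073, -0.041]   # small, medium, large
weights = [0.3, 0.5, 0.2]                # hypothetical share of each category
sizes_in_hundreds = [3.0, 5.75, 9.0]     # hypothetical representative sizes

school_sd_scores = centre_on_zero(relative_means, weights)
total_sd_scores = [rescale_to_total_sd(z, var_between=20.0, var_within=80.0)
                   for z in school_sd_scores]  # hypothetical variance components
a, b, c = fit_quadratic(sizes_in_hundreds, total_sd_scores)
print(school_sd_scores, total_sd_scores, (a, b, c))
```

Expressing school size in hundreds of students keeps the sums of powers small, which helps the numerical stability of the fit. With three categories the quadratic passes exactly through the three averages; with more categories it becomes a genuine least-squares approximation.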

Table 4.3 Predicted z-scores for reading and mathematics per school size category; based on findings reported by Lee and Loeb (2000)

Table 4.4 reports the main findings from all five studies on the school size effect in primary education based on student level findings. For each study, the predicted standardized achievement scores at student level are reported. All five studies report outcomes within the range from 200 to 850 students enrolled. For school sizes within this range a weighted average across all five studies has been calculated. Outcomes per study are weighted by the number of students (see Footnote 2). Figure 4.1 provides a graphical display of the findings. On average a slightly negative effect of school size on student achievement scores is detected. It should be noted, though, that the difference in student achievement between primary schools with 200 versus 850 students enrolled is still below one tenth of a standard deviation.
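The weighting can be sketched as follows. The study labels, predicted scores, and numbers of students are hypothetical and serve only to show how the weighted average per school size is obtained.

```python
def weighted_average_by_size(predictions, n_students):
    """Weighted mean of predicted z-scores per school size.

    predictions: {study: {school_size: predicted_z}}
    n_students:  {study: number of students}, used as the weight
    Only studies that report a prediction at a given size contribute to it.
    """
    sizes = sorted({size for per_size in predictions.values() for size in per_size})
    averages = {}
    for size in sizes:
        numerator = 0.0
        denominator = 0.0
        for study, per_size in predictions.items():
            if size in per_size:
                numerator += n_students[study] * per_size[size]
                denominator += n_students[study]
        averages[size] = numerator / denominator
    return averages

# Hypothetical inputs for illustration only.
predictions = {
    "study_a": {200: 0.10, 850: -0.10},
    "study_b": {200: 0.00, 850: 0.00},   # a study assumed to show a zero relationship
}
n_students = {"study_a": 1000, "study_b": 3000}

print(weighted_average_by_size(predictions, n_students))
```

Because the weights are numbers of students, a large study with a zero relationship pulls the average strongly toward zero.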

Table 4.4 Predicted student achievement (standardized) per school size in primary education
Fig. 4.1 Predicted STUDENT achievement per school size (primary education). The thin black lines represent the findings for a particular study; the bold grey line denotes the weighted average across studies

4.4 School Size and School Mean Achievement in Primary Education

It has already been mentioned that it is also customary to standardize school size effects relative to the standard deviation among school means. This is the only option available when the analyses are based on aggregated school data. When multilevel analyses are conducted, it is possible to compute both types of standardized scores, provided that the necessary information on variance within and between schools on the outcome variable is reported. This is the case for three of the studies discussed in the previous section (Archibald 2006; Lee and Loeb 2000; Maerten-Rivera et al. 2010). See Table 4.5 for basic details on these studies. One additional study on school size and student achievement in primary education is included in Table 4.5 (Fernandez 2011). This study is based on aggregated school data. Like the other studies discussed so far, it relates to American schools (Nevada). The reported effect of school size on achievement is not significant and the standardized regression coefficient shows no noticeable deviation from zero. The study by Fernandez also includes high schools and middle schools, but the effects of school size are controlled for school type.

Table 4.5 Studies on the effect of school size on school average achievement scores in primary education

Appendix 4.1 presents the predicted standardized school means per school size for these studies. Figure 4.2 provides a graphical display of the findings. The figures in Appendix 4.1 also illustrate to what extent school size effects “increase” when the standardization is based on variation between school means. In the Archibald study the predicted standardized student scores range from 0.127 in schools with 200 students to −0.110 in schools with 800. The predicted standardized school means in the same study range from 0.296 to −0.257. Similar increases can be observed for the studies by Lee and Loeb (2000) and Maerten-Rivera et al. (2010). The impact of school size clearly appears to be more impressive if one compares the differences between large and small schools to the standard deviation of the school averages. Still, it is our opinion that the effects reported in Table 4.4 (i.e., impact on student scores) provide a more appropriate description of the impact of school size.
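The size of this “increase” can be related to the share of school level variance. The sketch below assumes that the two sets of predictions for the Archibald study differ only in the standard deviation used for standardization. Under that assumption the ratio of the student level score to the school mean score equals the square root of the proportion of school level variance. The function name is hypothetical, and the resulting share is an inference from the four figures quoted above, not a value reported in the study.

```python
def implied_school_variance_share(z_student, z_school):
    """Share of school level variance implied by two standardizations of one effect."""
    return (z_student / z_school) ** 2

# Figures quoted in the text for schools with 200 and 800 students.
share_at_200 = implied_school_variance_share(0.127, 0.296)
share_at_800 = implied_school_variance_share(-0.110, -0.257)

print(round(share_at_200, 3), round(share_at_800, 3))
```

Both pairs of figures point to a school level share of a little under one fifth of the total variance, which is why the school mean scores are more than twice as large as the student level scores.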

Fig. 4.2 Predicted MEAN SCHOOL achievement per school size (primary education). The thin black lines represent the findings for a particular study; the bold grey line denotes the weighted average across studies

4.5 School Size and Student Achievement in Secondary Education

Six studies have been found that relate to the effect of school size on individual student achievement in secondary education and also meet our preconditions. Of these, five relate to secondary schools in the United States. The study by Ma and McIntyre (2005) deals with the situation in Canada (Alberta). Basic details are reported in Table 4.6. Except for the study by Ma and McIntyre (2005) the effect of school size is analyzed through comparison of different categories. However, there is little similarity in the categorizations applied. The number of categories ranges from 4 (Carolan 2012; Rumberger and Palardy 2005) to 8 (Lee and Smith 1997). See Table 4.6 for more details.

Table 4.6 Studies on the effect of school size on individual achievement scores in secondary education

Most of the studies included in Table 4.6 report differences in student achievement between school categories. In those cases, a quadratic function has been estimated to describe the relation between school size and student achievement. This function is based on the standardized averages per category. There are two exceptions. The first one is the study by Luyten (1994), which only reports that no significant differences between categories were found. The other exception is the study by Ma and McIntyre (2005). Here a linear relation between school size and achievement is estimated, but the authors only report a significant interaction effect of taking math courses with school size on the mathematics post-test (the effect of taking math courses is weaker in larger schools; in other words, students who take math courses get higher scores if they attend smaller schools). No main effect for school size on math achievement is reported. For this review it is assumed that the main school size effect is not statistically significant in this study. No further details are reported and for the summary of the research findings it is assumed that both the study by Luyten (1994) and the study by Ma and McIntyre (2005) found a zero relationship.

Appendix 4.2 reports the predicted standardized achievement scores per school size in secondary education for individual student achievement. Weighted averages for school sizes within the range from 400 to 1,900 students enrolled are presented as well. Note that the studies by Luyten (1994) and Ma and McIntyre (2005) do not fully cover this range. The zero effects that are reported in these studies are assumed to extend beyond the exact ranges covered in these studies. In contrast to primary education, the findings suggest a curvilinear relation between school size and student achievement. The lowest scores are found in small secondary schools (−0.050). In schools with enrolments ranging from 1,200 to 1,600, the scores are at least one-tenth of a standard deviation higher. When schools get larger, the predicted scores decrease somewhat. Figure 4.3 provides a graphical display of the findings.

Fig. 4.3 Predicted STUDENT achievement per school size (secondary education). The thin black lines represent the findings for a particular study; the bold grey line denotes the weighted average across studies

4.6 School Size and School Mean Achievement in Secondary Education

Appendix 4.3 reports the predicted standardized achievement scores per school size in secondary education for school mean achievement. For four out of the six studies included in Appendix 4.2, it was possible to calculate predicted standardized school means per school size. The study by Fernandez (2011), which makes use of aggregated school-level data (from the USA, Nevada) is included in Appendix 4.3. Again the findings reveal a curvilinear pattern, but now the lowest scores are found in the largest schools and the highest scores are found in schools with enrolments ranging from 900 to 1,250. This suggests a somewhat smaller optimum school size than suggested by the results based on individual achievement data. The findings from Appendix 4.3 are graphically displayed in Fig. 4.4.

Fig. 4.4 Predicted MEAN SCHOOL achievement per school size (secondary education). The thin black lines represent the findings for a particular study; the bold grey line denotes the weighted average across studies

4.7 School Size and Noncognitive Outcomes in Primary Education (Individual and School Means)

A wide range of outcome variables is subsumed under the label noncognitive outcomes. Still the number of studies on school size and noncognitive outcomes in primary education that report sufficient information to calculate the predicted outcomes per school size is quite limited, even though the requirements to be included in the quantitative summary are less stringent than for academic achievement. For studies on noncognitive outcomes controlling for prior achievement was not considered necessary. Inclusion of socioeconomic background as a covariate in the analyses was deemed sufficient.

For the summary relating both to individual outcomes and school means five distinct studies are available. Of these, one relates exclusively to the effect of school size on individual outcomes (Holas and Huston 2012), two relate exclusively to school means (Durán-Narucki 2008; Lee and Loeb 2000) and two relate to both levels (Bonnet et al. 2009; Koth et al. 2008). See Table 4.7 for an overview of the studies on school size and noncognitive outcomes in primary education.

Table 4.7 Studies on the effect of school size on individual and school mean noncognitive outcomes in primary education

Four of the five studies listed in Table 4.7 report on American research. The other one relates to research in the Netherlands. In three studies, the effect of school size is modeled as a linear function (Durán-Narucki 2008; Holas and Huston 2012; Koth et al. 2008). In the other two studies, three categories are compared (Bonnet et al. 2009: <300, 301–500, >500; Lee and Loeb 2000: <400, 400–750, >750). When summarizing the findings, the results reported by Bonnet et al. (2009) have been rescored so that a high score denotes a positive situation (i.e., little peer victimization). These authors report significantly more victimization in the category of large schools (over 500 students). Lee and Loeb (2000) report significantly more positive teacher attitudes about responsibility for student learning in small schools (less than 400 students). Based on the standardized averages per category, a quadratic function has been estimated to denote the relation between school size and noncognitive outcomes in these two studies. Holas and Huston (2012) have analyzed the linear relation between school size and three noncognitive outcomes (student perceived self-competence, school involvement in grade 5, and school involvement in grade 6). Only the relation between size and involvement in grade 6 was found to be significant. The predicted scores presented in Appendix 4.4 denote the averages across these three outcomes. The study by Koth et al. (2008) focuses on achievement motivation and student-reported order and discipline. The relation between school size and order and discipline is not significant, but they found a significantly negative relation between school size and achievement motivation. In Appendix 4.4, the averages across both outcomes are reported. Durán-Narucki (2008) focused on attendance and found significantly higher attendance in large schools (see Appendix 4.5). This is the only study on noncognitive outcomes in primary education that shows positive effects when schools are large.

The weighted average in Appendix 4.4 suggests a somewhat stronger effect of school size on noncognitive student outcomes in primary education as compared to achievement scores (see Table 4.4). The difference between primary schools with 200 versus 600 students is 0.13 of a standard deviation. With regard to student achievement scores, the difference between schools with 200 versus 600 students equals 0.076 of a standard deviation. Appendix 4.5 reports the predicted standardized school means per school size. The effect of school size looks stronger when standardized relative to the standard deviation among school means. However, the standardization applied in Appendix 4.4 must be considered more appropriate. Graphic displays of the findings on the relation between school size and noncognitive outcomes in primary education are provided in Figs. 4.5 and 4.6.

Fig. 4.5 Predicted non-cognitive STUDENT outcomes per school size (primary education). The thin black lines represent the findings for a particular study; the bold grey line denotes the weighted average across studies

Fig. 4.6 Predicted non-cognitive MEAN SCHOOL outcomes per school size (primary education). The thin black lines represent the findings for a particular study; the bold grey line denotes the weighted average across studies

4.8 School Size and Noncognitive Outcomes in Secondary Education

A relatively large number of studies provide details on the predicted level of noncognitive outcomes per school size in secondary education. Table 4.8 provides basic information about these studies. The total number of studies is 19, but the study by Kirkpatrick Johnson et al. (2001) reports separate findings for middle schools (grades 7 and 8) and high schools (grades 7–12). As a result, the number of samples equals 20.

Table 4.8 Studies on the effect of school size on noncognitive outcomes in secondary education

Twelve samples focus on the relation of school size with student outcomes and seventeen on the relation with school mean scores. Nine samples provide information on both student outcomes and school mean scores. Most research derives from the USA, but seven studies relate to other countries (two Israeli, two Dutch, the remaining three from Australia, Italy, and Taiwan). Many studies focus on the occurrence of incidents and other undesirable phenomena (such as harassment, disorder, theft, vandalism). All outcomes have been rescored in such a way that low scores denote a negative situation (e.g., high frequencies of vandalism and theft or low levels of safety or involvement). In most studies school size is modeled as a continuous variable. Only five studies make use of school size categories (Bowen et al. 2000; Chen 2008; Chen and Vazsonyi 2013; Dee et al. 2007; Rumberger and Palardy 2005). In the remaining 15 samples, the relation between school size and noncognitive outcomes is mostly modeled as a linear function, but in three cases (Gottfredson and DiPietro 2011; McNeely et al. 2002; Payne 2012) the researchers modeled it as a log-linear function (i.e., outcomes were regressed on the log of school size).
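For the log-linear case and the rescoring of outcomes, the calculation of predicted scores can be sketched as follows. The sketch assumes a natural logarithm and a model in which the prediction at the mean of log school size equals the mean outcome. The function names, the coefficient, and the descriptive statistics are hypothetical.

```python
import math

def predicted_z_loglinear(b_log, size, mean_log_size, sd_outcome):
    """Predicted z-score when the outcome is regressed on the log of school size."""
    return b_log * (math.log(size) - mean_log_size) / sd_outcome

def rescore(z, high_is_negative):
    """Flip the sign so that low scores always denote a negative situation."""
    return -z if high_is_negative else z

# Hypothetical values for illustration only.
B_LOG = 0.5                   # e.g., increase in reported incidents per unit of log size
MEAN_LOG_SIZE = math.log(600)
SD_OUTCOME = 2.0

for size in (200, 400, 800, 1600):
    z = predicted_z_loglinear(B_LOG, size, MEAN_LOG_SIZE, SD_OUTCOME)
    print(size, round(rescore(z, high_is_negative=True), 3))
```

In a log-linear model each doubling of school size changes the predicted score by the same amount, so differences among small schools weigh more heavily than equally large absolute differences among large schools.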

As shown in Table 4.8, many studies on noncognitive outcomes relate to multiple outcome measures. In these cases, the average effect of school size across the outcome measures involved has been computed. These are the outcomes reported in Appendices 4.6a–c and the corresponding figures.

4.9 School Size and Noncognitive Student Outcomes

Appendix 4.6a presents the findings for the American studies that focus on student outcomes. Appendix 4.6b reports the findings for the non-U.S. studies. The averages across studies (overall and broken down for American and non-U.S. samples) are reported in Appendix 4.6c. Graphic representations of the results are provided in Figs. 4.7, 4.8, and 4.9.

Fig. 4.7 Predicted non-cognitive STUDENT outcomes per school size (secondary education; American studies). The thin black lines represent the findings for a particular study; the bold grey line denotes the weighted average across studies

Fig. 4.8 Predicted non-cognitive STUDENT outcomes per school size (secondary education; non-U.S. studies). The thin black lines represent the findings for a particular study; the bold grey line denotes the weighted average across studies

Fig. 4.9 Average non-cognitive STUDENT outcomes per school size (secondary education)

For three of the five American samples, significant negative effects on noncognitive outcomes are reported. The study by Gottfredson and DiPietro (2011) reports significantly positive effects. Kirkpatrick Johnson et al. (2001) report nonsignificant effects for their middle school sample. The strongest effect is found in the study by Bowen et al. (2000), which reports a difference of about half a standard deviation between the smallest and the largest schools. School size in this study ranges from fewer than 100 students to nearly 1,400. The outcome measures relate to school satisfaction, safety, and teacher support.

Whereas the American findings mostly show negative effects of large school size on noncognitive student outcomes in secondary education, research conducted outside the U.S. fails to confirm this picture. Appendix 4.6b presents the results from six studies conducted outside the U.S. Of these, three show a negative effect of large school size and three show a positive effect. Two of the negative effects are statistically significant (Attar-Schwartz 2009; Van der Vegt et al. 2005). Only one of the positive effects is significant (Mooij et al. 2011). All three of these studies relate to various aspects of school safety. Two of them were conducted in the Netherlands; both report significant effects, but in opposite directions. The finding reported by Vieno et al. (2005) for Italy deserves special mention. The effect in this study appears to be particularly strong, yet it does not reach statistical significance. Perhaps the strong effect is due to overfitting, as the number of explanatory variables at the school level is quite large relative to the number of schools.

The general picture of the relation between school size and noncognitive outcomes at the student level across all twelve samples is provided in Appendix 4.6c and Fig. 4.9. The overall effects of school size on noncognitive student outcomes appear to be quite modest, but findings from the U.S. and from other countries contradict each other. The average effect in American studies is slightly negative, whereas studies from other countries (Israel, Italy, the Netherlands, and Taiwan) show on average a positive effect of school size. Even when the findings from the study by Vieno et al. (2005) are excluded from the summary, the effect of school size remains positive, although it becomes considerably smaller. School size effects on noncognitive student outcomes must be described as small. The difference between predicted scores in schools with 300 versus 1,100 students is about 0.06 of a standard deviation (positive or negative). The findings that relate to the U.S. suggest a negative effect of large school size, but this average effect is even smaller than the positive effects found in other countries.

4.10 School Size and Noncognitive School Mean Scores

The findings on the relation between school size and standardized school mean scores largely replicate the findings on student outcomes. The main difference is that the effect on school mean scores appears to be stronger. This is basically a statistical artifact: the variation in school means is bound to be smaller than the variation in student scores, so the same raw difference amounts to more standard deviation units. Again we see negative but relatively small effects of large school size in the U.S., while the reverse picture emerges from non-U.S. research. More details are provided in Appendices 4.7a–c, and Figs. 4.10, 4.11, and 4.12 illustrate the trends described.
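
The size of this artifact can be illustrated with a short Python sketch. It assumes that the share of the variance in student scores that lies between schools (the intraclass correlation, here `icc`) is known, so that the standard deviation of school means is approximately the square root of that share times the standard deviation of student scores. The value 0.16 is hypothetical, and the function name is an assumption introduced for illustration.

```python
import math


def restandardize(effect_in_student_sd, icc):
    """Convert an effect expressed in student-level SD units into
    school-mean SD units, given the between-school variance share (icc).

    Assumes SD(school means) ~= sqrt(icc) * SD(student scores).
    """
    if not 0 < icc <= 1:
        raise ValueError("icc must lie in the interval (0, 1]")
    return effect_in_student_sd / math.sqrt(icc)


# Hypothetical example: a difference of 0.06 student-level SD, with
# 16 % of the variance in scores situated between schools.
effect_school_means = restandardize(0.06, 0.16)
print(round(effect_school_means, 3))
```

With these illustrative numbers, the same raw difference is two and a half times as large when expressed in standard deviations of school means, although nothing about the underlying difference has changed.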

Fig. 4.10 Predicted non-cognitive SCHOOL MEAN outcomes per school size (secondary education; American studies). The thin black lines represent the findings for a particular study; the bold grey line denotes the weighted average across studies

Fig. 4.11 Predicted non-cognitive SCHOOL MEAN outcomes per school size (secondary education; non-U.S. studies). The thin black lines represent the findings for a particular study; the bold grey line denotes the weighted average across studies

Fig. 4.12 Average non-cognitive SCHOOL MEAN outcomes per school size (secondary education)

4.11 Conclusion

The research synthesis presented in this chapter was aimed at a precise specification of the relationship between school size and outcomes (both cognitive and noncognitive) in primary and secondary education. The predicted level of standardized outcomes given a certain school size was calculated for dozens of samples, based on the information provided in reports on the effects of school size. The discussion of the findings will focus on results related to outcomes that are standardized through division by the standard deviation in student scores. The alternative (division by the standard deviation in school means) is considered less appropriate. It is bound to produce results that appear to reveal stronger effects of school size, which is confirmed in the present report. However, this approach tends to obscure the fact that school size is unlikely to affect variation in student outcomes within schools, whereas the bulk of the variation in student scores (cognitive and noncognitive) is situated within schools.

On average, the review shows a slightly negative relation in primary education between school size and both cognitive and noncognitive outcomes. It should be noted that this finding is almost exclusively based on American research. The difference in predicted scores between very small and large schools is less than one tenth of a standard deviation for cognitive outcomes and somewhat larger (0.13 of a standard deviation) for noncognitive outcomes. Taking into account that the difference in size between the smallest and the largest schools amounts to at least two standard deviations, it is clear that the effect of school size in terms of a standardized effect size (e.g., Cohen’s d) must be very modest. For noncognitive outcomes, it may still exceed the (very modest) value of 0.05, but for cognitive outcomes the effect is even weaker.
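
One reading of this arithmetic is sketched below in Python: the difference in predicted outcomes is divided by the difference in school size, both expressed in standard deviations. The function name is an assumption introduced for illustration. Because the size gap is described as at least two standard deviations, the results are upper bounds.

```python
def effect_per_sd_of_size(outcome_gap_sd, size_gap_sd):
    """Change in the outcome (in SD units) per one SD change in school size."""
    return outcome_gap_sd / size_gap_sd


# Figures from the text: outcome gaps of 0.13 SD (noncognitive) and less
# than 0.10 SD (cognitive) across a size gap of at least 2 SD.
noncognitive = effect_per_sd_of_size(0.13, 2.0)
cognitive_upper_bound = effect_per_sd_of_size(0.10, 2.0)

print(f"noncognitive: at most {noncognitive:.3f}")
print(f"cognitive: below {cognitive_upper_bound:.3f}")
```

On this reading, the noncognitive effect can still exceed 0.05 per standard deviation of school size, whereas the cognitive effect stays below that value.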

For cognitive outcomes in secondary education, a curvilinear pattern emerged from the studies reviewed. The highest scores appear to occur in schools with more than 1,200 but fewer than 1,600 students. In larger schools, lower scores are found, but the lowest scores are predicted for schools with fewer than 700 students. The difference between the lowest scoring schools (400 students) and the highest scoring (1,350–1,500 students) is just over one tenth of a standard deviation. Because the relation between school size and outcomes does not always fit a linear pattern, it is difficult to express it in more common metrics such as Cohen’s d or a correlation coefficient. The difference between the highest scoring schools (i.e., medium to large) and small schools is probably less than one tenth of a standard deviation, which would commonly be considered a small effect (i.e., Cohen’s d < 0.20). This assessment is based on the supposition that the difference in size between very small and medium to large schools (approximately 1,000 students) accounts for at least one standard deviation (Footnote 3). The findings on cognitive outcomes are exclusively based on research conducted in the U.S.

For noncognitive outcomes in secondary education, a large part of the results also derives from studies conducted in other countries. Interestingly, clearly opposite trends are apparent in American studies versus studies from other countries. Across all studies the trend is slightly in favor of large schools. The difference between small secondary schools (300 students) and large ones (1,100 students) amounts to 0.06 of a standard deviation, but for American studies the trend is reversed. There, small schools show more favorable scores, although the difference between small and large American schools turns out to be very modest (0.04 of a standard deviation). The effect of school size in non-U.S. studies is somewhat stronger and runs in the opposite direction (showing more positive scores in large schools).